Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On 2/17/07, Tom Lane [EMAIL PROTECTED] wrote: Hannu Krosing [EMAIL PROTECTED] writes: How easy/hard would it be to create unique indexes on tinterval (unique here meaning non-overlapping)? Overlapping is not an equality relation (it fails the transitive law), so I'm not entirely sure what unique means in this context ... but I can promise you you can't make it work with btree. Hmm, let's assume two time intervals: A (with a0 as start and a1 as end times) and B (with b0 as start and b1 as end times). Now, we'd define operators as: A is left of B when a0 < b0 AND a1 < b0; A is right of B when a0 > b1 AND a1 > b1; A is equal to B if (a0 >= b0 AND a0 <= b1) OR (a1 >= b0 AND a1 <= b1) OR (a0 < b0 AND a1 > b1). Actually equal doesn't mean equal here; rather it says overlaps. Now, assuming a UNIQUE INDEX on such a table, the order would be preserved since no two intervals can overlap. And no overlapping data could be inserted without breaking the non-overlap property. And of course a non-unique index would produce garbage (since left of/right of wouldn't make any sense anymore). Interestingly, such non-overlapping datatypes could also make sense for network addresses (with netmasks). ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
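The proposed operators can be modeled directly, which also makes Tom Lane's objection concrete: "overlaps" fails the transitive law, so btree's ordering assumptions break. A minimal sketch (plain Python tuples standing in for tinterval, not PostgreSQL code):

```python
# Illustrative model of the proposed tinterval operators, using
# (start, end) tuples. The "equal" operator really means "overlaps",
# and the example shows it is not transitive - which is why btree
# cannot enforce uniqueness over it.

def left_of(a, b):
    return a[0] < b[0] and a[1] < b[0]

def right_of(a, b):
    return a[0] > b[1] and a[1] > b[1]

def overlaps(a, b):
    return (b[0] <= a[0] <= b[1]) or (b[0] <= a[1] <= b[1]) or \
           (a[0] < b[0] and a[1] > b[1])

# a overlaps b, b overlaps c, yet a does not overlap c:
a, b, c = (1, 4), (3, 8), (6, 9)
print(overlaps(a, b), overlaps(b, c), overlaps(a, c))  # True True False
```

With a true equality operator, `eq(a, b)` and `eq(b, c)` would force `eq(a, c)`; the last line shows exactly the case where that inference fails.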
[HACKERS] Howto change db cluster locale on-the-fly
Hi, we've made a mistake and initdb'd the database cluster in the wrong locale :-( Now it's full of data. I've read in the docs that it's not possible to change the locale. But I guess something like this would work: a) 1) drop all indexes on text/varchar columns 2) change cluster locale 3) create all indexes on text/varchar columns or even b) 1) change cluster locale 2) reindex all indexes on text/varchar columns [I'm aware that before the reindex, queries on top of these indexes would return wrong answers] Is it possible/safe to do a) or b)? How to do the change cluster locale step? Where is this information stored? Or is the only way to rebuild the database cluster from scratch? Thanks, Kuba
Re: [HACKERS] pg_restore fails with a custom backup file
On Sat, Feb 17, 2007 at 08:40:54PM +0100, Magnus Hagander wrote: IIRC, there was a warning from pg_dump. I don't recall exactly what, and don't have the space to re-run the test on my laptop here, but I think it was from: write_msg(modulename, "WARNING: ftell mismatch with expected position -- ftell used\n"); Ok, I've confirmed that the output is this:

D:\prog\pgsql\inst\bin> pg_dump -Fc -Z0 test > out
pg_dump: [custom archiver] WARNING: ftell mismatch with expected position -- ftell used
pg_dump: [custom archiver] WARNING: ftell mismatch with expected position -- ftell used
pg_dump: [custom archiver] WARNING: ftell mismatch with expected position -- ftell used

Three warnings for that one dump - my guess would be one for each table past the 2gb limit. //Magnus
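The mismatch arises because a 32-bit off_t wraps once the file position passes 2 GB, so ftell() no longer agrees with the archiver's internally tracked offset. A small model of the wraparound (illustrative only; the real behavior depends on the platform's ftell/off_t):

```python
# Sketch of why ftell-based position checks misfire past 2 GB when
# off_t is 32 bits: the reported position wraps into negative
# territory and stops matching the expected offset, triggering the
# "ftell mismatch" warning.

def ftell32(true_offset):
    """Model a 32-bit signed ftell: wraps at 2**31."""
    return (true_offset + 2**31) % 2**32 - 2**31

expected = 2**31 + 4096          # a little past the 2 GB boundary
reported = ftell32(expected)
print(reported)                  # -2147479552: wrapped negative
print(reported == expected)      # False -> warning emitted
```

Below 2 GB the two values agree, which is why small dumps never show the warning.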
[HACKERS] pg_proc without oid?
I notice that this patch: http://developer.postgresql.org/cvsweb.cgi/pgsql/src/include/catalog/pg_proc.h.diff?r1=1.443&r2=1.444 inserts a bunch of XML related rows in pg_proc without specifying oid. This breaks the fmgrtab generator on msvc. Most likely because I didn't think of that case. But since all other rows in pg_proc.h contain the oid, I just wanted to check if they're actually supposed to be without oid, or if that was a mistake? //Magnus
Re: [HACKERS] anyelement2 pseudotype
Tom Lane wrote: I realized that I can probably fix ATAddForeignKeyConstraint to do the right thing by having it pass the two actual column types to can_coerce_type, thus allowing check_generic_type_consistency to kick in and detect the problem. Yeah, I came to the same conclusion. No amount of refactoring in parse_coerce.c is going to get the original concrete types back to compare. That should fix the problem with arrays, enums and any potential future generic types without mentioning them explicitly in there a la the hack there currently, thankfully. Cheers Tom
Re: [HACKERS] Howto change db cluster locale on-the-fly
On Mon, Feb 19, 2007 at 09:27:06AM +0100, Jakub Ouhrabka wrote: But I guess something like this would work: a) 1) drop all indexes on text/varchar columns 2) change cluster locale 3) create all indexes on text/varchar columns You're going to miss the name columns, ie. every string index in pg_catalog. Also, there are shared tables which are used in every DB. You need to log into every DB in the cluster (don't forget template[01]) and reindex everything. So, REINDEX DATABASE; seems to be a safer bet. In general this doesn't actually work since changing the locale may make two strings equal that weren't before, thus possibly breaking a unique index, but it may be possible. I'd suggest single user mode at least, and make backups! Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
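Martijn's two warnings can be shown with a toy model: an index built under one collation is in the wrong order under another, and keys that were distinct may compare equal. Here "locale A" is plain byte order and "locale B" folds case; both are stand-ins for real locales, not PostgreSQL internals:

```python
# Toy illustration of why changing a cluster's locale invalidates
# string indexes: the stored order and even uniqueness depend on the
# collation the index was built with.

keys = sorted(["Apple", "apple", "Banana"])   # index built under locale A
print(keys)                                   # ['Apple', 'Banana', 'apple']

# Under locale B's ordering (case-folding), the stored order is wrong:
folded = [k.lower() for k in keys]
print(folded == sorted(folded))               # False: hence REINDEX

# And a UNIQUE index would now contain a duplicate under locale B:
print(len(set(folded)) < len(folded))         # True: 'apple' twice
```

The duplicate case is the one REINDEX cannot repair, which is why the "may make two strings equal that weren't before" caveat matters.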
Re: [HACKERS] New feature request: FlashBack Query
Hello, On Sat, 17 Feb 2007 06:49:42 -0800 (PST) RPK [EMAIL PROTECTED] wrote: PostgreSQL, already a mature database, needs to have more options for recovery as compared to proprietary databases. I just worked with Oracle's FlashBack query feature in Oracle 9i and FlashBack Table feature in 10g. Future versions of PostgreSQL must have similar features which enable users to bring Table(s) and/or Database(s) to a desired Time Stamp. There is a pgfoundry project which tries to achieve this: http://pgfoundry.org/projects/tablelog/ Kind regards -- Andreas 'ads' Scherbaum German PostgreSQL Usergroup: http://www.pgug.de
Re: [HACKERS] pg_proc without oid?
On Monday, 19 February 2007 10:16, Magnus Hagander wrote: This breaks the fmgrtab generator on msvc. Most likely because I didn't think of that case. But since all other rows in pg_proc.h contain the oid, I just wanted to check if they're actually supposed to be without oid, or if that was a mistake? It's intentional. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] n-gram search function
On 2/19/07, Oleg Bartunov oleg@sai.msu.su wrote: pg_trgm was developed for spelling correction and there is a threshold of similarity, which is 0.3 by default. The Readme explains what it means. Yes, I read it. Similarity could be very low, since you didn't make a separate column and the length of the full string is used to normalize similarity. Yep, that's probably my problem. Ignored records are a bit longer than the others. I tried the tip in README.pg_trgm to generate a table with all the words. It can do the work in conjunction with tsearch2 and a bit of AJAX to suggest the full words to the users. The reason why I was not using tsearch2 is that it's sometimes hard to spell location names correctly. The only problem is that it is still quite slow on a 50k-row words table but I'll make further tests on a decent server this afternoon. -- Guillaume
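Oleg's point about normalization can be made concrete with a rough model of trigram similarity: the match count is divided by the total number of distinct trigrams, so extra text in a long row dilutes the score below the 0.3 threshold. pg_trgm's exact padding and normalization differ in details; the strings below are made-up examples:

```python
# Rough model of trigram similarity showing why a misspelled query
# matches a short word but not the same word buried in a longer
# string: normalization by the full set of trigrams dilutes the score.

def trigrams(s):
    s = "  " + s.lower() + " "          # pad roughly like pg_trgm
    return {s[i:i + 3] for i in range(len(s) - 2)}

def similarity(a, b):
    ta, tb = trigrams(a), trigrams(b)
    return len(ta & tb) / len(ta | tb)

query = "marseile"                      # misspelled location name
print(similarity(query, "marseille"))                       # well above 0.3
print(similarity(query, "marseille saint-charles station")) # below 0.3
```

This is exactly why the README's tip of building a separate one-word-per-row table helps: each row contributes only its own trigrams to the denominator.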
[HACKERS] Short varlena headers and arrays
I had intended to make varlenas alignment 'c' and have heaptuple.c force them to alignment 'i' if they required it. However I've noticed a problem that makes me think I should do this the other way around. The problem is that other places in the codebase use the alignment. In particular arrays do. Also toasting.c expects to get a worst-case size from att_align rather than a best-case. Also there's indextuple.c, but probably I should get to that in this round anyways. So now I'm thinking it's best to leave them as alignment 'i' unless heaptuple.c thinks it can get away without aligning them. This means we don't have a convenient way for data types to opt out of this header compression. But the more I think about it the less convinced I am that we need that. The alignment inside the data type doesn't matter since you'll only be working with detoasted versions of them unless you specifically go out of your way to do otherwise. Once this is done it may be worth having arrays convert to short varlenas as well. Arrays of short strings hurt pretty badly currently:

postgres=# select pg_column_size(array['a','b','c','d']);
 pg_column_size
----------------
             56
(1 row)

The only problem with this is if it's more likely for someone to stuff things in an array and then read them back out without detoasting than it is for someone to stuff them in a tuple. Probably the risk is the same. There is some code that assumes it understands how arrays are laid out in execQual.c and varlena.c. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
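A back-of-the-envelope model shows where the per-element cost comes from: each element carries a 4-byte varlena header and is padded to 4-byte ('i') alignment, versus a 1-byte short header with no padding. The constants below are illustrative assumptions, not the exact on-disk layout (the array also has its own header, ndims, bounds, etc.):

```python
# Model of per-element varlena overhead for array['a','b','c','d']:
# long headers with 'i' alignment versus proposed 1-byte short headers.

def padded(n, align=4):
    """Round n up to the next multiple of align."""
    return (n + align - 1) // align * align

def element_bytes_long(payload):       # 4-byte header, 'i' alignment
    return padded(4 + payload)

def element_bytes_short(payload):      # 1-byte header, 'c' alignment
    return 1 + payload

elems = [1, 1, 1, 1]                   # four one-byte strings
print(sum(element_bytes_long(p) for p in elems))    # 32
print(sum(element_bytes_short(p) for p in elems))   # 8
```

Under these assumptions the element data alone drops from 32 bytes to 8, which is the kind of saving the "arrays of short strings hurt pretty badly" example is pointing at.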
[HACKERS] Multiple Storage per Tablespace, or Volumes
Hi list, Here's a proposal of an idea which stole a good part of my night. I'll present first the idea, then 2 use cases with some rationale and a few details. Please note I won't be able to participate in any development effort associated with this idea, should such a thing happen! The bare idea is to provide a way to 'attach' multiple storage facilities (say volumes) to a given tablespace. Each volume may be attached in READ ONLY, READ WRITE or WRITE ONLY mode. You can mix RW and WO volumes in the same tablespace, but can't have RO with any W form, or so I think. It would be pretty handy to be able to add and remove volumes on a live cluster, and this could be a way to implement moving/extending tablespaces. Use Case A: better read performance while keeping data write reliability. The first application of this multiple-volumes-per-tablespace idea is to keep a tablespace both in RAM (tmpfs or ramfs) and on disk (both RW). Then PG should be able to read from both volumes when dealing with read queries, and would have to fwrite()/fsync() both volumes for each write. Of course, write speed will be constrained by the slowest volume, but the quicker one would then be able to take away some amount of read queries meanwhile. It would be neat if PG was able to account for the volumes' relative write speeds in order to assign weights to each tablespace's volumes, and have the planner or executor span read queries among volumes depending on that. For example, if a single query has a plan containing several full scans (of indexes and/or tables) in the same tablespace, those could be done on different volumes. Use Case B: Synchronous Master Slave(s) Replication. By using a Distributed File System capable of being mounted from several nodes at the same time, we could have a configuration where a master node has ('exports') a WO tablespace volume, and one or more slaves (depending on FS capability) configure a RO tablespace volume. PG then has to be able to cope with a RO volume: the data are not written by PG itself (from the local node's point of view), so some limitations would certainly occur. Will it be possible, for example, to add indexes to data on slaves? I'd use the solution even without this, though... When the master/slave link is broken, the master can no longer write to the tablespace, as if it were a local disk failure of some sort, so this should prevent nasty desync problems: data is written on all W volumes or data is not written at all. I realize this proposal is the first draft of a work to be done, and that I won't be able to do a lot more than draft this idea. This mail is sent to the hackers list in the hope someone there will find it worth considering and polishing... Regards, and thanks for the good work ;) -- Dimitri Fontaine
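The read-spanning part of Use Case A amounts to weighted routing of reads across volumes. A sketch of the idea, with made-up volume names and weights; this is not proposed PostgreSQL code:

```python
# Sketch of Use Case A's read routing: each RW volume gets a weight
# proportional to its measured speed, reads are distributed by weight,
# and writes still go to every volume (so the slowest bounds writes).
import random

volumes = {"ramdisk": 10.0, "sata_disk": 1.0}   # weight ~ relative speed

def pick_read_volume(rng):
    names = list(volumes)
    weights = [volumes[n] for n in names]
    return rng.choices(names, weights=weights, k=1)[0]

def write_block(block):
    """Every attached W volume receives the write."""
    return {name: ("wrote", block) for name in volumes}

rng = random.Random(42)                          # seeded for repeatability
picks = [pick_read_volume(rng) for _ in range(1000)]
print(picks.count("ramdisk"), picks.count("sata_disk"))
print(len(write_block(7)) == len(volumes))       # True: all volumes written
```

With a 10:1 weighting the fast volume absorbs roughly 90% of reads, which is the "quicker one takes away some amount of read queries" effect described above.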
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
On 17/02/07, Warren Turkal [EMAIL PROTECTED] wrote: PERIOD(INT) is actually listed in Dr. Snodgrass's book. However, I am not really sure about the semantics of the type. When would you use a PERIOD(INT)? It wouldn't be directly useful for temporal SQL, but I have a number of tables in a database application where a range of integers is mapped onto a 0-100 range (e.g. 1-5 might get mapped onto 1, 6-15 to 2, etc), which I'd like to store using a (preferably non-overlapping) period type. Also, please bring this discussion back on list so that it gets recorded. I didn't want to post your private reply to the list without your permission. Oops, meant to reply to the list originally. Ian
Re: [HACKERS] pg_proc without oid?
On Mon, Feb 19, 2007 at 11:25:02AM +0100, Peter Eisentraut wrote: On Monday, 19 February 2007 10:16, Magnus Hagander wrote: This breaks the fmgrtab generator on msvc. Most likely because I didn't think of that case. But since all other rows in pg_proc.h contain the oid, I just wanted to check if they're actually supposed to be without oid, or if that was a mistake? It's intentional. Could you explain why, and what the expected result is? Since I can't find any other examples of people doing it :-) (will fix the vc stuff, of course, but still interested in knowing) //Magnus
Re: [HACKERS] autovacuum next steps
One option that I've heard before is to have vacuum, after a single iteration (i.e., after it fills maintenance_work_mem and does the index cleanup and the second heap pass), remember where it was and pick up from that point next time. From my experience this is not acceptable... I have tables for which the index cleanup takes hours, so no matter how low I would set maintenance_work_mem (in fact I set it high enough so there's only one iteration), it will take too much time and the queue tables get overly bloated (not happening anyway; they now get special cluster treatment). Cheers, Csaba.
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 11:25:41AM +0100, Dimitri Fontaine wrote: Hi list, Here's a proposal of an idea which stole a good part of my night. I'll present first the idea, then 2 use cases with some rationale and a few details. Please note I won't be able to participate in any development effort associated with this idea, should such a thing happen! The bare idea is to provide a way to 'attach' multiple storage facilities (say volumes) to a given tablespace. Each volume may be attached in READ ONLY, READ WRITE or WRITE ONLY mode. You can mix RW and WO volumes in the same tablespace, but can't have RO with any W form, or so I think. Somehow this seems like implementing RAID within postgres, which seems a bit outside of the scope of a DB. Use Case A: better read performance while keeping data write reliability. The first application of this multiple-volumes-per-tablespace idea is to keep a tablespace both in RAM (tmpfs or ramfs) and on disk (both RW). For example, I don't believe there is a restriction against having one member of a RAID array being a RAM disk. Use Case B: Synchronous Master Slave(s) Replication. By using a Distributed File System capable of being mounted from several nodes at the same time, we could have a configuration where a master node has ('exports') a WO tablespace volume, and one or more slaves (depending on FS capability) configure a RO tablespace volume. Here you have the problem of row visibility. The data in the table isn't very useful without the clog, and that's not stored in a tablespace... Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
Bruce Momjian wrote: Tom Lane wrote: Jim C. Nasby [EMAIL PROTECTED] writes: On Thu, Feb 08, 2007 at 01:54:13PM -0500, Tom Lane wrote: I would be satisfied if the returned command tag were something else, maybe NO OPERATION. TABLE blah DID NOT EXIST might be less confusing... You're confusing a command tag with a notice. In the first place, we shouldn't assume that applications are ready to deal with indefinitely long command tags (the backend itself doesn't think they can be longer than 64 bytes); in the second place, they should be constant strings for the most part so that simple strcmp()s suffice to see what happened. Command tags are meant for programs to deal with, more than humans. Yep. Because IF EXISTS is in a lot of object destruction commands, adding a modified tag seems very confusing, because in fact the DROP TABLE did succeed, so to give any other tag seems incorrect. I don't understand -- what problem you got with NO OPERATION? It seemed a sound idea to me. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] wishlist items ..
Ühel kenal päeval, L, 2007-02-17 kell 13:35, kirjutas Lukas Kahwe Smith: Lukas Kahwe Smith wrote: I just wanted to bring up the wishlist todo items: http://developer.postgresql.org/index.php/Todo:WishlistFor83 What does/did the row Clustered/replication solutions refer to ? -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] wishlist items ..
Hannu Krosing wrote: On Saturday, 2007-02-17 at 13:35, Lukas Kahwe Smith wrote: Lukas Kahwe Smith wrote: I just wanted to bring up the wishlist todo items: http://developer.postgresql.org/index.php/Todo:WishlistFor83 What does/did the row Clustered/replication solutions refer to? There was some discussion early on in 8.3 scoping to add some out-of-the-box solutions for clustering and replication in order to reduce the barrier to entry for people who require such features. regards, Lukas
Re: [HACKERS] New feature request: FlashBack Query
Ühel kenal päeval, P, 2007-02-18 kell 14:27, kirjutas Joshua D. Drake: Hannu Krosing wrote: Ühel kenal päeval, L, 2007-02-17 kell 22:49, kirjutas Chad Wagner: However, they don't have vacuum, we do. Right, and I think that is more or less because Oracle doesn't need it. Vacuum's main purpose (correct me if I am wrong) is to recover/mark rows that are no longer used, and Oracle essentially reuses the space immediately. Obviously with Oracle if you bloat out a table and delete a ton of rows then you have to rebuild the table, but that is more or less the same problem that PostgreSQL has and where vacuum full comes into play. The only benefit with the Oracle model is that you can achieve flashback, which is a very rarely used feature in my book. We can have flashbacks up to the last vacuum. It is just not exposed. Don't vacuum, and you have the whole history. (Actually you can't go for more than 2G transactions, or you get trx id rollover). To get a flashback query, you just have to construct a snapshot from that time and you are done. We don't store transaction times anywere, so the flashback has to be by transaction id, but there is very little extra work involved. We just don't have syntax for saying SELECT ... AS SEEN BY TRANSACTION XXX Well this is certainly interesting. What do we think it would take to enable the functionality? First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction probably it is a good idea to take a lock on all tables involved to avoid a vacuum to be started on them when the query is running. also, we can't trust the DELETED flags in index pages, so we should forbid index scans, or just always re-check the visibility in heap. Otherways it would probably be enough to just scan tuples as usual, and check if they were visible to desired transaction, that is they were inserted before that transaction and they are not deleted before that trx. 
Of course this will not be true, once we have HOT/WIP with in-page vacuuming. -- Hannu Krosing Database Architect Skype Technologies OÜ Akadeemia tee 21 F, Tallinn, 12618, Estonia Skype me: callto:hkrosing Get Skype for free: http://www.skype.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
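Hannu's visibility rule ("inserted before that transaction and not deleted before that trx") can be sketched as a toy model. This ignores in-progress transactions, hint bits, and the rest of PostgreSQL's tqual machinery; the xids are made up:

```python
# Toy model of flashback visibility: a tuple (xmin, xmax) is visible
# "as of" transaction X if its inserter committed before X and any
# deleter did not commit before X. Not PostgreSQL's actual tqual code.

commit_order = {1: True, 2: True, 3: True, 4: True}   # xid -> committed

def visible_as_of(tuple_xmin, tuple_xmax, as_of_xid):
    inserted = commit_order.get(tuple_xmin, False) and tuple_xmin < as_of_xid
    deleted = (tuple_xmax is not None and
               commit_order.get(tuple_xmax, False) and tuple_xmax < as_of_xid)
    return inserted and not deleted

# Row inserted by xid 1, deleted by xid 3:
print(visible_as_of(1, 3, 2))   # True: flashback to before the delete
print(visible_as_of(1, 3, 4))   # False: the delete already committed
```

The row "comes back" when the flashback point predates the deleting transaction, which is exactly the behavior a SELECT ... AS SEEN BY TRANSACTION XXX syntax would expose.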
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
On Monday, 19 February 2007 13:12, Alvaro Herrera wrote: I don't understand -- what problem you got with NO OPERATION? It seemed a sound idea to me. It seems nonorthogonal. What if only some of the tables you mentioned did not exist? Do you get SOME OPERATION? There are also other cases where commands don't have an effect but we don't explicitly point that out. -- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] New feature request: FlashBack Query
Hannu Krosing wrote: On Sunday, 2007-02-18 at 14:27, Joshua D. Drake wrote: Hannu Krosing wrote: On Saturday, 2007-02-17 at 22:49, Chad Wagner wrote: To get a flashback query, you just have to construct a snapshot from that time and you are done. We don't store transaction times anywhere, so the flashback has to be by transaction id, but there is very little extra work involved. We just don't have syntax for saying SELECT ... AS SEEN BY TRANSACTION XXX Well this is certainly interesting. What do we think it would take to enable the functionality? First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction. Probably it is a good idea to take a lock on all tables involved to avoid a vacuum being started on them while the query is running. Would the xmin exported by that transaction prevent vacuum from removing any tuples still needed for the flashback snapshot? greetings, Florian Pflug
Re: [HACKERS] New feature request: FlashBack Query
Florian G. Pflug wrote: Hannu Krosing wrote: On Sunday, 2007-02-18 at 14:27, Joshua D. Drake wrote: Hannu Krosing wrote: On Saturday, 2007-02-17 at 22:49, Chad Wagner wrote: To get a flashback query, you just have to construct a snapshot from that time and you are done. We don't store transaction times anywhere, so the flashback has to be by transaction id, but there is very little extra work involved. We just don't have syntax for saying SELECT ... AS SEEN BY TRANSACTION XXX Well this is certainly interesting. What do we think it would take to enable the functionality? First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction. Probably it is a good idea to take a lock on all tables involved to avoid a vacuum being started on them while the query is running. Would the xmin exported by that transaction prevent vacuum from removing any tuples still needed for the flashback snapshot? Sure, and that makes the mentioned lock unnecessary. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc.
Re: [HACKERS] New feature request: FlashBack Query
Well this is certainly interesting. What do we think it would take to enable the functionality? First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction We could use something that controls global xmin. It would ensure that global xmin does not advance beyond what still needs to be visible. This would probably be a sliding time window, or a fixed point in time that is released by the dba/user. Then all below is not really different from a situation where you had a long running tx. probably it is a good idea to take a lock on all tables involved to avoid a vacuum to be started on them when the query is running. also, we can't trust the DELETED flags in index pages, so we should forbid index scans, or just always re-check the visibility in heap. Otherwise it would probably be enough to just scan tuples as usual, and check if they were visible to the desired transaction, that is they were inserted before that transaction and they are not deleted before that trx. Of course this will not be true, once we have HOT/WIP with in-page vacuuming. Currently I think HOT does honor global xmin. There is no lookup for relevant xids, so parts of an update chain where only a previous tuple or a later tuple can be visible are reused. Else HOT would need to be told not to, in a scenario where a backend can choose a snapshot at will. Andreas
Re: [HACKERS] New feature request: FlashBack Query
First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction probably it is a good idea to take a lock on all tables involved to avoid a vacuum to be started on them when the query is running. Would the xmin exported by that transaction prevent vacuum from removing any tuples still needed for the flashback snapshot? Sure, and that makes the mentioned lock unnecessary. The problem is that that transaction sets a historic snapshot at a later time, so it is not yet running when vacuum looks at global xmin. So something else needs to hold up global xmin (see prev post). Andreas
Re: [HACKERS] New feature request: FlashBack Query
Zeugswetter Andreas ADI SD wrote: First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction probably it is a good idea to take a lock on all tables involved to avoid a vacuum to be started on them when the query is running. Would the xmin exported by that transaction prevent vacuum from removing any tuples still needed for the flashback snapshot? Sure, and that makes the mentioned lock unnecessary. Problem is, that that transaction sets a historic snapshot at a later time, so it is not yet running when vacuum looks at global xmin. So something else needs to hold up global xmin (see prev post). I think to make this flashback stuff fly, you'd need to know the earliest xmin that you can still flashback to. Vacuum would advance that xmin as soon as it starts working. So the case you'd need to protect against would be a race condition when you start a vacuum and a flashback transaction at the same time. But for that, some simple semaphore should suffice, together with a well-thought-out ordering of the actions taken. In the long run, you'd probably want to store the commit times of transactions somewhere, and add some guc that makes vacuum assume that recently committed transactions (say, in the last hour) are still considered active. That would allow the DBA to guarantee that he can always flashback at least an hour. greetings, Florian Pflug
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
Peter Eisentraut [EMAIL PROTECTED] writes: Am Montag, 19. Februar 2007 13:12 schrieb Alvaro Herrera: I don't understand -- what problem you got with NO OPERATION? It seemed a sound idea to me. It seems nonorthogonal. What if only some of the tables you mentioned did not exist? Do you get SOME OPERATION? I'd say you get DROP TABLE as long as at least one table was dropped. There are also other cases where commands don't have an effect but we don't explicitly point that out. The precedent that I'm thinking about is that the command tag for COMMIT varies depending on what it actually did.

regression=# begin;
BEGIN
regression=# select 1/0;
ERROR:  division by zero
regression=# commit;
ROLLBACK
regression=#

regards, tom lane
Re: [HACKERS] n-gram search function
On Mon, 19 Feb 2007, Guillaume Smet wrote: On 2/19/07, Oleg Bartunov oleg@sai.msu.su wrote: pg_trgm was developed for spelling correction and there is a threshold of similarity, which is 0.3 by default. The Readme explains what it means. Yes, I read it. Similarity could be very low, since you didn't make a separate column and the length of the full string is used to normalize similarity. Yep, that's probably my problem. Ignored records are a bit longer than the others. I tried the tip in README.pg_trgm to generate a table with all the words. It can do the work in conjunction with tsearch2 and a bit of AJAX to suggest the full words to the users. The reason why I was not using tsearch2 is that it's sometimes hard to spell location names correctly. The only problem is that it is still quite slow on a 50k-row words table but I'll make further tests on a decent server this afternoon. You need to wait for GiN support. Regards, Oleg -- Oleg Bartunov, Research Scientist, Head of AstroNet (www.astronet.ru), Sternberg Astronomical Institute, Moscow University, Russia Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/ phone: +007(495)939-16-83, +007(495)939-23-83
Re: [HACKERS] pg_restore fails with a custom backup file
On Sat, Feb 17, 2007 at 01:28:22PM -0500, Tom Lane wrote: Magnus Hagander [EMAIL PROTECTED] writes: I'd also like a comment from at least one other patch reviewer that the methods used are good. It looks reasonable as far as it goes. One thought is that pg_dump really should have noticed that it was writing a broken archive. On machines where off_t is 32 bits, can't we detect the overflow situation? Tested on MSVC as well, works. Also tested and doesn't break the build on Linux (which shouldn't be affected at all). So, patch applied to HEAD and 8.2. //Magnus
Re: [HACKERS] RFC: Temporal Extensions for PostgreSQL
Dawid Kuroczko [EMAIL PROTECTED] writes: ... Now, assuming a UNIQUE INDEX on such a table, the order would be preserved since no two intervals can overlap. And no overlapping data could be inserted without breaking the non-overlap property. And of course a non-unique index would produce garbage (since left of/right of wouldn't make any sense anymore). I think actually it doesn't work for unique indexes either :-( because of dead tuples. Consider that we have in the index

... (1,2) (6,8)DEAD (4,10) (12,14) ...

Since under the given operators (6,8) and (4,10) are equal, btree will not guarantee that those index entries appear in any particular relative order. Thus the above is a legal index configuration. Now insert (3,5). This should surely be rejected because it overlaps (4,10). But what may well happen is that it gets compared to (1,2) --- OK, it's greater --- and to (6,8) --- OK, it's less --- and then the uniqueness check stops, because if it's less than (6,8) then there is no need to search further. Ooops. *This* is why the transitive law is essential. regards, tom lane
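The failure scenario can be simulated directly: a btree-style uniqueness probe that walks entries in stored order and stops as soon as it sees an entry sorting strictly after the new key misses the overlapping (4,10) entry. Purely illustrative, not PostgreSQL's nbtree code:

```python
# Simulation of Tom Lane's example: because (6,8) and (4,10) are
# "equal" under the overlaps operator, btree may store them in either
# order, and the early-stop uniqueness check can miss a conflict.

def overlaps(a, b):
    return not (a[1] < b[0] or a[0] > b[1])

def btree_unique_check(index_entries, new_key):
    """Walk in stored order; stop once an entry sorts strictly after
    new_key (i.e. new_key is entirely left of it)."""
    for entry in index_entries:
        if new_key[1] < entry[0]:       # entry > new_key: stop searching
            return True                 # (wrongly) report no conflict
        if overlaps(entry, new_key):
            return False                # conflict found
    return True

# A legal layout, since (6,8) and (4,10) compare equal:
print(btree_unique_check([(1, 2), (6, 8), (4, 10), (12, 14)], (3, 5)))  # True -- oops
# The other legal layout catches the conflict:
print(btree_unique_check([(1, 2), (4, 10), (6, 8), (12, 14)], (3, 5)))  # False
```

The two orderings give different answers for the same logical index contents, which is precisely why a non-transitive "equality" cannot back a btree unique index.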
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
On 2/19/07, Tom Lane [EMAIL PROTECTED] wrote: Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 13:12, Alvaro Herrera wrote: I don't understand -- what problem you got with NO OPERATION? It seemed a sound idea to me. It seems nonorthogonal. What if only some of the tables you mentioned did not exist? Do you get SOME OPERATION? I'd say you get DROP TABLE as long as at least one table was dropped. How about DROP TABLE <cnt>, where 'cnt' is the number of tables dropped? Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] New feature request: FlashBack Query
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction We could use something that controls global xmin. It would ensure that global xmin does not advance beyond what still needs to be visible. This would probably be a sliding time window, or a fixed point in time that is released by the dba/user. Well there's another detail you have to cover aside from rolling back your xmin. You have to find the rest of the snapshot, including knowing what other transactions were in-progress at the time you want to flash back to. If you just roll back xmin and set xmax to the same value you'll get a consistent view of the database but it may not match a view that was ever current. That is, some of the transactions after the target xmin may have committed before that xmin. So there was never a time in the database when they were invisible but your new xmin was visible. I think to do this you'll need to periodically record a snapshot and then later restore one of those saved snapshots. Not sure where would be a good place to record them. The WAL seems like a handy place but digging through the WAL would be annoying. Incidentally this is one of the things that would be useful for read-only access to PITR warm standby machines. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
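Gregory's point about the synthetic snapshot never having been current can be sketched with a toy commit history (illustrative Python, with made-up xids; not actual snapshot code):

```python
# Toy commit history: xid -> the order in which it committed.
# xid 102 was assigned after 101 but committed first.
commit_order = {100: 1, 102: 2, 101: 3}

def naive_flashback_visible(xid, xmin):
    # roll xmin back and set xmax = xmin: visible iff xid < xmin
    return xid < xmin

def historical_visible(xid, t):
    # a snapshot that was actually current at time t
    return commit_order[xid] <= t

naive = {x for x in commit_order if naive_flashback_visible(x, 102)}
assert naive == {100, 101}

# No real moment ever showed 101 without 102, because 102 committed
# earlier; the synthetic view is consistent but was never current:
real_views = [{x for x in commit_order if historical_visible(x, t)}
              for t in range(4)]
assert naive not in real_views
```

Saving real snapshots periodically, as Gregory suggests, sidesteps the problem because each saved snapshot corresponds to an actual moment.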
Re: [HACKERS] Howto change db cluster locale on-the-fly
Martijn van Oosterhout kleptog@svana.org writes: But I guess something like this would work: 1) drop all indexes on text/varchar columns 2) change cluster locale 3) create all indexes on text/varchar columns You're going to miss the name columns, ie. every string index in pg_catalog. But name is not locale-aware --- it just uses strcmp(). AFAIR there aren't any locale-dependent indexes in the system catalogs. So in principle you could hack pg_control, restart the postmaster, and then reindex every locale-dependent index. Hacking pg_control would be the hard part; you'll never get the CRC right if you do it manually. Possibly pg_resetxlog could be adapted to the purpose. I'd suggest single user mode at least, and make backups! Yup, a filesystem backup would be a *real* good idea. Not to mention testing the procedure on a toy installation. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_proc without oid?
Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 10:16, Magnus Hagander wrote: This breaks the fmgrtab generator on MSVC. Most likely because I didn't think of that case. But since all other rows in pg_proc.h contain the oid, I just wanted to check if they're actually supposed to be without oid, or if that was a mistake? It's intentional. Kindly change that intention. regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Short varlena headers and arrays
Gregory Stark [EMAIL PROTECTED] writes: Once this is done it may be worth having arrays convert to short varlenas as well. Elements of arrays are not subject to being toasted by themselves, so I don't think you can make that work. At least not without breaking wide swaths of code that works fine today. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] pg_proc without oid?
On Monday, 19 February 2007 at 16:26, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 10:16, Magnus Hagander wrote: This breaks the fmgrtab generator on MSVC. Most likely because I didn't think of that case. But since all other rows in pg_proc.h contain the oid, I just wanted to check if they're actually supposed to be without oid, or if that was a mistake? It's intentional. Kindly change that intention. What is wrong? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
On Monday, 19 February 2007 at 15:57, Tom Lane wrote: The precedent that I'm thinking about is that the command tag for COMMIT varies depending on what it actually did. Some have also argued against that in the past, so I guess we just have different ideas of how it should work. Not a problem. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] pg_proc without oid?
Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 16:26, Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 10:16, Magnus Hagander wrote: This breaks the fmgrtab generator on MSVC. It's intentional. Kindly change that intention. What is wrong? Well, in the first place Gen_fmgrtab.sh is producing garbage: #define F_CURSOR_TO_XML DATAINSERT #define F_CURSOR_TO_XMLSCHEMA DATAINSERT #define F_QUERY_TO_XML DATAINSERT #define F_QUERY_TO_XML_AND_XMLSCHEMA DATAINSERT #define F_QUERY_TO_XMLSCHEMA DATAINSERT #define F_TABLE_TO_XML DATAINSERT #define F_TABLE_TO_XML_AND_XMLSCHEMA DATAINSERT #define F_TABLE_TO_XMLSCHEMA DATAINSERT #define F_BYTEAOUT 31 #define F_CHAROUT 33 const FmgrBuiltin fmgr_builtins[] = { { 0, cursor_to_xml, 5, true, false, cursor_to_xml }, { 0, cursor_to_xmlschema, 4, true, false, cursor_to_xmlschema }, { 0, query_to_xml, 4, true, false, query_to_xml }, { 0, query_to_xml_and_xmlschema, 4, true, false, query_to_xml_and_xmlschema }, { 0, query_to_xmlschema, 4, true, false, query_to_xmlschema }, { 0, table_to_xml, 4, true, false, table_to_xml }, { 0, table_to_xml_and_xmlschema, 4, true, false, table_to_xml_and_xmlschema }, { 0, table_to_xmlschema, 4, true, false, table_to_xmlschema }, { 31, byteaout, 1, true, false, byteaout }, The fact that that table is broken means you're incurring expensive linear searches to invoke these functions. It's only by chance that it works at all... In the second place, if you don't want to predetermine OIDs for your functions then they shouldn't be in hardwired pg_proc.h rows at all. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
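The performance cost Tom mentions can be sketched with a toy lookup table (Python, purely illustrative; the OID 2920 below is made up):

```python
import bisect

# A toy fmgr_builtins: (oid, name) pairs kept sorted by OID so that
# function lookup can binary-search instead of scanning linearly.
builtins = [(0, "query_to_xml"), (0, "table_to_xml"),
            (31, "byteaout"), (33, "charout")]
oids = [o for o, _ in builtins]

def fmgr_lookup(oid):
    i = bisect.bisect_left(oids, oid)
    if i < len(oids) and oids[i] == oid:
        return builtins[i][1]
    return None    # caller falls back to an expensive linear scan

assert fmgr_lookup(31) == "byteaout"
# The xml functions get a real OID assigned at runtime (2920 here is
# hypothetical), but the compiled-in table only knows them as 0, so the
# fast path can never find them:
assert fmgr_lookup(2920) is None
```

Every call to one of the zero-OID functions therefore misses the binary search and degrades to a slow fallback, which is the "expensive linear searches" Tom complains about.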
Re: [HACKERS] n-gram search function
On 2/19/07, Oleg Bartunov oleg@sai.msu.su wrote: You need to wait GiN support. OK. Thanks. If you need testers for this one, feel free to contact me. I'm very interested in testing pg_trgm in conjunction with tsearch2. -- Guillaume ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pg_proc without oid?
Tom Lane [EMAIL PROTECTED] writes: In the second place, if you don't want to predetermine OIDs for your functions then they shouldn't be in hardwired pg_proc.h rows at all. Is there any place to hook in to create things like procedures or other SQL objects that don't really need hard coded OIDs? It seems like we could make the catalogs much easier to maintain by ripping out everything that isn't needed by the system tables themselves and having initdb create them by running a plain SQL script. In particular I'm looking towards having all the operators and associated hardware except for the basic btree operators for types used by the system tables be created in plain SQL. It might also reduce the pain OID conflicts cause. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Monday 19 February 2007 at 16:33, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I thought moving some knowledge about data availability into PostgreSQL code could provide some valuable performance benefit, allowing it to organize reads (for example parallel table scans/index scans to different volumes) and to obtain data from the 'quicker' known volume (or the least used/loaded one). You're both saying RAID/LVM implementations provide good enough performance for PG not to have to go this way, if I understand correctly. And distributed file systems are enough to have the replication stuff, without PG having to deal explicitly with the work involved. Maybe I should have slept after all ;) Thanks for your time and comments, regards, -- Dimitri Fontaine ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] pg_proc without oid?
On Monday, 19 February 2007 at 16:50, Tom Lane wrote: Well, in the first place Gen_fmgrtab.sh is producing garbage: Uh, ok, that needs fixing. In the second place, if you don't want to predetermine OIDs for your functions then they shouldn't be in hardwired pg_proc.h rows at all. Where else would you put them? -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] pg_proc without oid?
Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 16:50, Tom Lane wrote: In the second place, if you don't want to predetermine OIDs for your functions then they shouldn't be in hardwired pg_proc.h rows at all. Where else would you put them? SQL script maybe, much along the lines Greg was just mentioning. (I'd been thinking myself earlier that pg_amop/amproc/etc would be a whole lot easier to maintain if we could feed CREATE OPERATOR CLASS commands to the bootstrap process.) But getting there will take nontrivial work; you can't just decide to leave out a few OIDs on the spur of the moment. Magnus, I'd suggest reverting whatever you did to your MSVC script, so we'll find out the next time someone makes this mistake... regards, tom lane ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] New feature request: FlashBack Query
On Mon, Feb 19, 2007 at 04:00:09PM +0100, Florian G. Pflug wrote: [...] In the long run, you'd probably want to store the commit-times of transactions somewhere, and add some guc that makes a vacuum assume that recently committed transactions (say, in the last hour) are still considered active [...] Funny how some things recur: http://archives.postgresql.org/pgsql-hackers/2007-01/msg01301.php (says I and seeks shelter beneath a big rock ;-) Regards - -- tomás ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Dimitri Fontaine [EMAIL PROTECTED] writes: You're both saying RAID/LVM implementations provide good enough performances for PG not having to go this way, if I understand correctly. There's certainly no evidence to suggest that reimplementing them ourselves would be a productive use of our time. regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 05:10:36PM +0100, Dimitri Fontaine wrote: RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I thought moving some knowledge about data availability into PostgreSQL code could provide some valuable performance benefit, allowing it to organize reads (for example parallel table scans/index scans to different volumes) and to obtain data from the 'quicker' known volume (or the least used/loaded one). Well, organising requests to be handled quickly is a function of LVM/RAID, so we don't go there. However, speeding up scans by having multiple requests is an interesting approach, as would perhaps a different random_page_cost for different tablespaces. My point is, don't try to implement the mechanics of LVM/RAID into postgres; instead, work on providing ways for users to take advantage of these mechanisms if they have them. Look at it as if you have got an LVM/RAID setup for your ideas: how do you get postgres to take advantage of it? Have a nice day, -- Martijn van Oosterhout kleptog@svana.org http://svana.org/kleptog/ From each according to his ability. To each according to his ability to litigate.
Re: [HACKERS] pg_proc without oid?
Peter Eisentraut wrote: On Monday, 19 February 2007 at 16:50, Tom Lane wrote: Well, in the first place Gen_fmgrtab.sh is producing garbage: Uh, ok, that needs fixing. In the second place, if you don't want to predetermine OIDs for your functions then they shouldn't be in hardwired pg_proc.h rows at all. Where else would you put them? But _why_ wouldn't you want to have fixed OIDs for the functions? I'm not seeing the benefit. -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Chatter on DROP SOMETHING IF EXISTS
Tom Lane wrote: Peter Eisentraut [EMAIL PROTECTED] writes: On Monday, 19 February 2007 at 13:12, Alvaro Herrera wrote: I don't understand -- what problem you got with NO OPERATION? It seemed a sound idea to me. It seems nonorthogonal. What if only some of the tables you mentioned did not exist? Do you get SOME OPERATION? I'd say you get DROP TABLE as long as at least one table was dropped. If we went with DROP TABLE if any table was dropped, and NO OPERATION for none, I am fine with that. What I didn't want was a different NO OPERATION-type of message for every object type. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Confusing message on startup after a crash while recovering
Tom Lane wrote: Florian G. Pflug [EMAIL PROTECTED] writes: I'd suggest that the text is changed to something along the line of: database system was interrupted while in recovery at ... If this has occurred more than once some data may be corrupted and you may need to restore from the last backup. It seems the real problem is that it's not specifying *which* data is probably corrupted. Maybe: HINT: If recovery fails repeatedly, it probably means that the recovery log data is corrupted; you may have to restore from your last full backup. IMHO that wording would be fine too - the important point for me is to clearly state that corrupted data is maybe the _cause_ of the crash, and not the _effect_ of the crash. And for the sake of consistency, the messages for abort-during-recovery and abort-during-archivelog-replay should be similar. Also, do we want to suggest use of pg_resetxlog in the message? I'd rather add some documentation of how to use pg_resetxlog to the manual if it's not already there, and maybe reference that chapter in a HINT message. In that manual chapter you can warn about the dangers of pg_resetxlog, and put in advice to back up the database before using it. I think such a warning is important, because any documentation of pg_resetxlog is targeted at users who are not familiar with postgres internals, and those users are likely to shoot themselves in the foot if you point them to pg_resetxlog. greetings, Florian Pflug ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On 2/19/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Guillaume Smet wrote: On 2/19/07, Alvaro Herrera [EMAIL PROTECTED] wrote: So add the session ID (%c) to log_line_prefix. It could work if log_line_prefix was added before every line but it's definitely not the case: myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT * FROM lieu LIMIT 10; if you execute: SELECT * FROM lieu LIMIT 10; Interesting. I wonder why you didn't report this as a bug before? Maybe we could have discussed it and fixed it. Perhaps because I thought it was not really a bug but the intended behaviour. Syslog has the same behaviour and it's quite logical when you consider how queries are logged (I've spent a few hours in the logging code). Syslog has exactly the same behaviour but adds the necessary context information. I'm pretty sure I have explained the problem a few times on the lists though, but perhaps it was just on IRC. From the feedback I have on pgFouine, very few people think it's a real problem, probably because they don't use query logging as we do: our production servers have it enabled all the time and we have a high load on them, so this particular case is a common case for us. (Second try to move this discussion to -hackers) -- Guillaume ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
Guillaume Smet wrote: On 2/19/07, Alvaro Herrera [EMAIL PROTECTED] wrote: Guillaume Smet wrote: On 2/19/07, Alvaro Herrera [EMAIL PROTECTED] wrote: So add the session ID (%c) to log_line_prefix. It could work if log_line_prefix was added before every line but it's definitely not the case: myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT * FROM lieu LIMIT 10; if you execute: SELECT * FROM lieu LIMIT 10; Interesting. I wonder why you didn't report this as a bug before? Maybe we could have discussed it and fixed it. Perhaps because I thought it was not really a bug but the intended behaviour. Syslog has the same behaviour and it's quite logical when you consider how queries are logged (I've spent a few hours in the logging code). Syslog has exactly the same behaviour but adds the necessary context information. If it adds necessary context then it clearly does not have the same behavior, because the problem is precisely that the context is missing. I'd propose adding a log_entry_prefix separate from log_line_prefix; the entry prefix would contain most of the stuff, and log_line_prefix would be a minimal thing intended to be put in front of each _line_, so the example you show above could be myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT * 45d9d615.4abe FROM lieu 45d9d615.4abe LIMIT 10; where you have log_entry_prefix=%d %u log_line_prefix=%c Really, prefixing with a tab does not strike me as a great idea precisely because it's ambiguous. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support ---(end of broadcast)--- TIP 6: explain analyze is your friend
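Alvaro's proposed split could be mocked up like this (a Python sketch of the formatting only, not the actual elog code, using the prefixes from his example):

```python
def format_entry(message, entry_prefix, line_prefix):
    # Prefix the first line of a log entry with entry_prefix plus
    # line_prefix, and each continuation line with line_prefix alone.
    first, *rest = message.splitlines()
    lines = [f"{entry_prefix} {line_prefix} {first}"]
    lines += [f"{line_prefix} {line}" for line in rest]
    return "\n".join(lines)

msg = ("LOG: duration : 185.223 ms, statement : SELECT *\n"
       "FROM lieu\n"
       "LIMIT 10;")
print(format_entry(msg, "myuser mydb", "45d9d615.4abe"))
# myuser mydb 45d9d615.4abe LOG: duration : 185.223 ms, statement : SELECT *
# 45d9d615.4abe FROM lieu
# 45d9d615.4abe LIMIT 10;
```

Because every continuation line carries the session ID, a log parser like pgFouine can reassemble multi-line statements unambiguously, which is the context that plain log_line_prefix loses.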
[HACKERS] How can I merge a TargetEntry with a tuple?
Hi, I started working again on my IDENTITY/GENERATED patch. My question is $SUBJECT. This code is in rewriteTargetlist(): new_attr = build_column_default() new_tle = makeTargetEntry((Expr *) new_expr, ...) Now, in ExecInsert() I have to compute the default for IDENTITY/GENERATED between ExecConstraints() and heap_insert(). How can I create a Datum out of either an Expr or a TargetEntry (that contains the computed constant out of the default expression) so I can use it after I did an heap_deform_tuple()? Best regards, Zoltán Böszörményi ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] wishlist items ..
On Mon, Feb 19, 2007 at 01:28:46PM +0100, Lukas Kahwe Smith wrote: Hannu Krosing wrote: On Saturday, 2007-02-17 at 13:35, Lukas Kahwe Smith wrote: Lukas Kahwe Smith wrote: I just wanted to bring up the wishlist todo items: http://developer.postgresql.org/index.php/Todo:WishlistFor83 What does/did the row Clustered/replication solutions refer to? there was some discussion early on in 8.3 scoping to add some out of the box solutions for clustering and replication in order to reduce the barrier to entry for people who require such features. I believe this has been started as a replication hooks project, although I'm unsure of its current status. Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778 AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On 2/19/07, Alvaro Herrera [EMAIL PROTECTED] wrote: If it adds necessary context then it clearly does not have the same behavior, I mean log_line_prefix behaviour is the same. The other information is syslog-specific. I'd propose adding a log_entry_prefix separate from log_line_prefix; the entry prefix would contain most of the stuff, and log_line_prefix would be a minimal thing intended to be put in front of each _line_, so the example you show above could be It could be a good idea. It won't make me use stderr output but it will allow other people to do so without any disadvantage :). Really, prefixing with a tab does not strike me as a great idea precisely because it's ambiguous. Sure. -- Guillaume ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] pg_proc without oid?
Tom Lane wrote: SQL script maybe, much along the lines Greg was just mentioning. I would welcome that, although a similar suggestion was rejected a few years ago, which is why I didn't pursue it here. you can't just decide to leave out a few OIDs on the spur of the moment. I still don't understand why that would be a problem, aside from the fmgrtab problem that is specific to pg_proc. Other system catalogs also have mixed entries with and without explicit OIDs. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I expect that someone will point out that Windows doesn't support RAID or LVM, and we'll have to reimplement it anyway. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Peter Eisentraut wrote: Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I expect that someone will point out that Windows doesn't support RAID or LVM, and we'll have to reimplement it anyway. Windows supports both RAID and LVM. //Magnus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Peter Eisentraut wrote: Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I expect that someone will point out that Windows doesn't support RAID or LVM, and we'll have to reimplement it anyway. windows supports software raid just fine since Windows 2000 or so ... Stefan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Short varlena headers and arrays
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: Once this is done it may be worth having arrays convert to short varlenas as well. Elements of arrays are not subject to being toasted by themselves, so I don't think you can make that work. At least not without breaking wide swaths of code that works fine today. You think it's more likely there are places that build arrays and then read the items back without passing through detoast than there are places that build tuples and do so? Btw I ran into some problems with system tables. Since many of them are read using the GETSTRUCT method and in that method the first varlena field should be safely accessible, I would have to not skip the alignment for the first varlena field in system tables. Instead I just punt on all system tables. The only one that seems like it'll be a loss on is pg_statistic, and there the biggest problem is the space wasted inside the arrays, not before the varlena fields. Also, int2vector and oidvector don't expect to be toasted so I've skipped them as well. If we want to have an escape hatch they would have to be so marked. For now I just hard coded them. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] New feature request: FlashBack Query
Gregory Stark wrote: Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction We could use something that controls global xmin. It would ensure that global xmin does not advance beyond what still needs to be visible. This would probably be a sliding time window, or a fixed point in time that is released by the dba/user. Well there's another detail you have to cover aside from rolling back your xmin. You have to find the rest of the snapshot including knowing what other transactions were in-progress at the time you want to flash back to. If you just roll back xmin and set xmax to the same value you'll get a consistent view of the database but it may not match a view that was ever current. That is, some of the transactions after the target xmin may have committed before that xmin. So there was never a time in the database when they were invisible but your new xmin was visible. [...] Incidentally this is one of the things that would be useful for read-only access to PITR warm standby machines. Couldn't you define things simply to be that you get a consistent view including all transactions started before transaction x? This is time-travel lite, but low overhead, which I think is a key benefit of this approach. A huge value for this would be in the "oops, I deleted my data" category. PostgreSQL rarely loses data, but clients seem to have a habit of doing so, and then going "oops". This seems to happen most often when facing something like a reporting deadline where they are moving lots of stuff around and making copies and sometimes delete the wrong company recordset or equivalent, even with confirmation dialogs at the app level. This would give a quick and easy "oops" procedure to the client. The DBA sets the GUC to 1hr and tells the client: if you make a big mistake, stop the database server as follows and call. 
Frankly, it would bail a few DBAs out as well. The key is how lightweight the setup could be, which matters because clients are not always willing to pay for a PITR setup. The low overhead would mean you'd feel fine about setting the GUC to 1hr or so. As a % of total installed instances I suspect the % with PITR is small. I've got stuff I snapshot nightly, but that's it. So I don't have an easy out from the "oops" query either. - August ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On Mon, 19 Feb 2007, Guillaume Smet wrote: Why not simply put something like %log_table% in the sql file and let the admin replace it with sed or whatever he likes? This is a reasonable approach. I would suggest that no special characters be used though, so that the SQL could be used as-is by a DBA who doesn't even know about or want to use tools like sed. I also think the default should be a pg_ name like the pg_log I suggested, partly because I'd like this to be completely internal one day--just push the logs into the database directly without even passing through an external file first. Also, something like pg_log is unlikely to cause a conflict with existing tables. I would bet there's already databases out there with tables called log_table, for example, but everyone already avoids naming application tables starting with pg_. A workable syntax might be INSERT INTO "pg_log" ... The redundant quotation marks will make it easier to do a search/replace to change the table name without worrying about accidentally impacting the text of the message, so that even people who aren't aware how to build a regular expression that only modifies the first match will probably be OK. I consider using the same name as the default log directory helpful, but would understand that others might consider it confusing to overload the name like that. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
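Greg's search/replace idea can be sketched like this (Python standing in for sed; the log line and its column list are hypothetical, invented for the example):

```python
import re

# A hypothetical log line in the suggested format; note the quoted table
# name can also appear inside the logged statement text.
line = ('INSERT INTO "pg_log" (logtime, message) '
        'VALUES (now(), \'DROP TABLE "pg_log"?\');')

# Anchoring the replacement to the start of the line touches only the
# table name after INSERT INTO, never quoted text inside the message:
renamed = re.sub(r'^(INSERT INTO )"pg_log"', r'\1"my_log"', line)

assert renamed.startswith('INSERT INTO "my_log"')
assert 'DROP TABLE "pg_log"?' in renamed   # message text untouched
```

The quoting makes even a naive first-match replace safe, which is exactly Greg's point about DBAs who don't build careful regular expressions.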
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Magnus Hagander wrote: Windows supports both RAID and LVM. Oh good, so we've got that on record. :) -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] New feature request: FlashBack Query
August Zajonc wrote: Gregory Stark wrote: Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: First we must run the query in serializable mode and replace the snapshot with a synthetic one, which defines visibility at the start of the desired transaction We could use something that controls global xmin. It would ensure, that global xmin does not advance bejond what still needs to be visible. This would probably be a sliding time window, or a fixed point in time that is released by the dba/user. Well there's another detail you have to cover aside from rolling back your xmin. You have to find the rest of the snapshot including knowing what other transactions were in-progress at the time you want to flash back to. If you just roll back xmin and set xmax to the same value you'll get a consistent view of the database but it may not match a view that was ever current. That is, some of the transactions after the target xmin may have committed before that xmin. So there was never a time in the database when they were invisible but your new xmin was visible. [...] Incidentally this is one of the things that would be useful for read-only access to PITR warm standby machines. Couldn't you define things simply to be that you get a consistent view including all transactions started before x transaction? This is time travel lite, but low overhead which I think is a key benefit of this approach. I was thinking along the same line. Flashback is probably ony really usefull on databases that are mostly read-only, but with a few users who update data. You'd use flashback to undo catastrophic changes done by accident, and probably will gladly accept that you undo a little more work than strictly necessary. On the contrary, if you're running a online shop were people buy stuff 24/7, and, say, somebody accidentally deletes some producs, than you won't want to loose the orders happened during that last hour, but will rather try to regenerate that products from your last backup. 
So I don't think that it's too important what snapshot you get exactly, making the xmin=xmax idea feasible. The same holds true for PITR warm standby (read-only queries on PITR slaves). This would be used for reporting, or load-balancing of searches in fairly static data - all of which won't depend on the exact snapshot you get. greetings, Florian Pflug ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
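Gregory Stark's point above - that rolling xmin back and setting xmax to the same value gives a consistent view that may never have been current - can be illustrated with a toy model. This is a hedged sketch in Python, not PostgreSQL internals; the class and field names are hypothetical:

```python
# Toy MVCC snapshot: a naive "flashback" snapshot built by rolling xmin back
# and setting xmax = xmin has an empty in-progress list, so it treats every
# committed transaction with xid < xmin as visible - even one that was still
# open at that historical moment.

class Snapshot:
    def __init__(self, xmin, xmax, in_progress):
        self.xmin = xmin                     # xids below this are "settled"
        self.xmax = xmax                     # xids at or above this are invisible
        self.in_progress = set(in_progress)  # xids open when snapshot was taken

    def is_visible(self, xid, committed):
        if xid >= self.xmax:
            return False
        if xid in self.in_progress:
            return False
        return committed.get(xid, False)

# Historical truth: xid 90 was still open when xid 100 started.
committed = {90: True, 95: True}
true_past = Snapshot(xmin=100, xmax=100, in_progress=[90])
naive_flashback = Snapshot(xmin=100, xmax=100, in_progress=[])

print(true_past.is_visible(90, committed))       # False: it was in progress
print(naive_flashback.is_visible(90, committed)) # True: never-current view
```

The discrepancy on xid 90 is exactly the "view that was never current" case; Florian's argument is that for the undo-a-catastrophe use case this discrepancy is acceptable.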
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Stefan Kaltenbrunner wrote: Peter Eisentraut wrote: Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I expect that someone will point out that Windows doesn't support RAID or LVM, and we'll have to reimplement it anyway. windows supports software raid just fine since Windows 2000 or so ... Longer than that... it supported mirroring and raid 5 in NT4 and possibly even NT3.51 IIRC. Joshua D. Drake Stefan ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Deadlock with pg_dump?
On Tue, 2007-02-13 at 22:19 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Thu, 2006-10-26 at 18:45 -0400, Tom Lane wrote: Chris Campbell [EMAIL PROTECTED] writes: Is there additional logging information I can turn on to get more details? I guess I need to see exactly what locks both processes hold, and what queries they were running when the deadlock occurred? Is that easily done, without turning on logging for *all* statements? log_min_error_statement = error would at least get you the statements reporting the deadlocks, though not what they're conflicting against. Yeh, we need a much better locking logger for performance analysis. We really need to dump the whole wait-for graph for deadlocks, since this might be more complex than just two statements involved. Deadlocks ought to be so infrequent that we can afford the log space to do this - plus if we did this it would likely lead to fewer deadlocks. For 8.3 I'd like to have a log_min_duration_lockwait (secs) parameter that would allow you to dump the wait-for graph for any data-level locks that wait too long, rather than just those that deadlock. Many applications experience heavy locking because of lack of holistic design. That will also show up the need for other utilities to act CONCURRENTLY, if possible. Old email, but I don't see how our current output is not good enough? test= lock a; ERROR: deadlock detected DETAIL: Process 6855 waits for AccessExclusiveLock on relation 16394 of database 16384; blocked by process 6795. Process 6795 waits for AccessExclusiveLock on relation 16396 of database 16384; blocked by process 6855. This detects deadlocks, but it doesn't detect lock waits. When I wrote that it was previous experience driving me. Recent client experience has highlighted the clear need for this. We had a lock wait of 50 hours because of an RI check; that's the kind of thing I'd like to show up in the logs somewhere. 
Lock wait detection can be used to show up synchronisation points that have been inadvertently designed into an application, so it's a useful tool in investigating performance issues. I have a patch implementing the logging as agreed with Tom, will post to patches later tonight. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
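Dumping the whole wait-for graph, as Simon proposes above, amounts to recording every waiter-to-holder edge and reporting any cycle, not just the two processes the current error message names. A minimal sketch (hypothetical Python, not the server's deadlock.c, and assuming each process waits on at most one other):

```python
# Each entry maps a waiting pid to the pid holding the lock it wants.
# A deadlock is a cycle in this graph; a dump of the whole graph also
# exposes chains of waiters that a two-process message would hide.

def find_deadlock(waits_for):
    """waits_for: dict pid -> pid it is blocked by.
    Returns the pids forming a cycle, or None if no deadlock."""
    for start in waits_for:
        seen = []
        pid = start
        while pid in waits_for:
            if pid in seen:
                return seen[seen.index(pid):]  # the cycle itself
            seen.append(pid)
            pid = waits_for[pid]
    return None

# A three-way deadlock that "Process A blocked by Process B" under-reports:
graph = {6855: 6795, 6795: 7001, 7001: 6855}
print(find_deadlock(graph))
```

A simple lock-wait chain like `{6855: 6795, 6795: 7001}` has no cycle; that is the case `log_min_duration_lockwait`-style logging would surface, since no deadlock error ever fires.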
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a you know what would be sweet moment. What would be nice is to be able to detach one of the volumes, and know the span of the data in there without being able to access the data. The problem that a lot of warehouse operators have is something like this: We know we have all this data, but we don't know what we will want to do with it later. So keep it all. I'll get back to you when I want to know something. It'd be nice to be able to load up all that data once, and then shunt it off into (say) read-only media. If one could then run a query that would tell one which spans of data are candidates for the search, you could bring back online (onto reasonably fast storage, for instance) just the volumes you need to read. A -- Andrew Sullivan | [EMAIL PROTECTED] Users never remark, Wow, this software may be buggy and hard to use, but at least there is a lot of code underneath. --Damien Katz ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Andrew Sullivan wrote: On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a you know what would be sweet moment. The dreaded words from a developer's mouth to every manager in the world. Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, 19 Feb 2007, Joshua D. Drake wrote: Longer than that... it supported mirroring and raid 5 in NT4 and possibly even NT3.51 IIRC. Mirroring and RAID 5 go back to Windows NT 3.1 Advanced Server in 1993: http://support.microsoft.com/kb/114779 http://www.byte.com/art/9404/sec8/art7.htm The main source of confusion about current support for this feature is that the desktop/workstation versions of Windows don't have it. For Windows XP, you need the XP Professional version to get dynamic disk support; it's not in the home edition. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] proposed todo: use insert/update returning anywhere a table is allowed
Added to TODO: * Allow INSERT/UPDATE ... RETURNING inside a SELECT 'FROM' clause http://archives.postgresql.org/pgsql-general/2006-09/msg00803.php http://archives.postgresql.org/pgsql-hackers/2006-10/msg00693.php --- Jeff Davis wrote: On Tue, 2007-02-06 at 16:54 -0500, Merlin Moncure wrote: It would be great to be able to join to update, returning, etc. It looks like the code was deliberately tied off as-is and I was surprised not to see a todo for this. the basic idea is to be able to do things like: select * from (update foo set id = 1 returning *) q; which currently syntax errors out on 'set' which is a bit odd. here are Tom's brief notes on it and also some context on how select combined with insert/update might influence triggers: http://archives.postgresql.org/pgsql-hackers/2006-10/msg00693.php Also a relevant thread: http://archives.postgresql.org/pgsql-general/2006-09/msg00803.php Regards, Jeff Davis ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Joshua D. Drake wrote: Andrew Sullivan wrote: On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a you know what would be sweet moment. The dreaded words from a developer's mouth to every manager in the world. Yea, I just instinctively hit delete when I saw that phrase. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] New feature request: FlashBack Query
On Mon, 19 Feb 2007 20:30:59 +0100, Florian G. Pflug [EMAIL PROTECTED] said: August Zajonc wrote: Gregory Stark wrote: Couldn't you define things simply to be that you get a consistent view including all transactions started before x transaction? This is time travel lite, but low overhead which I think is a key benefit of this approach. I was thinking along the same line. Flashback is probably only really useful on databases that are mostly read-only, but with a few users who update data. You'd use flashback to undo catastrophic changes done by accident, and probably will gladly accept that you undo a little more work than strictly necessary. On the other hand, if you're running an online shop where people buy stuff 24/7, and, say, somebody accidentally deletes some products, then you won't want to lose the orders that happened during that last hour, but will rather try to regenerate those products from your last backup. Hopefully people doing order systems are using PITR or similar :) For the time travel lite case, it's just a matter of clear definition. You get all transactions that were *started* before and up to x trx. If the transaction rolled back you still won't see it, so you're still getting a consistent view. But if it committed after your marker you will see it. That seems ok to me. In fact, I suspect folks think of transactions as happening more or less when they get sent to the DB, so this may map more directly to what people expect. The one caveat would be that if you started a long running transaction, then did the oops trx 5 minutes later, and then started time travel *before* the long running trx committed. In that case you wouldn't see that long running trx, so the definition would need to be modified to be something like all trx started before x, that were no longer running when you time travel. 
Don't know if it is worth a NOTICE in the logs if you time travel back, but there are IDs of transactions from before your xmin that are still running (and if you waited a bit might become visible in your time travel view). If Jan gets his way with a timestamp on trx commit, then you can do started before x time, which may be more user friendly. For PITR I'd imagine you might actually be able to get the visibility right, no? Haven't looked deeply enough into the WAL logs to understand how the partial playback scenario works. If the WAL logs are ordered on trx commit time, then you'd get proper visibility. - August ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
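August's "time travel lite" rule above - visible means started before the marker, no longer running at travel time, and committed - can be written down as a small predicate. A hedged sketch (hypothetical Python, names invented for illustration):

```python
# Visibility under the modified "time travel lite" definition:
# a transaction is visible iff it started before the marker transaction,
# was not still running when you time travel, and committed.

def visible_at_marker(xid, marker_xid, committed, still_running):
    if xid >= marker_xid:          # started at/after the marker: invisible
        return False
    if xid in still_running:       # the long-running-trx caveat
        return False
    return xid in committed        # rolled-back trx stay invisible

committed = {90, 95}
# xid 80 started long before the marker but is still open at travel time:
print(visible_at_marker(90, 100, committed, still_running={80}))  # True
print(visible_at_marker(80, 100, committed, still_running={80}))  # False
```

The second call is exactly the caveat in the message above: the long-running transaction started before the marker, but because it had not committed when you travelled, it stays out of the view.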
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On Mon, 19 Feb 2007, Alvaro Herrera wrote: We already have a combined GUC option that is used to change two different things (DateStyle) and I regularly see people confused about how to use it. You already have a combined GUC option called log_destination that's sitting in the appropriate area of the configuration file, doing something similar to what's needed for the new feature. People confused by that are already confused. Also, sql is not really a destination -- it is a format. A log file with a different name is another destination. eventlog is certainly a different format and it's sitting happily as an option there. I haven't heard anyone make a useful argument yet as to how insert/sql logs are any different than the current way that stderr, syslog, and eventlog are all possibilities now for log_destination, each with their own little quirks (and in the case of syslog, their own additional GUC parameters). That way you can choose to have one or the other, or both if you're really dumb. The fact that you're characterizing people who might want both as really dumb tells me you're not familiar with enterprise logging requirements. I already commented on situations where wanting both types of output going at once is going to absolutely be a requirement in some environments for this feature addition to be useful; there are a lot of large operations that rely heavily on features like syslog to help manage their systems. Most of the places I've worked at, the syslog server where the analysis is running wasn't necessarily even in the same state as the machine generating the log entries. I know I can't deploy this feature unless it operates in parallel with the existing text-based info going to syslog, both because of that and because of transition issues--I can't disrupt the existing logs to test a new log mechanism until that new mechanism has proven itself. I'll probably deploy it with both turned on forever once it's available. As for your comments on syslog vs. 
stderr, I completely agree with Guillaume's response to you on that subject. The stderr output is difficult to use for the reasons he describes, but the kind of environments that use complicated logging aren't relying on that anyway. I wouldn't get distracted by fixing that implementation when it's functional enough for most who are satisfied with stderr output. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
Andrew Sullivan wrote: On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a you know what would be sweet moment. What would be nice is to be able to detach one of the volumes, and know the span of the data in there without being able to access the data. The problem that a lot of warehouse operators have is something like this: We know we have all this data, but we don't know what we will want to do with it later. So keep it all. I'll get back to you when I want to know something. You should be able to do that with tablespaces and VACUUM FREEZE, the point of the latter being that you can take the disk containing the read only data offline, and still have the data readable after plugging it back in, no matter how far along the transaction ID counter is. -- Alvaro Herrerahttp://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] referential Integrity and SHARE locks
Added to TODO: * Allow UPDATEs on only non-referential integrity columns not to conflict with referential integrity locks http://archives.postgresql.org/pgsql-hackers/2007-02/msg00073.php --- Jan Wieck wrote: On 2/8/2007 2:46 PM, Marc Munro wrote: On Thu, 2007-08-02 at 14:33 -0500, Tom Lane wrote: Marc Munro [EMAIL PROTECTED] writes: Yes in this case, T1 must abort because the record it was going to update has disappeared from underneath it. I don't see how this is significantly different from the same race for the record if the table had no RI constraints. The only difference that I can see, is that T1 now has some locks that it must relinquish as the transaction aborts. No, the difference is there would have been no error at all before; if the record were deleted before T1 got to it then it wouldn't have attempted to update it. I really don't think you can make it work to perform updates or deletes on a record you have not yet locked. The record would be locked before the update or delete is attempted, however it would not be locked until the referential integrity constraints have succeeded in acquiring their locks. It is becoming clear to me that I am missing something but I still don't know what it is. If anyone can see it and explain it I'd really appreciate it. I think you are missing the fact that the exclusive row lock on UPDATE is taken before any triggers are fired at all, even BEFORE ROW triggers. This is necessary in order to prevent the row being updated or removed concurrently while the triggers are executing. Since BEFORE ROW triggers can modify the content of the row (including the foreign key), the RI check and lock of the referenced row cannot happen before other BR triggers are completed. In order to make your idea fly, the RI check trigger on INSERT or UPDATE would have to be fired before taking the row lock considering the NEW values for referencing columns as they are thus far. 
Since the row isn't locked at this time, it can change or disappear while the RI trigger is executing, so the check and lock has to be redone later with the actual row that got locked and after all BR triggers are done with it. Jan -- #==# # It's easier to get forgiveness for being wrong than for being right. # # Let's break this rule - forgive me. # #== [EMAIL PROTECTED] # ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
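The ordering Jan describes - row lock first, then BEFORE ROW triggers (which may rewrite the foreign key), then the RI check on the final values - can be mimicked with a toy driver. This is a hedged illustration in Python, not the actual trigger machinery; every function name here is invented:

```python
# Toy UPDATE path showing why the RI check cannot run before BR triggers:
# a BEFORE ROW trigger may change the referencing column, so only the
# post-trigger value can be checked against the referenced table.

def update_row(row, new_values, before_row_triggers, ri_check):
    # 1. Exclusive row lock is taken before any trigger fires.
    lock = "exclusive"
    # 2. BEFORE ROW triggers run and may rewrite the new tuple, FK included.
    for trig in before_row_triggers:
        new_values = trig(new_values)
    # 3. RI check sees the final FK value; an earlier check would be stale.
    if not ri_check(new_values):
        raise ValueError("foreign key violation")
    row.update(new_values)
    return lock

def rewrite_fk(vals):
    vals = dict(vals)
    vals["parent_id"] = 2          # trigger redirects the referencing column
    return vals

parents = {1, 2}
row = {"parent_id": 1}
update_row(row, {"parent_id": 99}, [rewrite_fk],
           lambda v: v["parent_id"] in parents)
print(row)  # {'parent_id': 2}
```

Had the RI check run on the pre-trigger value 99, it would have rejected an update that is in fact valid after the trigger - which is why any early check would have to be redone, as the message says.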
Re: [HACKERS] Proposed adjustments in MaxTupleSize and toast thresholds
Added to TODO: * Consider allowing configuration of TOAST thresholds http://archives.postgresql.org/pgsql-hackers/2007-02/msg00213.php --- Simon Riggs wrote: On Mon, 2007-02-05 at 19:18 -0500, Jan Wieck wrote: On 2/5/2007 11:52 AM, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Sounds like a good time to suggest making these values configurable, within certain reasonable bounds to avoid bad behaviour. Actually, given what we've just learned --- namely that choosing these values at random is a bad idea --- I'd want to see a whole lot of positive evidence before adding such a configuration knob. Some of the evidence is TOAST itself. Every time you do not SET a column that has been toasted into external storage during an UPDATE, you win because the column's data isn't read during the scan for the row to update, it isn't read during heap_update(), it isn't actually updated at all (the toast reference is copied as is and the external value reused), and not a single byte of the external data is bloating WAL. If someone knows that 99% of their updates will not hit certain text columns in their tables, actually forcing them to be compressed no matter what and to be stored external if they exceed 100 bytes will be a win. Yes, that's the main use case. Of course, this is a bit different from Simon's approach. What I describe here is a per pg_attribute configuration to enforce a certain new toaster behavior. Since we already have something that gives the toaster a per column cluestick (like not to bother trying to compress), it might be much easier to implement than Simon's proposal. It would require that the toaster goes over the initial heap tuple for those specially configured columns even if the tuple is below the toast threshold, which suggests that a pg_class.relhasspecialtoastneeds could be useful. But I think as for fine tuning capabilities, a column insensitive maximum tuple size is insufficient anyway. Well, sounds like we both want the same thing. 
The only discussion seems to be about user interface. Setting it per column is much better for very fine tuning, but setting them in isolation doesn't help decide what to do when you have lots of medium length strings where the sum exceeds the toast target. IMHO it would be better to have a col-level storage priority (default 0) and then a table-level settable toast target. So we start applying the storage handling mechanisms on the highest priority columns and keep going in descending order until we are under the limit for the table. ALTER TABLE foo ALTER COLUMN foocol SET STORAGE EXTERNAL PRIORITY 5 WITH (toast_target = 400); /* must be MAXALIGNed value */ Equal priorities are allowed, in which case lowest attribute id wins, i.e. current behaviour remains the default. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
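Simon's priority scheme above - shrink the highest-priority columns first, ties broken by lowest attribute id, stopping once the tuple fits the table's toast target - can be sketched as a selection loop. A hedged toy in Python, not the actual toaster; the widths and helper name are invented:

```python
# Pick which columns to compress/externalize under the proposed scheme:
# highest storage priority first, lowest attnum wins ties, stop as soon
# as the running tuple size drops to the table-level toast target.

def choose_columns_to_toast(columns, toast_target):
    """columns: list of (attnum, priority, width, width_after_shrinking).
    Returns attnums to handle, in the order they would be processed."""
    total = sum(width for _, _, width, _ in columns)
    order = sorted(columns, key=lambda c: (-c[1], c[0]))  # priority desc, attnum asc
    chosen = []
    for attnum, _prio, width, shrunk in order:
        if total <= toast_target:
            break
        total -= width - shrunk
        chosen.append(attnum)
    return chosen

# Columns 2 and 3 share priority 5; column 1 keeps the default priority 0.
cols = [(1, 0, 500, 50), (2, 5, 800, 80), (3, 5, 400, 40)]
print(choose_columns_to_toast(cols, toast_target=700))  # [2, 3]
```

Column 1, at default priority 0, is untouched because shrinking columns 2 and 3 already brings the tuple under the target - the "keep going in descending order until we are under the limit" behaviour described above.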
Re: [HACKERS] libpq docs about PQfreemem
I have updated the PQfreemem documentation; patch attached. Backpatched to 8.2.X. --- Zeugswetter Andreas ADI SD wrote: future for some reason. (the doc for the other functions say you have to use PQfreemem without mentioning any exceptions) Thoughts? Rip out or update? Are you saying that almost all Win32 binaries and libraries now can free across DLLs? You can under very narrow conditions. You need to force code generation for Multithreaded DLL run-time libraries (e.g. in VC6 msvcrt.dll) for all exe's and dll's. This is bad for debugging, memory checkers and probably impossible when using different compilers. So you really need PQfreemem. Andreas -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + Index: doc/src/sgml/libpq.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/libpq.sgml,v retrieving revision 1.231 diff -c -c -r1.231 libpq.sgml *** doc/src/sgml/libpq.sgml 16 Feb 2007 16:37:29 - 1.231 --- doc/src/sgml/libpq.sgml 19 Feb 2007 22:04:28 - *** *** 2617,2625 <function>PQescapeBytea</function>, <function>PQunescapeBytea</function>, and <function>PQnotifies</function>. !It is needed by Microsoft Windows, which cannot free memory across !DLLs, unless multithreaded DLLs (<option>/MD</option> in VC6) are used. !On other platforms, this function is the same as the standard library function <function>free()</>. </para> </listitem> </varlistentry> --- 2617,2629 <function>PQescapeBytea</function>, <function>PQunescapeBytea</function>, and <function>PQnotifies</function>. !It is particularly important that this function, rather than !<function>free()</>, be used on Microsoft Windows. This is because !allocating memory in a DLL and releasing it in the application works !only if multithreaded/single-threaded, release/debug, and static/dynamic !flags are the same for the DLL and the application. On non-Microsoft !Windows platforms, this function is the same as the standard library !function <function>free()</>. 
</para> </listitem> </varlistentry> ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Howto change db cluster locale on-the-fly
Hi Tom, Hacking pg_control would be the hard part; you'll never get the CRC right if you do it manually. Possibly pg_resetxlog could be adapted to the purpose. thanks for your valuable answer! I looked at pg_resetxlog.c but I'm no pg internals expert - would something like this work? 1) normally shut down database 2) hack pg_resetxlog to set locale to wanted value 3) run pg_resetxlog -f (rewrite pg_control - everything would be guessed with the exception of overloaded locale) 4) start database We won't miss any transactions and there won't be any inconsistency in data because server was normally shut down, right? Thanks, Kuba Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: But I guess something like this would work: 1) drop all indexes on text/varchar columns 2) change cluster locale 3) create all indexes on text/varchar columns You're going to miss the name columns, ie. every string index in pg_catalog. But name is not locale-aware --- it just uses strcmp(). AFAIR there aren't any locale-dependent indexes in the system catalogs. So in principle you could hack pg_control, restart the postmaster, and then reindex every locale-dependent index. Hacking pg_control would be the hard part; you'll never get the CRC right if you do it manually. Possibly pg_resetxlog could be adapted to the purpose. I'd suggest single user mode at least, and make backups! Yup, a filesystem backup would be a *real* good idea. Not to mention testing the procedure on a toy installation. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Deadlock with pg_dump?
On Mon, 2007-02-19 at 19:38 +, Simon Riggs wrote: On Tue, 2007-02-13 at 22:19 -0500, Bruce Momjian wrote: Simon Riggs wrote: On Thu, 2006-10-26 at 18:45 -0400, Tom Lane wrote: Chris Campbell [EMAIL PROTECTED] writes: Is there additional logging information I can turn on to get more details? I guess I need to see exactly what locks both processes hold, and what queries they were running when the deadlock occurred? Is that easily done, without turning on logging for *all* statements? log_min_error_statement = error would at least get you the statements reporting the deadlocks, though not what they're conflicting against. Yeh, we need a much better locking logger for performance analysis. We really need to dump the whole wait-for graph for deadlocks, since this might be more complex than just two statements involved. Deadlocks ought to be so infrequent that we can afford the log space to do this - plus if we did this it would likely lead to fewer deadlocks. For 8.3 I'd like to have a log_min_duration_lockwait (secs) parameter that would allow you to dump the wait-for graph for any data-level locks that wait too long, rather than just those that deadlock. Many applications experience heavy locking because of lack of holistic design. That will also show up the need for other utilities to act CONCURRENTLY, if possible. Old email, but I don't see how our current output is not good enough? test= lock a; ERROR: deadlock detected DETAIL: Process 6855 waits for AccessExclusiveLock on relation 16394 of database 16384; blocked by process 6795. Process 6795 waits for AccessExclusiveLock on relation 16396 of database 16384; blocked by process 6855. This detects deadlocks, but it doesn't detect lock waits. When I wrote that it was previous experience driving me. Recent client experience has highlighted the clear need for this. We had a lock wait of 50 hours because of an RI check; that's the kind of thing I'd like to show up in the logs somewhere. 
Lock wait detection can be used to show up synchronisation points that have been inadvertently designed into an application, so it's a useful tool in investigating performance issues. I have a patch implementing the logging as agreed with Tom, will post to patches later tonight. Patch for discussion, includes doc entries at top of patch, so it's fairly clear how it works. Output is an INFO message, to allow this to trigger log_min_error_statement when it generates a message, to allow us to see the SQL statement that is waiting. This allows it to generate a message prior to the statement completing, which is important because it may not ever complete, in some cases, so simply logging a list of pids won't always tell you what the SQL was that was waiting. Other approaches are possible... Comments? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com Index: doc/src/sgml/config.sgml === RCS file: /projects/cvsroot/pgsql/doc/src/sgml/config.sgml,v retrieving revision 1.108 diff -c -r1.108 config.sgml *** doc/src/sgml/config.sgml 1 Feb 2007 00:28:16 - 1.108 --- doc/src/sgml/config.sgml 6 Feb 2007 12:31:49 - *** *** 2936,2941 --- 2936,2965 /listitem /varlistentry + varlistentry id=guc-log-lock-waits xreflabel=log_lock_waits + termvarnamelog_lock_waits/varname (typeboolean/type)/term + indexterm +primaryvarnamelog_lock_waits/ configuration parameter/primary + /indexterm + listitem +para + Controls whether log messages are produced when a statement is forced + to wait when trying to acquire locks on database objects. The threshold + time is the value of the xref linkend=guc-deadlock-timeout parameter. + The log messages generated are intended for use during specific + investigations into application performance issues and subsequent tuning. + It is designed for use in conjunction with varnamelog_min_error_statement/. + Log messages indicating long lock waits might indicate problems with + applications accessing the database or possibly disconnection issues. 
+ If no such problem exist it might indicate that varnamedeadlock_timeout/ + could be set higher. Log messages might also indicate that certain + deadlocks have been avoided. In those cases, decreasing the value of + varnamedeadlock_timeout/ might resolve lock wait situations faster, + thereby reducing contention. By default, this form of logging is literaloff/. +/para + /listitem + /varlistentry + /variablelist /sect2 /sect1 Index: src/backend/storage/lmgr/deadlock.c === RCS file: /projects/cvsroot/pgsql/src/backend/storage/lmgr/deadlock.c,v retrieving revision
Re: [HACKERS] --enable-debug does not work with gcc
Added to developer's FAQ: src/Makefile.custom can be used to set environment variables, like CUSTOM_COPT, that are used for every compile. --- Magnus Hagander wrote: If we did what you suggest, then --enable-debug would cause performance degradation, which would cause people to not use it, which would result in most binaries being completely undebuggable rather than only partially. Doesn't sound like a good tradeoff to me. Personally, in my development tree I use a Makefile.custom containing # back off optimization unless profiling ifeq ($(PROFILE),) CFLAGS:= $(patsubst -O2,-O1,$(CFLAGS)) endif -O1 still generates uninitialized variable warnings but the code is a lot saner to step through ... not perfect, but saner. It's been a workable compromise for a long time. I don't recommend developing with -O0, exactly because it disables some mighty valuable warnings. Agreed. I use -O1 by default myself, unless I am doing performance testing. Something for the developers FAQ perhaps? I confess I did not know of Makefile.custom :-D //Magnus ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] TopPlan, again
On Sun, 2007-02-18 at 18:19 -0500, Tom Lane wrote: While thinking about having a centralized plan cache for managing plan invalidation, I got annoyed again about the fact that the executor needs access to the Query tree. This means that we'll be storing *three* representations of any cached query: raw parsetree for possible regeneration, plus parsed Query tree and Plan tree. ... After looking over the code it seems that the executor needs a limited subset of the Query fields, namely ... into intoOptions intoOnCommit (why is this separate from intoOptions?) intoTableSpaceName ... which I think we should put into a new TopPlan node type. All else sounds good, but why would we be caching a plan that used these fields? Anybody re-executing a CREATE TABLE AS SELECT on the same table isn't somebody we should be helping. ISTM that we'd be able to exclude them from the TopPlan on that basis, possibly creating an Into node to reduce the clutter. A couple of incidental points on plan invalidation: - We need to consider how the planner uses parameter values. Currently the unnamed query uses the first bind parameters to plan the query. Doing that when we have a central plan cache will definitely cause problems in some applications which repeatedly re-specify the same parameter values within their own session, but use different values across sessions. Sounds bizarre, but assuming that all users of the same query want it optimised the same way is not a good assumption in all cases. I'm completely in favour of a centralized plan cache in all other ways... - I'd like to make it impossible to re-plan the output columns of queries with unspecified output columns, e.g. * or foo.*; re-planning those makes it possible for the results of the query to change during re-execution. I've never seen an application that used dynamic queries that allowed for the possibility that the result metadata might change as we re-execute, and allowing it would seem likely to break more applications than we'd really want. 
It will also allow us to remove the Metadata call from the v3 Protocol at Exec time, as David Strong suggested last year on pgsql-jdbc. - It would be good to allow for exec-time constraint exclusion, which would allow caching plans that use CE with stable functions (e.g. col > CURRENT_DATE). That may change the design, even though that's not an 8.3 thing at all. -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] TopPlan, again
Simon Riggs [EMAIL PROTECTED] writes: After looking over the code it seems that the executor needs a limited subset of the Query fields, namely ... which I think we should put into a new TopPlan node type. All else sounds good, but why would we be caching a plan that used these fields? Um, what's your point? I certainly have no desire to support two different Executor APIs depending on whether we think the command might be worth caching or not. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] ToDo: add documentation for operator IS OF
Pavel Stehule wrote: Hello, I miss doc for this operator. Strange IS [NOT] OF wasn't documented, especially seeing it was added in PostgreSQL 7.3. Anyway, documented and backpatched to 8.2.X. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. +
Index: doc/src/sgml/func.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/func.sgml,v retrieving revision 1.365 diff -c -c -r1.365 func.sgml
*** doc/src/sgml/func.sgml 19 Feb 2007 22:51:44 - 1.365
--- doc/src/sgml/func.sgml 20 Feb 2007 00:24:15 -
*** 411,416 ****
--- 411,432 ----
     <literal>IS NOT NULL</literal>, respectively, except that the input
     expression must be of Boolean type.
    </para>
+
+   <para>
+    <indexterm>
+     <primary>IS OF</primary>
+    </indexterm>
+    <indexterm>
+     <primary>IS NOT OF</primary>
+    </indexterm>
+    It is possible to check the data type of an expression using the
+    constructs
+ <synopsis>
+ <replaceable>expression</replaceable> IS OF (typename, ...)
+ <replaceable>expression</replaceable> IS NOT OF (typename, ...)
+ </synopsis>
+    They return a boolean value based on whether the expression's data
+    type is one of the listed data types.
+   </para>
  </sect1>
  <sect1 id="functions-math">
Index: doc/src/sgml/syntax.sgml === RCS file: /cvsroot/pgsql/doc/src/sgml/syntax.sgml,v retrieving revision 1.115 diff -c -c -r1.115 syntax.sgml
*** doc/src/sgml/syntax.sgml 6 Feb 2007 09:16:07 - 1.115
--- doc/src/sgml/syntax.sgml 20 Feb 2007 00:24:15 -
*** 847,853 ****
  <row>
   <entry><token>IS</token></entry>
   <entry></entry>
! <entry><literal>IS TRUE</>, <literal>IS FALSE</>, <literal>IS UNKNOWN</>, <literal>IS NULL</></entry>
  </row>
--- 847,854 ----
  <row>
   <entry><token>IS</token></entry>
   <entry></entry>
! <entry><literal>IS TRUE</>, <literal>IS FALSE</>, <literal>IS
! UNKNOWN</>, <literal>IS NULL</>, <literal>IS OF</></entry>
  </row>
---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] ToDo: add documentation for operator IS OF
Bruce Momjian wrote: Pavel Stehule wrote: Hello, I miss doc for this operator. Strange IS [NOT] OF wasn't documented, especially seeing it was added in PostgreSQL 7.3. Anyway, documented and backpatched to 8.2.X. Here's the reason -- see this thread: http://archives.postgresql.org/pgsql-patches/2003-08/msg00062.php Joe ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] ToDo: add documentation for operator IS OF
Joe Conway wrote: Bruce Momjian wrote: Pavel Stehule wrote: Hello, I miss doc for this operator. Strange IS [NOT] OF wasn't documented, especially seeing it was added in PostgreSQL 7.3. Anyway, documented and backpatched to 8.2.X. Here's the reason -- see this thread: http://archives.postgresql.org/pgsql-patches/2003-08/msg00062.php Wow, interesting. I do remember that now. Should I revert the documentation addition and add a comment to gram.y? -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] autovacuum next steps
Gregory Stark wrote: If we could have autovacuum interrupt a vacuum in mid-sweep, perform a cycle of vacuums on smaller tables, then resume, that problem would go away. That sounds too difficult though, but perhaps we could do something nearly as good. I think giving vacuum this interrupt-and-resume capability would be quite useful for large tables. It would provide more flexibility for autovacuum to create a good scheduling scheme. Sometimes it takes a whole day to vacuum a large table (tables of hundreds of GB may qualify); setting cost_delay can make it last even several days. If the system has a maintenance window, the vacuum task for a large table could be split to fit into that window by an interrupt-and-resume feature. One option that I've heard before is to have vacuum, after a single iteration (ie, after it fills maintenance_work_mem and does the index cleanup and the second heap pass), remember where it was and pick up from that point next time. Even a single iteration may take a long time, so it is not enough to allow a break only at an iteration boundary. I think it is not so difficult to get vacuum to remember where it left off and to restart from that point next time. The following is the basic idea. A typical vacuum process mainly has the following phases:

Phase 1. scan heap
Phase 2. scan and sweep index
Phase 3. sweep heap
Phase 4. update FSM
Phase 5. truncate CLOG

When vacuum is interrupted, we can save the collected information to disk, and restore it later when vacuum restarts: the dead tuple list and the block number it has scanned in phase 1; the indexes it has cleaned up in phase 2; the tuples it has swept in phase 3. Before exiting from vacuum, we can also merge the free space information into the FSM. We are working on this feature now. I will propose it later to discuss with you. 
Best Regards Galy Lee -- NTT Open Source Software Center ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
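The save-and-restore bookkeeping Galy describes could look roughly like the sketch below. This is a hypothetical Python illustration of the idea with invented names, not the proposed patch: a state object records phase-1 progress (last scanned block plus collected dead tuples) so a later run can resume from exactly where an interrupt stopped the scan.

```python
# Hypothetical sketch of interrupt-and-resume bookkeeping for the heap-scan
# phase of vacuum -- an illustration of the idea, not PostgreSQL code.

class VacuumState:
    """State a vacuum would persist so a later run can pick up where it left off."""
    def __init__(self):
        self.last_scanned_block = 0   # phase 1 progress
        self.dead_tuples = []         # collected dead-tuple (block, offset) pairs

def scan_heap(state, nblocks, dead_map, stop_after):
    """Scan heap blocks collecting dead tuples; simulate an interrupt after
    `stop_after` blocks.  Returns True when the scan completed, False when
    it was interrupted (with progress saved in `state`)."""
    for blkno in range(state.last_scanned_block, nblocks):
        if blkno - state.last_scanned_block >= stop_after:
            state.last_scanned_block = blkno   # remember where we stopped
            return False                       # interrupted; resume later
        state.dead_tuples.extend(
            (blkno, off) for off in dead_map.get(blkno, []))
    state.last_scanned_block = nblocks
    return True
```

A first call with a small `stop_after` stops partway and records its position; a second call with the same state object finishes the remaining blocks without rescanning the earlier ones, which is the break-anywhere behavior the message argues for.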
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Mon, Feb 19, 2007 at 02:50:34PM -0500, Andrew Sullivan wrote: On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a "you know what would be sweet" moment. What would be nice is to be able to detach one of the volumes, and know the span of the data in there without being able to access the data. The problem that a lot of warehouse operators have is something like this: "We know we have all this data, but we don't know what we will want to do with it later. So keep it all. I'll get back to you when I want to know something." It'd be nice to be able to load up all that data once, and then shunt it off into (say) read-only media. If one could then run a query that would tell one which spans of data are candidates for the search, you could bring back online (onto reasonably fast storage, for instance) just the volumes you need to read. Isn't this one of the big use cases for table partitioning? Cheers, D -- David Fetter [EMAIL PROTECTED] http://fetter.org/ phone: +1 415 235 3778AIM: dfetter666 Skype: davidfetter Remember to vote! ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
Greg Smith [EMAIL PROTECTED] writes: A workable syntax might be INSERT INTO pg_log ... Why is this still under discussion? I thought we'd agreed that COPY format was the way to go. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] New feature request: FlashBack Query
August Zajonc [EMAIL PROTECTED] writes: The key is how lightweight the setup could be, which matters because clients are not always willing to pay for a PITR setup. The low overhead would mean you'd feel fine about setting guc to 1hr or so. This would have exactly the same performance consequences as always having an hour-old open transaction. I'm afraid that describing it as low overhead is mere wishful thinking: it would cripple vacuuming of high-update tables and greatly increase the typical load on pg_clog and pg_subtrans. We already know that pg_subtrans contention can be a source of context-swap storms, with the size of the window back to GlobalXmin being the controlling factor for how bad it gets. It's possible that this last could be addressed by separating the concept of old enough to be vacuumed from GlobalXmin, but it's certainly not a trivial thing. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] pgsql: Fix for plpython functions; return true/false for boolean,
I am still waiting for a plpython patch that has Python version checking. --- Guido Goldstein wrote: Peter Eisentraut wrote: Guido Goldstein wrote: Is it possible to tell me which python versions you want to support? The issue isn't so much which versions we want to support. There is certainly some flexibility with that. But when a patch breaks the buildfarm a) unannounced and b) without any apparent feature gain, then people get annoyed. If this breaks the buildfarm it's not my failure, unless you can tell me what I'm supposed to do with the buildfarm. If you mean that plpython didn't compile, fine; simply tell people what version they should consider when sending in patches. I've checked the patch with postgres 8.1.3 and 8.2.1 with python 2.4 and 2.5 on intel 32 bit and amd 64 bit systems; all systems running linux. *And* it's not a feature patch but a bug-fixing one! Python is a language with strong typing, so silently converting a datatype is a bug -- not a feature. Btw, you'll lose the type information of boolean columns in trigger functions (NEW and OLD dicts, no explicit parameters), which does cause problems. That said, we certainly try to support a few more versions of Python [...] If you want to support python 2.3 use the attached patch, which also works for the newer python versions. The Python 2.3 branch is the oldest _officially_ supported python version. Anyway, to circumvent the above mentioned point a) I herewith announce that the included patch might break the buildfarm. Cheers Guido ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] Log levels for checkpoint/bgwriter monitoring
I have a WIP patch that adds the main detail I have found I need to properly tune checkpoint and background writer activity. I think it's almost ready to submit (you can see the current patch against 8.2 at http://www.westnet.com/~gsmith/content/postgresql/patch-checkpoint.txt ) after making it a bit more human-readable. But I've realized that along with that, I need some guidance in regards to what log level is appropriate for this information. An example works better than explaining what the patch does:

2007-02-19 21:53:24.602 EST - DEBUG: checkpoint required (wrote checkpoint_segments)
2007-02-19 21:53:24.685 EST - DEBUG: checkpoint starting
2007-02-19 21:53:24.705 EST - DEBUG: checkpoint flushing buffer pool
2007-02-19 21:53:24.985 EST - DEBUG: checkpoint database fsync starting
2007-02-19 21:53:42.725 EST - DEBUG: checkpoint database fsync complete
2007-02-19 21:53:42.726 EST - DEBUG: checkpoint buffer flush dirty=8034 write=279956 us sync=17739974 us

Remember that "Load distributed checkpoint" discussion back in December? You can see exactly how bad the problem is on your system with this log style (this is from a pgbench run where it's positively awful--it really does take over 17 seconds for the fsync to execute, and there are clients that are hung the whole time waiting for it). I also instrumented the background writer. You get messages like this:

2007-02-19 21:58:54.328 EST - DEBUG: BGWriter Scan All - Written = 5/5 Unscanned = 23/54

This shows that we wrote (5) the maximum pages we were allowed to write (5) while failing to scan almost half (23) of the buffers we meant to look at (54). By taking a look at this output while the system is under load, I found I was able to do bgwriter optimization that used to take me days of frustrating testing in hours. I've been waiting for a good guide to bgwriter tuning since 8.1 came out. 
Once you have this, combined with knowing how many buffers were dirty at checkpoint time because the bgwriter didn't get to them in time (the number you want to minimize), the tuning guide practically writes itself. So my question is...what log level should all this go at? Right now, I have the background writer stuff adjusting its level dynamically based on what happened; it logs at DEBUG2 if it hits the write limit (which should be a rare event once you're tuned properly), DEBUG3 for writes that scanned everything they were supposed to, and DEBUG4 if it scanned but didn't find anything to write. The source of checkpoint information logs at DEBUG1, the fsync/write info at DEBUG2. I'd like to move some of these up. On my system, I even have many of the checkpoint messages logged at INFO (the source of the checkpoint and the total write time line). It's a bit chatty, but when you get some weird system pause issue it makes it easy to figure out if checkpoints were to blame. Given how useful I feel some of these messages are to system tuning, and to explaining what currently appears as inexplicable pauses, I don't want them to be buried at DEBUG levels where people are unlikely to ever see them (I think some people may be concerned about turning on things labeled DEBUG at all). I am aware that I am too deep into this to have an unbiased opinion at this point though, which is why I ask for feedback on how to proceed here. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
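As a sketch of how one might consume the proposed summary line when tuning, the following parser works against the example output shown above. The log format belongs to the WIP patch (not released PostgreSQL), and the parser itself is a hypothetical illustration:

```python
import re

# Parse the WIP patch's checkpoint summary line, e.g.
#   "checkpoint buffer flush dirty=8034 write=279956 us sync=17739974 us"
# The field names follow the example log output quoted above.
SUMMARY = re.compile(
    r"checkpoint buffer flush dirty=(\d+) write=(\d+) us sync=(\d+) us")

def checkpoint_stats(line):
    """Return timing stats from a summary log line, or None if it doesn't match."""
    m = SUMMARY.search(line)
    if not m:
        return None
    dirty, write_us, sync_us = map(int, m.groups())
    return {
        "dirty_buffers": dirty,
        "write_ms": write_us / 1000,
        "sync_ms": sync_us / 1000,
        # Fraction of the flush spent in fsync -- near 1.0 means the
        # 17-second stall described above is fsync-dominated.
        "sync_fraction": sync_us / (write_us + sync_us),
    }
```

Fed the example line from the message, this reports roughly 98% of the flush time spent in fsync, which makes the "clients hung waiting for it" symptom easy to spot at a glance.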
Re: [HACKERS] \copy (query) delimiter syntax error
I think I'll go with Tom's Plan B for HEAD, but not do anything more for 8.2 than has already been done. cheers andrew Bruce Momjian wrote: Did we come to a conclusion on this? --- Andrew Dunstan wrote: Andrew Dunstan wrote: Tom Lane wrote: Andrew Dunstan [EMAIL PROTECTED] writes: The consequence will be, though, that psql will accept a syntax for \copy (query) ... that the corresponding backend command would reject were we not transforming it. That strikes me as potentially confusing. Perhaps. What about plan B: remove the legacy syntax support in \copy? IIRC it has not been documented since 7.2, so maybe we can finally throw it overboard. Thoughts? I like it for 8.3 - but maybe my present patch would be better for 8.2, as it involves less behaviour change. While we decide this issue, which can be worked around in any case, I am going to commit the part of the patch that nobody has objected to (and which will fix Michael's original complaint), on HEAD and 8.2 stable, so we can get some testing going. cheers andrew ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
Greg Smith [EMAIL PROTECTED] writes: On Mon, 19 Feb 2007, Alvaro Herrera wrote: Also, sql is not really a destination -- it is a format. A log file with a different name is another destination. eventlog is certainly a different format and it's sitting happily as an option there. I haven't heard anyone make a useful argument yet as to how insert/sql logs are any different than the current way that stderr, syslog, and eventlog are all possibilities now for log_destination, each with their own little quirks (and in the case of syslog, their own additional GUC parameters). Since the sql format doesn't make any sense for syslog or eventlog output, I tend to agree that treating it as a destination is a reasonable answer. It's going to be a bit non-orthogonal no matter which way we jump, but this seems like the most natural and useful extension from where we are. To me anyway ... YMMV ... regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
On Mon, 19 Feb 2007, Tom Lane wrote: Why is this still under discussion? I thought we'd agreed that COPY format was the way to go. Joshua Drake said COPY would be a good option, but INSERT is probably what I would use as the default. The most use I see for this is something where I am tailing out the log and inserting live into a log db... and I completely agreed with him--that's also how all the applications I'd like to build around this feature are expected to operate. No one said anything else on this topic to defend COPY as the right choice until you just brought it back up here. The arguments for COPY are performance and that you don't need to specify the table name. INSERT is slower and you need a name, but it's easier to build a UNIX tool style pipeline to import it in real-time. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] New feature request: FlashBack Query
On 2/17/07, Joshua D. Drake [EMAIL PROTECTED] wrote: My understanding is that the main difference is that rollbacks are inexpensive for us, but expensive for Oracle. Yes, Oracle is optimized for COMMIT, we're optimized for ROLLBACK :) In all seriousness, last time I checked Oracle's MVCC was covered by two patents. -- Jonah H. Harris, Software Architect | phone: 732.331.1324 EnterpriseDB Corporation| fax: 732.331.1301 33 Wood Ave S, 3rd Floor| [EMAIL PROTECTED] Iselin, New Jersey 08830| http://www.enterprisedb.com/ ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Short varlena headers and arrays
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: Elements of arrays are not subject to being toasted by themselves, so I don't think you can make that work. At least not without breaking wide swaths of code that works fine today. You think it's more likely there are places that build arrays and then read the items back without passing through detoast than there are places that build tuples and do so? The former is valid per the coding rules, the latter is not, so... regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] WIP patch - INSERT-able log statements
Greg Smith [EMAIL PROTECTED] writes: The arguments for COPY are performance and that you don't need to specify the table name. INSERT is slower and you need a name, but it's easier to build a UNIX tool style pipeline to import it in real-time. I can't believe that any production situation could tolerate the overhead of one-commit-per-log-line. So a realistic tool for this is going to have to be able to wrap blocks of maybe 100 or 1000 or so log lines with BEGIN/COMMIT, and that is exactly as difficult as wrapping them with a COPY command. Thus, I disbelieve your argument. We should not be designing this around an assumed use-case that will only work for toy installations. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
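Tom's point, that batching N log lines per transaction is equally easy whether the payload is INSERT statements or a COPY block, can be sketched as follows. The table and column names here are invented for illustration, and a real tool would need to escape quotes and special characters properly:

```python
# Sketch of the batching described above: either way, the pipeline's job is
# to chop the stream into chunks of batch_size lines and wrap each chunk.

def batch_inserts(lines, batch_size=1000, table="log_lines"):
    """Yield SQL chunks, each a single transaction of INSERT statements.
    (No escaping is done here -- illustration only.)"""
    for i in range(0, len(lines), batch_size):
        chunk = ["BEGIN;"]
        for line in lines[i:i + batch_size]:
            chunk.append(f"INSERT INTO {table} (message) VALUES ('{line}');")
        chunk.append("COMMIT;")
        yield "\n".join(chunk)

def batch_copy(lines, batch_size=1000, table="log_lines"):
    """Yield SQL chunks, each a single COPY command -- same batching shape."""
    for i in range(0, len(lines), batch_size):
        body = "\n".join(lines[i:i + batch_size])
        yield f"COPY {table} (message) FROM stdin;\n{body}\n\\."
```

Both generators have identical structure, which is the crux of the argument: once you accept that one-commit-per-line is untenable and must batch, the COPY wrapper costs nothing extra over the BEGIN/COMMIT one.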
Re: [HACKERS] Plan invalidation design
On Sat, 2007-02-17 at 12:48 -0500, Tom Lane wrote: Relcache inval casts a fairly wide net; for example, adding or dropping an index will invalidate all plans using the index's table whether or not they used that particular index, and I believe that VACUUM will also result in a relcache inval due to updating the table's pg_class row. I think this is a good thing though --- for instance, after adding an index it seems a good idea to replan to see if the new index is useful, and replanning after a VACUUM is useful if the table has changed size enough to warrant a different plan. OTOH this might mean that plans on a high-update-traffic table never survive very long because of autovacuum's efforts. If that proves to be a problem in practice we can look at ways to dial down the number of replans, but for the moment I think it's more important to be sure we *can* replan at need than to find ways to avoid replans. Just some info on that: In an update-intensive scenario, I'm seeing VACUUMs every 2 minutes on the heaviest hit tables on CVS HEAD on a medium-powered 4-CPU server. Re-planning multiple queries on 100+ sessions every few minutes would not be good. It seems a reasonable working assumption that HOT will reduce that requirement considerably, but its something to watch. Thanks for drawing attention to it. Presumably ANALYZE would have the same effect? -- Simon Riggs EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Monday 19 February 2007 15:08, Bruce Momjian wrote: Joshua D. Drake wrote: Andrew Sullivan wrote: On Mon, Feb 19, 2007 at 10:33:24AM -0500, Tom Lane wrote: Martijn van Oosterhout kleptog@svana.org writes: Somehow this seems like implementing RAID within postgres, RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. Ok, warning, this is a you know what would be sweet moment. The dreaded words from a developers mouth to every manager in the world. Yea, I just instinctively hit delete when I saw that phrase. Too bad... I know oracle can do what he wants... possibly other db systems as well. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
[HACKERS] HOT WIP Patch - version 2
Reposting - looks like the message did not get through in the first attempt. My apologies if multiple copies are received. This is the next version of the HOT WIP patch. Since the last patch that I sent out, I have implemented the HOT-update chain pruning mechanism. When following a HOT-update chain from an index fetch, if we notice that the root tuple is dead and it is HOT-updated, we try to prune the chain to the smallest possible length. To do that, the share lock is upgraded to an exclusive lock and the tuple chain is followed till we find a live/recently-dead tuple. At that point, the root t_ctid is made to point to that tuple. In order to preserve the xmax/xmin chain, the xmax of the root tuple is also updated to the xmin of the found tuple. Since this xmax is also < RecentGlobalXmin and is a committed transaction, the visibility of the root tuple still remains the same. The intermediate heap-only tuples are removed from the HOT-update chain. The HOT-updated status of these tuples is cleared and their respective t_ctids are made to point to themselves. These tuples are not reachable now and are ready for vacuuming. This entire action is logged in a single WAL record. During vacuuming, we keep track of the number of root tuples vacuumed. If this count is zero, then the index cleanup step is skipped. This would avoid unnecessary index scans whenever possible. This patch should apply cleanly on current CVS head and pass all regression tests. I am still looking for review comments on the first WIP patch. If anyone has already looked through it and is interested in the incremental changes, please let me know. I can post that. What's Next? - ISTM that the basic HOT-updates and the ability to prune the HOT-update chain should help us reduce index bloat, limit the overhead of ctid following in index fetches, and efficiently vacuum heap-only tuples. IMO the next important but rather less troublesome thing to tackle is to reuse space within a block without a complete vacuum of the table. 
This would help us do many more HOT-updates and thus further reduce index/heap bloat. I am thinking of reusing the DEAD heap-only tuples which get removed from the HOT-update chain as part of the pruning operation. Since these tuples, once removed from the chain, are neither reachable nor have any index references, they could be readily used for storing newer versions of the same or other rows in the block. How about setting LP_DELETE on these tuples as part of the prune operation? LP_DELETE is unused for heap tuples, if I am not mistaken. Other information, like length and offset, is maintained as-is. When we run out of space for update-within-the-block, we traverse through all the line pointers looking for LP_DELETEd items. If any of these items have space large enough to store the new tuple, that item is reused. Does anyone see any issue with doing this? Also, any suggestions about doing it in a better way? If the page gets really fragmented, we can try to grab a VACUUM-strength lock on the page and de-fragment it. The lock is tried conditionally to avoid any deadlocks. This is done in the heap_update() code path, so it would add some overhead, but may still prove better than putting the tuple in a different block and having corresponding index insert(s). Also, since we are more concerned about the large tables, the chances of being able to upgrade the exclusive lock to a vacuum-strength lock are high. Comments? If there are no objections, I am planning to work on the first part while Nikhil would take up the second task of block-level retail-vacuum. Your comments on these issues and the patch are really appreciated. Thanks, Pavan -- EnterpriseDB http://www.enterprisedb.com NewHOT-v2.0.patch.gz Description: GNU Zip compressed data ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
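The pruning walk Pavan describes can be modeled abstractly. The sketch below is a toy Python illustration with invented names, not PostgreSQL's actual on-disk data structures: starting from a dead, HOT-updated root, walk the update chain to the first surviving tuple, repoint the root's ctid there, and unlink the intermediate heap-only tuples so they become vacuumable:

```python
# Toy model of HOT chain pruning -- illustration of the idea only.

DEAD, LIVE = "dead", "live"

class HeapTuple:
    def __init__(self, tid, state):
        self.tid = tid
        self.state = state
        self.ctid = tid          # points at itself unless HOT-updated
        self.hot_updated = False

def prune_chain(tuples, root_tid):
    """tuples maps tid -> HeapTuple; returns tids of unlinked (vacuumable)
    intermediate tuples, after repointing the root at the survivor."""
    root = tuples[root_tid]
    pruned = []
    cur = root
    while cur.state == DEAD and cur.hot_updated:
        nxt = tuples[cur.ctid]
        if nxt is cur:
            break                # end of chain
        if cur is not root:
            # Intermediate heap-only tuple: clear its HOT-updated status
            # and make its ctid point to itself, as the message describes.
            cur.hot_updated = False
            cur.ctid = cur.tid
            pruned.append(cur.tid)
        cur = nxt
    root.ctid = cur.tid          # root now points at the surviving tuple
    return pruned
```

With a three-tuple chain (dead root, dead intermediate, live head) the walk unlinks only the intermediate and leaves the root redirecting straight to the live tuple, mirroring the behavior the patch logs in a single WAL record.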
Re: [HACKERS] Multiple Storage per Tablespace, or Volumes
On Monday 19 February 2007 11:27, Martijn van Oosterhout wrote: On Mon, Feb 19, 2007 at 05:10:36PM +0100, Dimitri Fontaine wrote: RAID and LVM too. I can't get excited about re-inventing those wheels when perfectly good implementations already exist for us to sit on top of. I thought moving some knowledge about data availability into PostgreSQL code could provide some valuable performance benefit, allowing it to organize reads (for example parallel table scans/index scans to different volumes) and to obtain data from the 'quicker' known volume (or the least used/loaded one). Well, organising requests to be handled quickly is a function of LVM/RAID, so we don't go there. However, speeding up scans by having multiple requests is an interesting approach, as would perhaps a different random_page_cost for different tablespaces. On one of my systems I have 1 tablespace for read data (99-1), 1 for read-mostly data (90-10), and 1 for write-mostly data (40-60). The breakdown is based on a combination of the underlying hardware and usage patterns of the tables involved. I suspect that isn't that uncommon really. I've often thought that being able to attach GUC variables to a specific tablespace (like you can for users) would allow for a lot of flexibility in tuning queries that go across different tablespaces. -- Robert Treat Build A Brighter LAMP :: Linux Apache {middleware} PostgreSQL ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings