[HACKERS] pg_restore and create FK without verification check
--- ow [EMAIL PROTECTED] wrote:

IMHO, not only does the data need to be loaded before FK constraints are created, but there also has to be a way to create an FK constraint WITHOUT verifying that all loaded/existing records satisfy it. The ability to create an FK constraint without verification of existing records should exist by itself (maybe restricted to superusers) and also as an option in pg_restore. More details:

http://archives.postgresql.org/pgsql-admin/2003-11/msg00308.php
http://archives.postgresql.org/pgsql-admin/2003-11/msg00323.php

It appears there's not a lot of interest in discussing the possibility of FK constraint creation WITHOUT the verification check. How then should one handle the situation with pg_restore and large dbs, where creation of FK constraint(s) may take hours?

Thanks

__ Do you Yahoo!? Free Pop-Up Blocker - Get it now http://companion.yahoo.com/

---(end of broadcast)--- TIP 3: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
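For concreteness, this is what the slow path looks like in 7.4 (table and column names here are purely illustrative); it is the verification scan run by ADD CONSTRAINT that the poster wants to be able to skip:

```sql
-- Illustrative names; the standard way to add an FK after a bulk load.
ALTER TABLE orders
    ADD CONSTRAINT orders_customer_fk
    FOREIGN KEY (customer_id) REFERENCES customers (id);
-- In 7.4 this always performs a full scan of "orders", checking every
-- existing row against "customers".  There is no option to skip that
-- check, which is what makes it so slow on large tables.
```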
[HACKERS] statistics about temp tables ...
Recently I came across a simple issue which made me think. When we create a temp table (SELECT INTO, CREATE TABLE AS), the planner won't know anything about its contents after creating it. Many people use temp tables heavily when the amount of data for a certain analysis has to be reduced significantly, and the same temp table is often queried quite frequently. In order to speed those scenarios up it can be useful to vacuum those temp tables so that the planner will find more clever joins. Is it possible, and does it make sense, to generate those statistics on the fly (during CREATE TABLE AS)? Maybe we could have a GUC which tells the system whether to generate statistics or not.

test=# select * from test;
 id
----
  4
  4
(2 rows)

test=# VACUUM test;
VACUUM
test=# explain select * from test;
                     QUERY PLAN
-----------------------------------------------------
 Seq Scan on test  (cost=0.00..1.02 rows=2 width=4)
(1 row)

test=# select * into tmp from test;
SELECT
test=# explain select * from tmp;
                      QUERY PLAN
------------------------------------------------------
 Seq Scan on tmp  (cost=0.00..20.00 rows=1000 width=4)
(1 row)

Best regards, Hans

-- Cybertec Geschwinde u Schoenig Ludo-Hartmannplatz 1/14, A-1160 Vienna, Austria Tel: +43/2952/30706 or +43/660/816 40 77 www.cybertec.at, www.postgresql.at, kernel.cybertec.at
Re: [HACKERS] pg_restore and create FK without verification check
ow wrote: It appears there's not a lot of interest in discussing the possibility of FK constraint creation WITHOUT the verification check. How then should one handle the situation with pg_restore and large dbs where creation of FK constraint(s) may take hours?

I'd prefer a backup/restore method that dumps physical data, so at restore time there's no need for recreation of FKs. But I didn't see any feedback on this proposal either.

Regards, Andreas
Re: [HACKERS] pg_restore and create FK without verification check
Andreas Pflug wrote on Wed, 26.11.2003 at 12:09: ow wrote: It appears there's not a lot of interest in discussing the possibility of FK constraint creation WITHOUT the verification check. How then should one handle the situation with pg_restore and large dbs where creation of FK constraint(s) may take hours? I'd prefer a backup/restore method that dumps physical data, so at restore time there's no need for recreation of FKs. But I didn't see any feedback on this proposal either.

Was this proposal a separate one from using WAL logs for PITR?

- Hannu
Re: [HACKERS] pg_restore and create FK without verification check
Hannu Krosing wrote: Andreas Pflug wrote on Wed, 26.11.2003 at 12:09: ow wrote: It appears there's not a lot of interest in discussing the possibility of FK constraint creation WITHOUT the verification check. How then should one handle the situation with pg_restore and large dbs where creation of FK constraint(s) may take hours? I'd prefer a backup/restore method that dumps physical data, so at restore time there's no need for recreation of FKs. But I didn't see any feedback on this proposal either. Was this proposal a separate one from using WAL logs for PITR?

Yes, I mentioned it just a few days ago when discussing dependencies in pg_dump. This is somewhat complementary to WAL and PITR. I'm looking for a fast way to dump and restore a complete database, like a physical file copy, without shutting down the backend. I was thinking of a BACKUP command that streams out the files, including any indexes and non-vacuumed tuples. A database recreated from that wouldn't be as clean as a pg_dump/pg_restored database, but it would be up much faster, and there wouldn't be any dependency problem. This doesn't really replace pg_dump/pg_restore, because it probably wouldn't be able to upgrade a cluster. Still, it would be helpful for disaster recovery.

Regards, Andreas
Re: [HACKERS] pg_restore and create FK without verification check
--- Andreas Pflug [EMAIL PROTECTED] wrote: Yes, I mentioned it just a few days ago when discussing dependencies in pg_dump. This is somewhat complementary to WAL and PITR. I'm looking for a fast way to dump and restore a complete database, like a physical file copy, without shutting down the backend. I was thinking of a BACKUP command that streams out the files, including any indexes and non-vacuumed tuples. A database recreated from that wouldn't be as clean as a pg_dump/pg_restored database, but it would be up much faster, and there wouldn't be any dependency problem. This doesn't really replace pg_dump/pg_restore, because it probably wouldn't be able to upgrade a cluster. Still, it would be helpful for disaster recovery.

I think creating an FK without the verification check is still needed, especially when:
1) the original db is corrupted
2) during a cluster upgrade
3) there's a need to BACKUP/RESTORE a *schema* instead of a db.

Thanks
Re: [HACKERS] pg_restore and create FK without verification check
ow wrote: --- Andreas Pflug [EMAIL PROTECTED] wrote: Yes, I mentioned it just a few days ago when discussing dependencies in pg_dump. This is somewhat complementary to WAL and PITR. I'm looking for a fast way to dump and restore a complete database, like a physical file copy, without shutting down the backend. I was thinking of a BACKUP command that streams out the files, including any indexes and non-vacuumed tuples. A database recreated from that wouldn't be as clean as a pg_dump/pg_restored database, but it would be up much faster, and there wouldn't be any dependency problem. This doesn't really replace pg_dump/pg_restore, because it probably wouldn't be able to upgrade a cluster. Still, it would be helpful for disaster recovery. I think creating an FK without the verification check is still needed, especially when: 1) the original db is corrupted 2) during a cluster upgrade

Agreed. This might be useful for replication purposes too; in MSSQL, you can write CREATE TRIGGER ... NOT FOR REPLICATION. I'd like to see a transaction-safe way (an ENABLE/DISABLE TRIGGER command) for this.

3) there's a need to BACKUP/RESTORE a *schema* instead of a db.

Thanks
Re: [HACKERS] 7.4final regression failure on uw713
Thanks Tom, I had a vague memory about that :) Better to check, eh?

On Tue, 25 Nov 2003, Tom Lane wrote: Date: Tue, 25 Nov 2003 19:35:07 -0500 From: Tom Lane [EMAIL PROTECTED] To: [EMAIL PROTECTED] Cc: pgsql-hackers list [EMAIL PROTECTED] Subject: Re: [HACKERS] 7.4final regression failure on uw713

[EMAIL PROTECTED] writes: Don't know if it's bad, but make check reports a regression failure on join.

I believe we'd determined that this is an acceptable platform-specific behavior.

regards, tom lane

-- Olivier PRENANT Tel: +33-5-61-50-97-00 (Work) 6, Chemin d'Harraud Turrou +33-5-61-50-97-01 (Fax) 31190 AUTERIVE +33-6-07-63-80-64 (GSM) FRANCE Email: [EMAIL PROTECTED] -- Make your life a dream, make your dream a reality. (St Exupery)
[HACKERS] protocol compatibility between 7.2 and 7.4
I know this is an attempt to save myself reading the mailing list, but still the issue remains: psql from version 7.4 does not talk to a 7.2.4 database. The CHANGELOG indicates that both server and libraries keep compatibility with versions after 6.3, yet there is no switch in psql to specify use of the pre-7.3 protocol, and the 7.2 server and 7.4 psql apparently do not negotiate.

Daniel
Re: [HACKERS] protocol compatibility between 7.2 and 7.4
Daniel Kalchev writes: The CHANGELOG indicates, that both server and libraries keep compatibility with versions after 6.3 - still there is no switch in psql to specify usage of the pre-7.3 protocol and the 7.2 server and 7.4 psql apparently do not negotiate.

It's automatic. If you disagree, please post details.

-- Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] pg_restore and create FK without verification check
Hello, I was asking about this too, one or two weeks ago.

It appears there's not a lot of interest in discussing the possibility of FK constraint creation WITHOUT the verification check. How then should one handle the situation with pg_restore and large dbs where creation of FK constraint(s) may take hours? I'd prefer a backup/restore method that dumps physical data, so at restore time there's no need for recreation of FKs. But I didn't see any feedback on this proposal either. Was this proposal a separate one from using WAL logs for PITR?

My question then was:

Q2: New situation: Why is it not a good idea to back up the database files of a cluster, including all clog and xlog files (log files last), to get a physical hot backup? In principle it is the same situation as a server crash (not all at once, but spread over some time). After restoring, it should redo and roll back automatically, just like after a crash. This method (physical hot backup) would improve backup and restore times dramatically.

The answer from Robert Treat was: Essentially I think you're right, it should behave much like a crashing server. The main reason people don't recommend it is that (depending on your OS setup) there is the potential to lose data that has been committed but not actually written to disk. Note that you shouldn't get corrupted data from this, but in many cases losing data is just as bad, so we don't recommend it. If you really want to do this, you should either shut down the database or get LVM going.

I have not yet done many tests, but in principle I was able to hot-backup a cluster (or only one database) and restore it. But Robert's answer makes me a little afraid. To me it means they do not trust their own recovery mechanism. A definitive answer from Robert is still outstanding. In my opinion a high-grade professional database system (which PostgreSQL is, or wants to be) should have some hot backup features. Otherwise you are NEVER able to handle VLDBs.
They were discussing a 32 TB PostgreSQL database. And I bet my next bonus this year :-) that they are also not able to back up and restore it in a reasonable time.

Regards Oli

--- Oli Sennhauser Database-Engineer (Oracle PostgreSQL) Rebenweg 6 CH - 8610 Uster / Switzerland Phone (+41) 1 940 24 82 or Mobile (+41) 79 450 49 14 e-Mail [EMAIL PROTECTED] Website http://mypage.bluewin.ch/shinguz/PostgreSQL/ Secure (signed/encrypted) e-Mail with a Free Personal SwissSign ID: http://www.swisssign.ch Import the SwissSign Root Certificate: http://swisssign.net/cgi-bin/trust/import
Re: [HACKERS] pg_restore and create FK without verification check
Q2: New situation: Why is it not a good idea to back up the database files of a cluster, including all clog and xlog files (log files last), to get a physical hot backup? In principle it is the same situation as a server crash (not all at once, but spread over some time). After restoring, it should redo and roll back automatically, just like after a crash. This method (physical hot backup) would improve backup and restore times dramatically. The answer from Robert Treat was: Essentially I think you're right, it should behave much like a crashing server. The main reason people don't recommend it is that (depending on your OS setup) there is the potential to lose data that has been committed but not actually written to disk. Note that you shouldn't get corrupted data from this, but in many cases losing data is just as bad, so we don't recommend it. If you really want to do this, you should either shut down the database or get LVM going.

The key issue here is to have a pg_control file to start from, with a finished checkpoint from before you start the backup. Then you need to ensure that you have all logfiles from that checkpoint until the backup finishes. The last thing to back up must be the last active xlog. It would probably be a good idea not to have a vacuum running concurrently :-) And then you need to do a lot of testing, since nobody else does it yet. I think this is an issue: since it is such high risk, nobody will step up easily and say that it is safe.

Andreas
Re: [HACKERS] pg_restore and create FK without verification check
Andreas Pflug [EMAIL PROTECTED] writes: This is somewhat complementary to WAL and PITR. I'm seeking for a fast way to dump and restore a complete database, like physical file copy, without shutting down the backend. I was thinking of a BACKUP command that streams out the files including any indexes and non-vacuumed tuples. A database recreated from that wouldn't be as clean as a pg_dump/pg_restored database, but it would be up much faster, and there wouldn't be any dependency problem.

It's already intended to support this as part of the PITR work. The idea is you force a checkpoint and then make a tar-format dump of the database tree (tar or whatever floats your boat, but anyway a filesystem-level backup). The database need not be stopped while you do this, and you don't need a filesystem that can do snapshots or anything fancy like that. The tar dump itself most likely does not represent a consistent state of the database by the time you are done making it. That is okay, because you have also been archiving off to tape (or someplace) all the WAL data generated since that pre-dump checkpoint. You can continue archiving the WAL series for however far forward from the original dump you feel like.

If you need to recover, you reload the database from the tar dump and then replay the WAL series against it. This is indistinguishable from a crash recovery situation --- the inconsistent tar dump looks just like a disk that has received some but not all of the updates since the last checkpoint. Replay will fix it. The cool thing about this is you can actually bring the DB to the state it was in at any specific point in time covered by your WAL archive --- just run the WAL replay as far as you want, then stop. Solves the "junior DBA deleted all my data Saturday morning" problem, thus PITR. Now the uncool thing is you need massive amounts of secondary storage to archive all that WAL data, if your installation has heavy update activity.
But it seems to me it would address the need you mention above --- you'd just not bother to continue archiving WAL past the end of the dump operation. In principle you could do this today, but we don't have enough support code in place to make it work smoothly; e.g., WAL segment files aren't labeled with enough identifying information to let you manage an archive full of 'em. Still, it doesn't seem that far away.

This doesn't really replace pg_dump/pg_restore, because it probably wouldn't be able to upgrade a cluster.

Right, any such physical dump would be limited to restoring a whole cluster as-is: no imports into other clusters, no selectivity, no fancy games. The main reason is you'd have to dump and restore pg_clog along with the data files.

regards, tom lane
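The first step of the procedure Tom describes can be sketched as follows; the CHECKPOINT is real SQL, while the filesystem copy and WAL archiving (shown as shell comments, with invented paths) happen outside the database and are only illustrative of the idea, not a supported procedure in 7.4:

```sql
-- Force a checkpoint so the base backup has a known starting point.
CHECKPOINT;
-- Then, at the shell (illustrative, assuming $PGDATA points at the cluster):
--   tar cf /backup/base.tar $PGDATA
--   # ...and continuously archive WAL segment files from $PGDATA/pg_xlog,
--   # from the checkpoint onward; replaying them over the restored tar
--   # dump brings the cluster to a consistent state, as with crash recovery.
```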
Re: [HACKERS] statistics about temp tables ...
Hans-Jürgen Schönig [EMAIL PROTECTED] writes: Recently I have come across a simple issue which made me think about it. When we create a tmp table (SELECT INTO, CREATE TABLE AS) the planner won't know anything about its content after creating it.

Run ANALYZE on the temp table, if you intend to use it enough to justify gathering stats about it. VACUUM is more work than needed.

regards, tom lane
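Tom's suggestion, as a psql sketch using the same illustrative tables as the original message:

```sql
SELECT * INTO tmp FROM test;
EXPLAIN SELECT * FROM tmp;  -- planner falls back to default estimates
ANALYZE tmp;                -- gathers statistics without VACUUM's extra work
EXPLAIN SELECT * FROM tmp;  -- row estimates now reflect the table's contents
```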
Re: [HACKERS] Providing anonymous mmap as an option of sharing memory
Tom Lane wrote: Shridhar Daithankar [EMAIL PROTECTED] writes: I was looking thr. the source and thought it would be worth to seek opinion on this proposal. This has been discussed and rejected before. See the archives.

I went through this for details: http://developer.postgresql.org/cvsweb.cgi/pgsql-server/doc/TODO.detail/mmap

There seem to be two objections to mmap:

1. If a backend from a previously crashed postmaster still exists, it might have files etc. open, and that is in general not such a good idea.

2. For replacing stdio for data and WAL files with mmap, mmap does not guarantee the ordering of I/O, which defeats WAL.

I covered only the first point in my post. IMO it is not such an unsolvable problem. If a postmaster crashes hard but leaves a backend running, would it clean up the pid file etc.? I don't think so. So if a postmaster can start in a 'pid-clean' state, then it is guaranteed that there are no children left around.

There were issues with Linux not supporting MAP_SHARED and MAP_ANONYMOUS simultaneously, but those are quite old messages, from 1998, talking about Linux 2.0.x. I don't think that is still true, but I need to check. Too bad FreeBSD's MAP_NOSYNC is not standard; otherwise even the second point could have been addressed.

Did I miss something?

Shridhar
Re: [HACKERS] pg_restore and create FK without verification check
--- Tom Lane [EMAIL PROTECTED] wrote: This doesn't really replace pg_dump/pg_restore, because it probably wouldn't be able to upgrade a cluster. Right, any such physical dump would be limited to restoring a whole cluster as-is: no imports into other clusters, no selectivity, no fancy games. The main reason is you'd have to dump and restore pg_clog along with the data files.

But that would not help people who HAVE to use pg_dump/pg_restore (e.g. to backup/restore a single schema), would it? Depending on the db size, etc., creation of FK constraint(s) may take many hours. How should this be handled then?

Thanks
[HACKERS] NetBSD Sparc OK
FYI, I've successfully tested on NetBSD Sparc and added it to the supported list for 7.4.1. -- Peter Eisentraut [EMAIL PROTECTED]
Re: [HACKERS] detecting poor query plans
Neil == Neil Conway [EMAIL PROTECTED] writes: Neil It occurred to me that these kinds of poor planning Neil decisions could easily be detected by PostgreSQL itself: Neil after we've finished executing a plan, we can trivially Neil compare the # of results produced by each node in the query Neil tree with the # of results the planner expected that node to Neil produce (look at EXPLAIN ANALYZE, for example). If the

Indeed. This is the approach being followed by the LeO project (Learning Optimizer) at IBM Almaden. http://www.almaden.ibm.com/software/dm/SMART/leo.shtml There is a VLDB paper that describes it.

-- Pip-pip Sailesh http://www.cs.berkeley.edu/~sailesh
Re: [HACKERS] detecting poor query plans
Neil Conway [EMAIL PROTECTED] writes: It occurred to me that these kinds of poor planning decisions could easily be detected by PostgreSQL itself: after we've finished executing a plan, we can trivially compare the # of results produced by each node in the query tree with the # of results the planner expected that node to produce (look at EXPLAIN ANALYZE, for example). If the estimate is off by a significant margin (say, 300%), we could perhaps emit a HINT suggesting that the user re-run ANALYZE

I think such a thing would have such a low signal-to-noise ratio as to be useless :-(. As you note, there are many places where the planner's estimate is routinely off by more than 3x (or any other threshold you might pick instead). In some situations that doesn't really matter, as the same plan would have gotten picked anyway. Also, since 7.2 came out what we've seen more and more is cases where the row count estimate is acceptably good, but the wrong plan was picked anyway because of deficiencies in the cost equations. The question you really want to know about is not whether the row count estimate is close, it's whether another plan could have done better.

regards, tom lane
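A hedged illustration of the discrepancy being discussed (the query, plan, and numbers below are invented): EXPLAIN ANALYZE already prints the planner's estimate and the actual row count side by side, which is exactly the comparison Neil proposes automating.

```sql
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;
-- Hypothetical output:
--   Index Scan using orders_customer_idx on orders
--     (cost=0.00..17.07 rows=5 width=36)
--     (actual time=0.041..0.812 rows=1500 loops=1)
-- "rows=5" estimated vs "rows=1500" actual is the kind of 300x error a
-- post-execution hint could flag, though as Tom notes the same plan might
-- have been chosen anyway, making the hint noise in that case.
```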
Re: [HACKERS] A rough roadmap for internationalization fixes
There are no such libraries. I keep hearing ICU, but that is much too bloated.

At least it is kind of a standard and something that will be maintained for the foreseeable future; it also has a compatible license and is available on all platforms of interest to PostgreSQL. And it is used for DB2 and Informix, so it must be quite feature-complete for DB-relevant stuff.

Andreas
Re: [HACKERS] Providing anonymous mmap as an option of sharing memory
Shridhar Daithankar [EMAIL PROTECTED] writes: I covered only first point in my post. IMO it is not such an unsolvable problem. If a postmaster crashes hard but leaves a backend running, would it clean pid file etc? I don't think so. So if a postmaster can start on a 'pid-clean' state, then it is guaranteed to be no childs left around.

And that helps how? The problem is to detect whether there are any children left from the old postmaster, when what you have to work from is the pid-file it left behind. In any case, you're still handwaving away the very real portability issues around mmap. Linux is not the universe, and Linux+BSD isn't either. We might still have considered it, despite the negatives, if anyone had been able to point to any actual *advantages* of mmap. There are none. Yes, the SysV shmem API is old and ugly and crufty, but it does what we need it to do.

regards, tom lane
Re: [HACKERS] detecting poor query plans
Neil Conway [EMAIL PROTECTED] writes: It occurred to me that these kinds of poor planning decisions could easily be detected by PostgreSQL itself: after we've finished executing a plan, we can trivially compare the # of results produced by each node in the query tree with the # of results the planner expected that node to produce

There's a dual to this as well. If the results were very close but the actual time taken to run the node doesn't match the cost calculated then some optimizer parameter needs to be adjusted. Either one of the cost_* parameters or random_page_cost, or effective_cache_size or... I'm not sure it's as obvious what to put in the HINT though. Ideally these results would have to be gathered and pumped through linear optimization algorithms which is a lot more work.

-- greg
Re: [HACKERS] pg_restore and create FK without verification check
ow [EMAIL PROTECTED] writes: --- Tom Lane [EMAIL PROTECTED] wrote: Right, any such physical dump would be limited to restoring a whole cluster as-is: no imports into other clusters, no selectivity, no fancy games. But that would not help people who would HAVE to use pg_dump/pg_restore (e.g. to backup/restore a single schema), would it? Depending on the db size, etc., creation of FK constraint(s) may take many hours. How should this be handled then?

Quite honestly, I think they should check their foreign keys. In a partial restore situation there is no guarantee that the referenced table and the referencing table are being restored at the same time from the same dump. An override in that situation looks like a great tool for shooting yourself in the foot. People might be more interested in debating this topic with you if we hadn't discussed it at length just a couple months back. There wasn't consensus then that we had to offer an escape hatch, and you've not offered any argument that wasn't made before.

regards, tom lane
Re: [HACKERS] 7.4final regression failure on uw713
On Tue, 25 Nov 2003, Tom Lane wrote: [EMAIL PROTECTED] writes: Don't know if it's bad, but make check reports a regression failure on join. I believe we'd determined that this is an acceptable platform-specific behavior.

Yep. Due to the new qsort() in 713UP3.

LER

regards, tom lane

-- Larry Rosenman http://www.lerctr.org/~ler Phone: +1 972-414-9812 E-Mail: [EMAIL PROTECTED] US Mail: 1905 Steamboat Springs Drive, Garland, TX 75044-6749
Re: [HACKERS] detecting poor query plans
Tom Lane [EMAIL PROTECTED] writes: I think such a thing would have such a low signal-to-noise ratio as to be useless :-(. As you note, there are many places where the planner's estimate is routinely off by more than 3x (or any other threshold you might pick instead).

I wonder, perhaps we could add a certainty parameter to the estimated query plan + result sizes + costs produced by the planner. That way, when we run into a planner deficiency we can basically mark the relevant portion of the query tree as a WAG, and not bother with emitting hints for it.

In some situations that doesn't really matter, as the same plan would have gotten picked anyway.

The hint is NOT "the chosen plan was non-optimal"; the hint is "the query planner did not produce an accurate row count estimate for this node in the query tree". The chosen query plan may or may not be optimal -- we're merely pointing out that we chose the plan we did on shaky grounds. The hint might just as well indicate a problem with another query that happens to apply a similar predicate to the column in question.

The question you really want to know about is not whether the row count estimate is close, it's whether another plan could have done better.

Perhaps, but is there a reasonable way to answer the second question?

-Neil
[HACKERS] Materialized views proposal
I apologize if this post is inappropriate. Doing some development work, my co-worker and I discussed some optimization strategies. One of the ideas that came up was materialized views. Trading disk space to summarize queries, and paying for a trigger waterfall on certain kinds of updates, seems like a small price to pay right now in our situation. I searched through the archives and couldn't seem to find anything relevant.

As I have no familiarity with the internals, but I do know the front end pretty well, what can I do to help implement materialized views? Is such an idea feasible? Is there some reason why it just isn't a good idea? (And if not, what is a better idea?) If someone is willing to guide me, I can probably contribute a fair amount of C code; I do have experience with C.

The bird's-eye view of the project would probably be turning a statement (is there such a statement in SQL92?) into the creation of a table, the initial population of the table, and the creation of several strategically placed triggers so that the data in the materialized view table is always in sync. Direct inserts or updates on the materialized view should be illegal, except by the triggers. However, perhaps as views work today, we could allow rules to be added to the table. Certain restrictions should be enforced on materialized views: no mutable (volatile) functions, in particular.

-- Jonathan Gardner [EMAIL PROTECTED] Live Free, Use Linux!
Re: [HACKERS] pg_restore and create FK without verification check
--- Tom Lane [EMAIL PROTECTED] wrote: Quite honestly, I think they should check their foreign keys.

Generally speaking, I agree. The problem is that verification of FK constraint(s) may take too long, depending on the size of the db and other conditions. In my case, on test data, it takes about 1 hour to create tables and copy the data, then about 40 min to create indexes, then 4.5 hours to create one (1) FK constraint. In production, we'll have 10-100x more data than we have for testing. If we have a problem in production, the time necessary to restore the db is simply going to kill us.

People might be more interested in debating this topic with you if we hadn't discussed it at length just a couple months back. There wasn't consensus then that we had to offer an escape hatch, and you've not offered any argument that wasn't made before.

I'm simply presenting a problem for which I currently do not see any solution (it's very important for us to be able to restore the db within a reasonable amount of time). If there's no solution and none is planned, then we cannot use pgsql, can we?

Thanks
[HACKERS] building outside source tree
Building PostgreSQL outside the source tree is slightly broken (using the current CVS HEAD code):

$ mkdir ~/test-pg-build
$ cd ~/test-pg-build
$ ../pgsql/configure
[output omitted]
$ make
[output omitted; make succeeds]
$ touch ../pgsql/configure   # i.e. cvs up
$ make

The final make command invokes 'config.status --recheck', which invokes '../pgsql/configure --no-create --no-recursion'. The re-run of configure succeeds, but the rest of the make fails abruptly:

make: *** No rule to make target `src/Makefile.global.in', needed by `src/Makefile.global'.  Stop.

When 'configure' has been updated, we use config.status to re-run configure with the arguments that were previously specified. The problem is caused by the '--no-create' flag passed to configure by config.status: in configure.in we don't set the 'vpath_build' variable if '--no-create' has been specified. (Therefore, when src/Makefile.global is recreated after configure is re-run, the 'vpath_build' variable isn't set, and a vpath build, i.e. a build outside the source tree, obviously fails.)

Any comments on how we should fix this?

-Neil
Re: [HACKERS] Materialized views proposal
Jonathan Gardner wrote on Wed, 26.11.2003 at 19:03:
> I apologize if this post is inappropriate. Doing some development work, my co-worker and I discussed some optimization strategies. One of the ideas that came up was materialized views. Trading disk space to summarize queries, and paying for a trigger waterfall on certain kinds of updates, seems like a small price to pay right now in our situation. I searched through the archives and I couldn't seem to find anything relevant. As I've no familiarity with the internals, but I do know the front end pretty well, what can I do to help implement materialized views?

First, you could start by implementing materialized views manually, using tables and triggers, to get a feel for what should be generated.

Next, still working from the frontend, try to write some code (probably in some RAD/scripting language) which does the transform from CREATE VIEW (or whatever) syntax (or a manually constructed syntax tree) into CREATE TABLE + CREATE TRIGGER statements.

If you have the above working well, talk about moving this to the backend.

> Is such an idea feasible? Is there some reason why it just isn't a good idea? (And if not, what is a better idea?) If someone is willing to guide me, I can probably contribute a fair amount of C code. I do have experience with C.

What is needed is good algorithms; writing the C code is secondary to that. A similar problem has kept us from implementing updatable views for quite some time.

Hannu
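Hannu's first step, a hand-rolled materialized view kept in sync by triggers, might look like the following sketch. All names are invented for illustration, and only INSERT is handled; UPDATE and DELETE on the base table would need similar triggers (the "trigger waterfall" Jonathan mentions).

```sql
-- Base table and its hand-maintained summary ("materialized view"):
CREATE TABLE orders (customer_id integer, amount numeric);
CREATE TABLE order_totals (customer_id integer PRIMARY KEY, total numeric);

-- Keep the summary in sync on INSERT.
CREATE FUNCTION order_totals_sync() RETURNS trigger AS '
BEGIN
    UPDATE order_totals SET total = total + NEW.amount
     WHERE customer_id = NEW.customer_id;
    IF NOT FOUND THEN
        INSERT INTO order_totals VALUES (NEW.customer_id, NEW.amount);
    END IF;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER order_totals_trg AFTER INSERT ON orders
    FOR EACH ROW EXECUTE PROCEDURE order_totals_sync();
```

Doing a few of these by hand shows exactly what statements a CREATE MATERIALIZED VIEW transform would have to generate.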
Re: [HACKERS] pg_restore and create FK without verification check
ow writes:
> --- Tom Lane wrote:
>> Quite honestly, I think they should check their foreign keys.

What should I do if I *know* there will be an FK failure but I want to correct it manually? Perhaps by creating all the necessary target records, perhaps by deleting or updating the dead references, perhaps with a mix of these. As it stands I have to delete the FK constraint, load the table, and fix the data. Then I recreate the FK constraint -- with the danger that I'll get the definition wrong -- and wait for the constraint to be verified.

If I could disable and re-enable the constraint, the danger that I would get the definition wrong would be eliminated. And if I had already done the work to ensure there were no broken relationships, I would optionally be able to skip the redundant automatic check. I could even have done the verification myself while the data wasn't live, for example.

The database is a tool. It's annoying to have a tool that tries to outsmart the user.

-- greg
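The workflow Greg describes as the status quo might be sketched like this (table, column, and file names invented for illustration):

```sql
-- Today's workaround: drop the FK, load, repair, recreate -- and re-verify.
ALTER TABLE orders DROP CONSTRAINT orders_customer_fk;
COPY orders FROM '/load/orders.dat';

-- Fix the dangling references by hand, e.g.:
DELETE FROM orders
 WHERE customer_id NOT IN (SELECT id FROM customers);

-- Recreate the constraint, with the risk of getting the definition wrong,
-- and pay for the full verification scan again:
ALTER TABLE orders ADD CONSTRAINT orders_customer_fk
  FOREIGN KEY (customer_id) REFERENCES customers (id);
```

A disable/re-enable facility would remove both the redefinition risk and, optionally, the redundant final scan.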
Re: [HACKERS] pg_restore and create FK without verification check
Tom Lane wrote:
> Andreas Pflug writes:
>> This is somewhat complementary to WAL and PITR. I'm seeking a fast way to dump and restore a complete database, like a physical file copy, without shutting down the backend. I was thinking of a BACKUP command that streams out the files including any indexes and non-vacuumed tuples. A database recreated from that wouldn't be as clean as a pg_dump/pg_restored database, but it would be up much faster, and there wouldn't be any dependency problem.
>
> It's already intended to support this as part of the PITR work. The idea is you force a checkpoint and then make a tar-format dump of the database tree (tar or whatever floats your boat, but anyway a filesystem-level backup). The database need not be stopped while you do this, and you don't need a filesystem that can do snapshots or anything fancy like that.
>
> The tar dump itself most likely does not represent a consistent state of the database by the time you are done making it. That is okay, because you have also been archiving off to tape (or someplace) all the WAL data generated since that pre-dump checkpoint. You can continue archiving the WAL series for however far forward from the original dump you feel like. If you need to recover, you reload the database from the tar dump and then replay the WAL series against it. This is indistinguishable from a crash-recovery situation --- the inconsistent tar dump looks just like a disk that has received some but not all of the updates since the last checkpoint. Replay will fix it.
>
> The cool thing about this is you can actually bring the DB to the state it was in at any specific point in time covered by your WAL archive --- just run the WAL replay as far as you want, then stop. Solves the "junior DBA deleted all my data Saturday morning" problem, thus PITR.
>
> Now the uncool thing is you need massive amounts of secondary storage to archive all that WAL data,

Shouldn't be a problem, since there are few databases out there worldwide exceeding today's average disk capacity...

> if your installation has heavy update activity. But it seems to me it would address the need you mention above --- you'd just not bother to continue archiving WAL past the end of the dump operation.

PITR is cool, no question; it's more than I've been requesting. When the database server burns, I'd be quite happy if I could restore to my latest tape's point in time, since the WAL log disk probably isn't functional either. So having a fast backup of the snapshot taken when the backup CHECKPOINT was issued would be enough, no WAL replay needed.

> In principle you could do this today, but we don't have enough support code in place to make it work smoothly, e.g. WAL segment files aren't labeled with enough identifying information to let you manage an archive full of 'em. Still, it doesn't seem that far away.

So I issue CHECKPOINT and tar the cluster or database. Still, I've got two questions:
- How do I restore a single database?
- While tar is running, CHECKPOINT(n+1) might be recorded in some files, while others have CHECKPOINT(n). How does the backend know to roll back to CHECKPOINT(n)?

Regards, Andreas
Re: [HACKERS] detecting poor query plans
On Wed, Nov 26, 2003 at 11:59:33AM -0500, Neil Conway wrote:
> In some situations that doesn't really matter, as the same plan would have gotten picked anyway.

The hint is NOT "the chosen plan was non-optimal"; the hint is "the query planner did not produce an accurate row count estimate for this node in the query tree". Maybe it could only be done for SeqScan and IndexScan nodes, which are probably the most common source of bad estimates related to poor statistics.

-- Alvaro Herrera (alvherre[a]dcc.uchile.cl)
[HACKERS] Optimizer cost calculation
It seems to me that the root cause of some of the optimizer failures that come up is the optimizer's attempt to assign a single expected cost value to every choice. In fact it seems it should also have a minimum cost and a maximum cost in addition to the expected cost.

Often the optimizer is faced with two possible choices, one of which appears slightly better but in fact has a much larger failure mode than the alternative. For example, consider a simple join between two tables where the optimizer thinks approximately 10% of the second table will be used. It will probably be just at the threshold of using a full table scan with a merge or hash join instead of the simple nested loop. In fact the nested loop has only a small linear penalty (2-4 times slower even if the *entire* table is read) if it's mistakenly chosen, whereas if the selectivity is estimated wrong and only a few records are needed, the full table scan can be thousands of times slower than the nested loop.

If the nested loop calculated its min/max/expected costs based on 1 row, the full table, and the expected number of records, and found that while its expected value is slightly higher than the merge join's, its maximum is only 2x higher than the merge join's but its minimum is thousands of times smaller, then it should consider choosing the nested loop because of the greater risk of choosing the merge join.

-- greg
Re: [HACKERS] pg_restore and create FK without verification check
Greg Stark wrote:
> If I could disable and re-enable the constraint, the danger that I would get the definition wrong would be eliminated. And if I had already done the work to ensure there were no broken relationships, I would optionally be able to skip the redundant automatic check. I could even have done the verification myself while the data wasn't live, for example.

Since FKs are implemented as triggers, you could disable all triggers on the table right now, no? It could be a bit more comfortable, I agree, and I hope for an upcoming DISABLE TRIGGER command.

While talking about this: I could add ENABLE/DISABLE TRIGGER functionality to pgadmin3. Unfortunately, pg_trigger.tgenabled still carries the comment "not presently checked everywhere it should be", so disabling a trigger by setting this to false does not work reliably. I wonder if this is still true for 7.4. I can't imagine that this should be so hard to fix.

Regards, Andreas
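The manual trigger-disabling Andreas alludes to could be sketched as below. This is only an illustration of the idea (table name invented), and it carries exactly the caveat from the message: pg_trigger.tgenabled is "not presently checked everywhere it should be", so this may not be fully reliable, and it requires superuser catalog updates.

```sql
-- Disable all triggers (including the FK check triggers) on one table:
UPDATE pg_trigger SET tgenabled = false
 WHERE tgrelid = (SELECT oid FROM pg_class WHERE relname = 'orders');

-- ... bulk load the data without per-row FK checks ...

-- Re-enable them afterwards:
UPDATE pg_trigger SET tgenabled = true
 WHERE tgrelid = (SELECT oid FROM pg_class WHERE relname = 'orders');
```

A proper DISABLE TRIGGER command would make this safe and visible instead of a raw catalog hack.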
Re: [HACKERS] pg_restore and create FK without verification check
Andreas Pflug writes:
>> In principle you could do this today, but we don't have enough support code in place to make it work smoothly, e.g. WAL segment files aren't labeled with enough identifying information to let you manage an archive full of 'em. Still, it doesn't seem that far away.
>
> So I issue CHECKPOINT and tar the cluster or database. Still, I've got two questions:
> - how to restore a single database

You don't. As I said, any physical backup is going to be all-or-nothing. These techniques are not a replacement for pg_dump.

> - while tar is running, CHECKPOINT(n+1) might be recorded in some files, while others have CHECKPOINT(n). How does the backend know to roll back to CHECKPOINT(n)?

That's part of the management code that we need to write before this will really be very useful; you need to be able to associate the starting time of a tar dump with the most recent previous CHECKPOINT in the WAL logs. Not hard in principle, just takes some code we ain't got yet.

regards, tom lane
Re: [HACKERS] Providing anonymous mmap as an option of sharing memory
Tom Lane wrote:
> Shridhar Daithankar writes:
>> I covered only the first point in my post. IMO it is not such an unsolvable problem. If a postmaster crashes hard but leaves a backend running, would it clean the pid file etc.? I don't think so. So if a postmaster can start in a 'pid-clean' state, then it is guaranteed there are no children left around.
>
> And that helps how? The problem is to detect whether there are any children left from the old postmaster, when what you have to work from is the pid-file it left behind.
>
> In any case, you're still handwaving away the very real portability issues around mmap. Linux is not the universe, and Linux+BSD isn't either. We might still have considered it, despite the negatives, if anyone had been able to point to any actual *advantages* of mmap. There are none. Yes, the SysV shmem API is old and ugly and crufty, but it does what we need it to do.

Plus, many operating systems can lock SysV shmem into RAM to prevent it from being swapped out.

-- Bruce Momjian   http://candle.pha.pa.us
Re: [HACKERS] pg_restore and create FK without verification check
Tom Lane wrote:
>> - how to restore a single database
>
> You don't. As I said, any physical backup is going to be all-or-nothing. These techniques are not a replacement for pg_dump.

That's sad. I've been backing up and restoring single databases from a cluster frequently, so I'd really like the database to be selectable.

> That's part of the management code that we need to write before this will really be very useful; you need to be able to associate the starting time of a tar dump with the most recent previous CHECKPOINT in the WAL logs. Not hard in principle, just takes some code we ain't got yet.

So I'll lay back patiently (more or less :-)

Regards, Andreas
Re: [HACKERS] pg_restore and create FK without verification check
On Wed, 26 Nov 2003, ow wrote:
>> People might be more interested in debating this topic with you if we hadn't discussed it at length just a couple months back. There wasn't consensus then that we had to offer an escape hatch, and you've not offered any argument that wasn't made before.
>
> I'm simply presenting a problem for which I currently do not see any solution (it's very important for us to be able to restore the db within a reasonable amount of time). If there's no solution and none is planned, then we cannot use pgsql, can we?

You can make your own solution; that's the nice thing about open source stuff. If you wanted to go the SET variable route to control alter-time checks of CHECK and FOREIGN KEY constraints, it's almost certainly less than an hour's worth of work.
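The "SET variable route" Stephan mentions might look something like the following. To be clear, no such variable exists in PostgreSQL; the GUC name and behavior here are purely hypothetical, illustrating the kind of escape hatch being discussed.

```sql
-- Hypothetical: skip_constraint_validation is an invented GUC name.
SET skip_constraint_validation = on;

-- The constraint would be recorded without the full verification scan:
ALTER TABLE orders ADD CONSTRAINT orders_customer_fk
  FOREIGN KEY (customer_id) REFERENCES customers (id);

RESET skip_constraint_validation;
```

pg_restore could then emit the SET/RESET pair around constraint creation when asked to.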
Re: [HACKERS] ALTER COLUMN/logical column position
Hannu Krosing wrote:
> You are just shifting the interface problems to a place needing way more changes in the backend. There will be some problems either way.

Not quite. Certainly, basing internal storage on attstoragenum is more work in the backend, but less (precisely: zero) work on an unknown number of frontend tools and apps.

> Also, tools needing knowledge should start using the information schema as much as they can, making internal reshufflings less of a problem.

We had this discussion. information_schema doesn't deliver enough info needed for admin tools.

>> This way, the user interface doesn't change, and all those SELECT ... FROM pg_attribute ORDER BY attnum continue delivering the expected result.
>
> Depending on what you expect ;)

Usually, nobody should care about the column ordering, but for those unfortunate guys that rely on a specific SELECT * ordering, the list of columns displayed in admin tools must show that ordering; this is what current admin tools expect from attnum. No SQL user would ever care about internal storage details/pointers/counters, so any admin tool would need to ORDER BY CASE WHEN version=7.5 THEN attpos ELSE attnum END (and the unique key of pg_attribute, as seen from the tool, changes from refoid/attnum to refoid/attindex too).

> If you expect the above to give you all active columns ordered as they are stored, then it does not give you what you expect. Btw, most of these concerns (and more) were already iterated when DROP COLUMN was done, causing gaps in attnum. There were a lot of doomsday prophecies, but in the end it went quite smoothly.

I don't mind missing attnum values; even 1,2,3,5,6,8 is nicely ordered.

> The tools needing internal knowledge about storage (meaning any tool doing select .. from pg_...) have always needed some upgrades for new versions.

Yes, but changes to pg_... should retain the usual meanings as much as possible, so older tools continue to work. The discussed change is problematic because old tools would *seem* to work OK, but their attnum interpretation would be wrong.

> IMHO, the only behaviour visible to the common user we should worry about is SELECT *, and a special column for solving this is _the_ easiest way to do it.

Surely this is the easiest way, but it has the biggest impact on clients too. I'm just imagining what would happen to pgAdmin3. The column number would have to display attpos (this is what the user is interestedted in to see the ordering), while index, FK and so forth would continue to display attnum. This seems quite unwanted to me.

---

Are there any comments on the proposed lean way to alter columns for trivial type changes?

Regards, Andreas
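The kind of catalog query the thread keeps referring to, issued by admin tools today, can be written concretely. A sketch for 7.3-and-later catalogs, with an invented table name; note attnum may have gaps after DROP COLUMN (e.g. 1,2,3,5,6,8) but still yields the SELECT * ordering:

```sql
-- What admin tools expect attnum to mean: the visible column order.
SELECT attnum, attname, atttypid
  FROM pg_attribute
 WHERE attrelid = (SELECT oid FROM pg_class WHERE relname = 'mytable')
   AND attnum > 0            -- skip system columns
   AND NOT attisdropped      -- skip dropped columns (7.3+)
 ORDER BY attnum;
```

Under the discussed proposal, the ORDER BY would have to become version-dependent (attpos vs. attnum), which is exactly the client-side breakage Andreas is worried about.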
Re: [HACKERS] 4 Clause license?
On Thu, 2003-11-20 at 08:34, Rod Taylor wrote:
>> I think maybe the simplest thing is for me to prepare a patch that rips that code out and replaces it with a (slightly simpler -- less umask hacking required, I think) piece of code that I will write.
>
> The FreeBSD folks sorted it out for us. Everyone's names should be in the copyright for the file. The licence portion should be the 3-clause version -- no advertising clause. I think borrowing should be encouraged, and now that we know what license/copyright we need to carry over, this can be done without worry.

Ignore that.. I hadn't read Terry's statement yet -- it got snagged by the spam filter for some reason.

Since we have other code which has the same issue, I think this needs to be solved. Can anyone show what Apple did?

-- Rod Taylor
Re: [HACKERS] A rough roadmap for internationalization fixes
On Tue, Nov 25, 2003 at 04:19:05PM -0500, Tom Lane wrote:
> UCS-2 is impractical without some *extremely* wide-ranging changes in the backend. To take just the most obvious point, doesn't it require allowing embedded zero bytes in text strings?

If you're going to use Unicode in the rest of the backend, you'll have to be able to deal with them anyway. You can't use normal C string functions.

Kurt
Re: [HACKERS] pg_restore and create FK without verification check
On Wed, 26 Nov 2003, Tom Lane wrote:
> Quite honestly, I think they should check their foreign keys. In a partial-restore situation there is no guarantee that the referenced table and the referencing table are being restored at the same time from the same dump. An override in that situation looks like a great tool for shooting yourself in the foot.
>
> People might be more interested in debating this topic with you if we hadn't discussed it at length just a couple months back. There wasn't consensus then that we had to offer an escape hatch, and you've not offered any argument that wasn't made before.

I actually thought the majority in the past discussion thought that an escape hatch was a good idea, but that the discussion broke down in trying to determine what sort of hatch that might be (IIRC, it got off into a general discussion of disabling constraints in normal operation as opposed to at ALTER time).
Re: [HACKERS] detecting poor query plans
Greg Stark writes:
> There's a dual to this as well. If the results were very close but the actual time taken to run the node doesn't match the calculated cost, then some optimizer parameter needs to be adjusted.

I was thinking about this, but I couldn't think of how to get it to work properly:

(1) The optimizer's cost metric is somewhat bogus to begin with. ISTM that translating a cost of X into an expected runtime of Y msecs is definitely not trivial to do.

(2) The size of a node's result relation does not depend upon anything outside of PostgreSQL, whereas the exact time it takes to produce that result relation depends on a wide collection of external factors. For example, if the system is under heavy load, queries will take longer than normal to run. Or, if the query invokes a function that happens to occasionally block waiting on some resource, the execution time of the query could be wildly unpredictable.

(3) ISTM we couldn't produce a really helpful hint message, even if we somehow resolved #1 and #2.

-Neil
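The per-node estimated-versus-actual row count comparison under discussion is already visible today in EXPLAIN ANALYZE output. A sketch (query and numbers are invented for illustration):

```sql
EXPLAIN ANALYZE SELECT * FROM orders WHERE customer_id = 42;

-- An output line of the general form
--   Seq Scan on orders  (cost=0.00..25.00 rows=5 width=36)
--                       (actual time=0.02..3.10 rows=1200 loops=1)
-- pairs the planner's estimate (rows=5) with the executor's count
-- (rows=1200): exactly the per-node misestimate a hint could flag.
```

The proposal amounts to having the system notice that gap itself instead of relying on the DBA to read EXPLAIN ANALYZE output.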
Re: [HACKERS] pg_restore and create FK without verification check
On Wed, 2003-11-26 at 12:43, Andreas Pflug wrote:
> Greg Stark wrote:
>> If I could disable and re-enable the constraint, the danger that I would get the definition wrong would be eliminated. And if I had already done the work to ensure there were no broken relationships, I would optionally be able to skip the redundant automatic check. I could even have done the verification myself while the data wasn't live, for example.
>
> Since FKs are implemented as triggers, you could disable all triggers on the table right now, no? It could be a bit more comfortable, I agree, and I hope for an upcoming DISABLE TRIGGER command.

ISTM I've done this before... from a pg_dump -Fc backup, first do a schema-only restore (pg_restore -s) and then a data-only restore with pg_restore -a --disable-triggers, to load the data without checking foreign keys. There's certainly potential for trouble with this method, I suppose, but it seems like it accomplishes what the original poster requires...

Robert Treat
Re: [HACKERS] detecting poor query plans
Neil Conway writes:
> I was thinking about this, but I couldn't think of how to get it to work properly:
>
> (1) The optimizer's cost metric is somewhat bogus to begin with. ISTM that translating a cost of X into an expected runtime of Y msecs is definitely not trivial to do.

At least for all the possible plans of a given query at a specific point in time, the intention is that the cost be proportional to the execution time.

> the exact time it takes to produce that result relation depends on a wide collection of external factors.

That's a valid point. The ms/cost factor may not be constant over time. However, I think in the normal case this number will tend towards a fairly consistent value across queries and over time. It will be influenced somewhat by things like cache contention with other applications, though.

On further thought, the real problem is that these numbers are only available when running with EXPLAIN ANALYZE. As shown recently on one of the lists, the cost of the repeated gettimeofday calls can be substantial. It's not really feasible to suggest running all queries with that profiling.

-- greg
Re: [HACKERS] detecting poor query plans
Greg Stark writes:
> At least for all the possible plans of a given query at a specific point in time, the intention is that the cost be proportional to the execution time.

Why is this relevant? Given a cost X at a given point in time, the system needs to derive an expected runtime Y and compare Y with the actual runtime. I think that producing Y given an arbitrary X involves so many parameters as to be practically impossible for us to compute with any degree of accuracy.

> That's a valid point. The ms/cost factor may not be constant over time. However I think in the normal case this number will tend towards a fairly consistent value across queries and over time.

It might be true in the normal case, but that doesn't seem very helpful to me: in general, the mapping of plan costs to execution times can vary wildly over time. Spewing hints to the log whenever the system's workload changes, a checkpoint occurs, or the system's RAID array hiccups doesn't sound like a useful feature to me.

> On further thought the real problem is that these numbers are only available when running with explain on. As shown recently on one of the lists, the cost of the repeated gettimeofday calls can be substantial.

That sounds more like an implementation detail than "the real problem" to me -- I think this proposed feature has more fundamental issues.

-Neil
Re: [HACKERS] pg_restore and create FK without verification check
ow writes:
> --- Tom Lane wrote:
>> Quite honestly, I think they should check their foreign keys.
>
> Generally speaking, I agree. The problem is that verification of FK constraints may take too long, depending on the size of the db and other conditions. In my case, on test data, it takes about 1 hour to create tables and copy the data, then about 40 min to create indexes, then 4.5 hours to create one (1) FK constraint.

If you're seeing this on 7.4, I'd like to see the details of the exact commands being issued. If it's not 7.4, it's not a relevant measurement.

regards, tom lane
Re: [HACKERS] Limiting factors of pg_largeobject
Joshua D. Drake writes:
> pg_largeobject uses a loid identifier for the loid. What do we think it would take to move that identifier to something like bigint?

Breaking all the client-visible LO APIs, for one thing ...

> I don't really know the underlying internals of pg_largeobject, but it does seem that if we made it have:
> 1. a larger identifier
> 2. an identifier that is not typed to the underlying system (oid)
> 3. the ability to be indexed
> we may benefit. Am I on crack?

I don't see what you're getting at with #2 and #3 at all; OID is perfectly indexable. As for #1, it'd theoretically be useful, but I'm not sure about the real-world usefulness. If your large objects are even moderately large (for whatever value of "large" applies this week), you're not likely to be expecting to cram 4 billion of them into your database.

If we were doing LOs over for some reason it might be interesting to consider this, but I think they're largely a legacy feature at this point, and not worth that kind of effort. It would be better to put the development effort into creating serial-access capability for toasted fields, whereupon LOs would really be obsolete.

regards, tom lane
Re: [HACKERS] detecting poor query plans
Greg Stark writes:
> That's a valid point. The ms/cost factor may not be constant over time. However I think in the normal case this number will tend towards a fairly consistent value across queries and over time. It will be influenced somewhat by things like cache contention with other applications though.

I think it would be interesting to collect the numbers over a long period of time and try to learn something from the averages. The real hole in Neil's original suggestion was that it assumed that comparisons based on just a single query would be meaningful enough to pester the user about.

> On further thought the real problem is that these numbers are only available when running with explain on. As shown recently on one of the lists, the cost of the repeated gettimeofday calls can be substantial. It's not really feasible to suggest running all queries with that profiling.

Yeah. You could imagine a simplified-stats mode that only collects the total runtime (two gettimeofday's per query is nothing) and the row counts (which shouldn't be impossibly expensive either, especially if we merged the needed fields into PlanState instead of requiring a separately allocated node). Not sure if that's as useful, though.

regards, tom lane
Re: [HACKERS] Limiting factors of pg_largeobject
> Breaking all the client-visible LO APIs, for one thing ...

Erck.

>> 1. a larger identifier
>> 2. an identifier that is not typed to the underlying system (oid)
>> 3. the ability to be indexed
>> We may benefit. Am I on crack?
>
> I don't see what you're getting at with #2 and #3 at all. OID is perfectly indexable.

Well, number 2 is that we have a limit on the total number of OIDs, yes? Which means we could theoretically run out of OIDs because of pg_largeobject. The ability to be indexed is obviously there, but one problem we have is that you can't create an index on a system table -- at least not a user-level index. Are there system-level indexes that I am unaware of?

-- Command Prompt, Inc., home of Mammoth PostgreSQL
Re: [HACKERS] Limiting factors of pg_largeobject
Joshua D. Drake writes:
> The ability to be indexed is obviously there, but one problem we have is that you can't create an index on a system table -- at least not a user-level index. Are there system-level indexes that I am unaware of?

pg_largeobject already has an index (which is used by all the LO operations). Again, I don't see what the width of the object ID column has to do with it.

regards, tom lane
Re: [HACKERS] Limiting factors of pg_largeobject
> pg_largeobject already has an index (which is used by all the LO operations). Again, don't see what the width of the object ID column has to do with it.

I was more after not having an OID than the width of the ID column.

-- Command Prompt, Inc., home of Mammoth PostgreSQL
Re: [HACKERS] pg_restore and create FK without verification check
On Wed, 26 Nov 2003, Tom Lane wrote: ow [EMAIL PROTECTED] writes: --- Tom Lane [EMAIL PROTECTED] wrote: Quite honestly, I think they should check their foreign keys. Generally speaking, I agree. The problem is that verification of FK constraint(s) may take too long, depending on the size of the db and other conditions. In my case, on test data, it takes about 1 hour to create tables and copy the data, then about 40 min to create indexes, then 4.5 hours to create one (1) FK constraint. If you're seeing this on 7.4, I'd like to see the details of the exact commands being issued. If it's not 7.4, it's not relevant. IIRC, he was. I think the thing causing the difference between his times and the ones we saw typically when doing the tests was that he didn't have an index on the fktable's referencing column.
Re: [HACKERS] pg_restore and create FK without verification check
Stephan Szabo wrote: IIRC, he was. I think the thing causing the difference between his times and the ones we saw typically when doing the tests was that he didn't have an index on the fktable's referencing column. A common mistake; I can't count how often I've created this one... And it's not easy to find, because EXPLAIN won't explain triggers. I'm planning to create some kind of FK index wizard in pgAdmin3, which finds out about FKs using columns that aren't covered by an appropriate index. Maybe this check could be performed (as a NOTICE) when the FK is created? Regards, Andreas
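The check Andreas describes can be approximated today with a catalog query. The sketch below is an untested approximation against a reasonably modern pg_catalog layout (pg_constraint.conkey and pg_index.indkey are real columns; the matching logic is simplified and ignores partial and expression indexes). It lists FK constraints whose referencing columns are not the leading columns of any index on the referencing table:

```sql
-- Sketch: foreign keys whose referencing columns are not covered by the
-- leading columns of any index on the referencing table.
SELECT c.conrelid::regclass AS fk_table,
       c.conname            AS fk_constraint
FROM pg_constraint c
WHERE c.contype = 'f'
  AND NOT EXISTS (
        SELECT 1
        FROM pg_index i
        WHERE i.indrelid = c.conrelid
          -- conkey: FK's referencing columns; indkey: indexed columns.
          -- Require the leading indexed columns to contain all FK columns.
          AND (string_to_array(i.indkey::text, ' ')::smallint[])
                [1:array_length(c.conkey, 1)] @> c.conkey
      );
```

A tool like the proposed wizard would presumably refine this (matching column order for multicolumn keys, skipping tiny tables), but even the rough version catches the common single-column case.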
Re: [HACKERS] detecting poor query plans
On further thought the real problem is that these numbers are only available when running with explain on. As shown recently on one of the lists, the cost of the repeated gettimeofday calls can be substantial. It's not really feasible to suggest running all queries with that profiling. Yeah. You could imagine a simplified-stats mode that only collects the total runtime (two gettimeofday's per query is nothing) and the row counts (shouldn't be impossibly expensive either, especially if we merged the needed fields into PlanState instead of requiring a separately allocated node). Not sure if that's as useful though. How about a PGC_POSTMASTER GUC variable which tells postgres to collect details on the planner's performance and comparison to actual run times. Optionally, we could also have the executor run some/all of the possible plans (presumably only useful for SELECTs) and keep details on the performance of each. At postmaster shutdown (some other time?) a report could be produced profiling all queries. The reason I suggest this is it would have zero impact on production databases but would allow DBAs to profile their databases with real usage patterns in development environments. Gavin
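Until such a profiling mode exists, the estimate-vs-actual comparison being discussed can be done by hand, since EXPLAIN ANALYZE prints both numbers for every plan node. A minimal sketch (the table and column names are hypothetical):

```sql
-- Each plan node shows the planner's estimate (rows=...) next to the
-- measured count (actual ... rows=...); a large ratio between the two
-- is the "poor plan" signal this thread is about.
EXPLAIN ANALYZE
SELECT * FROM orders WHERE customer_id = 42;

-- Refreshing the statistics usually brings the estimates back in line:
ANALYZE orders;
```

The drawback, as noted above, is exactly the gettimeofday overhead: this only works when you deliberately run the query under EXPLAIN ANALYZE, not continuously in production.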
Re: [HACKERS] Limiting factors of pg_largeobject
Joshua D. Drake [EMAIL PROTECTED] writes: I was more concerned about not having an OID than about the width of the ID column. We're still at cross-purposes then. pg_largeobject doesn't have OIDs (in the sense of per-row OIDs). What I thought you were complaining about was the chosen datatype of the LO identifier column (loid), which happens to be OID. regards, tom lane
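For reference, the index Tom mentions can be inspected directly from the catalogs; a quick sketch (should work on any reasonably recent server) is:

```sql
-- Show the index backing large-object lookups on pg_largeobject
-- (it covers the identifier and page number columns):
SELECT i.indexrelid::regclass          AS index_name,
       pg_get_indexdef(i.indexrelid)   AS definition
FROM pg_index i
WHERE i.indrelid = 'pg_largeobject'::regclass;
```

This is just a way to confirm that every LO read and write is already index-assisted, independent of the loid datatype question.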
Re: [HACKERS] pg_restore and create FK without verification check
Stephan Szabo [EMAIL PROTECTED] writes: On Wed, 26 Nov 2003, Tom Lane wrote: If you're seeing this on 7.4, I'd like to see the details of the exact commands being issued. If it's not 7.4, it's not relevant. IIRC, he was. I think the thing causing the difference between his times and the ones we saw typically when doing the tests was that he didn't have an index on the fktable's referencing column. I'm not convinced it should matter all that much, for the single-query test method that 7.4 uses. That's why I wanted to see details. regards, tom lane
Re: [HACKERS] pg_restore and create FK without verification check
--- Andreas Pflug [EMAIL PROTECTED] wrote: Stephan Szabo wrote: IIRC, he was. I think the thing causing the difference between his times and the ones we saw typically when doing the tests was that he didn't have an index on the fktable's referencing column. A common mistake, can't count how often I created this one... Wrong. It's a mistake to think that you always HAVE to have an index on the FK column. See the links below for more details: http://archives.postgresql.org/pgsql-admin/2003-11/msg00317.php http://archives.postgresql.org/pgsql-admin/2003-11/msg00319.php
Re: [HACKERS] Limiting factors of pg_largeobject
We're still at cross-purposes then. pg_largeobject doesn't have OIDs (in the sense of per-row OIDs). What I thought you were complaining about was the chosen datatype of the LO identifier column (loid), which happens to be OID. O.k., that was my main concern, which per your statement is unfounded ;). If I don't have to worry about it, then that solves the issue :)
Re: [HACKERS] pg_restore and create FK without verification check
A common mistake; I can't count how often I've created this one... And it's not easy to find, because EXPLAIN won't explain triggers. I'm planning to create some kind of FK index wizard in pgAdmin3, which finds out about FKs using columns that aren't covered by an appropriate index. Maybe this check could be performed (as a NOTICE) when the FK is created? Weird - I'm planning the exact same thing for phpPgAdmin! Great minds think alike :P Chris
[HACKERS] 7.5 Plans
Hi everyone, I'm just interested in what everyone's personal plans for 7.5 development are? Shridhar, Gavin and myself are trying to get the tablespaces stuff off the ground. Hopefully we'll have a CVS set up for us to work in at some point (we didn't think getting a branch and commit privs was likely). Changing all our $Id$ tags to $Postgres$ would make a separate CVS a lot easier, hint hint :) What's everyone else wanting to work on? Chris
Re: [HACKERS] pg_restore and create FK without verification check
Tom Lane wrote: Andreas Pflug [EMAIL PROTECTED] writes: In principle you could do this today, but we don't have enough support code in place to make it work smoothly, e.g. WAL segment files aren't labeled with enough identifying information to let you manage an archive full of 'em. Still, it doesn't seem that far away. So I issue CHECKPOINT, and tar the cluster or database. Still, I have two questions: - how to restore a single database You don't. As I said, any physical backup is going to be all-or-nothing. These techniques are not a replacement for pg_dump. But this is just an artifact of the fact that the WAL is a single instance-wide entity, rather than a per-database entity. But since databases are completely separate entities that cannot be simultaneously accessed by any query (corrections welcome), there isn't any reason in principle that the WAL files cannot also be created on a per-database basis. I'm sure, of course, that doing so would bring with it a new set of problems and tradeoffs, so it might not be worth it... -- Kevin Brown [EMAIL PROTECTED]
Re: [HACKERS] detecting poor query plans
Gavin Sherry wrote: On further thought the real problem is that these numbers are only available when running with explain on. As shown recently on one of the lists, the cost of the repeated gettimeofday calls can be substantial. It's not really feasible to suggest running all queries with that profiling. Yeah. You could imagine a simplified-stats mode that only collects the total runtime (two gettimeofday's per query is nothing) and the row counts (shouldn't be impossibly expensive either, especially if we merged the needed fields into PlanState instead of requiring a separately allocated node). Not sure if that's as useful though. How about a PGC_POSTMASTER GUC variable which tells postgres to collect details on the planner's performance and comparison to actual run times. Optionally, we could also have the executor run some/all of the possible plans (presumably only useful for SELECTs) and keep details on the performance of each. At postmaster shutdown (some other time?) a report could be produced profiling all queries. The reason I suggest this is it would have zero impact on production databases but would allow DBAs to profile their databases with real usage patterns in development environments. Great idea. Under ideal situations, it shouldn't be needed, but things are often less than ideal. -- Bruce Momjian | http://candle.pha.pa.us [EMAIL PROTECTED] | (610) 359-1001 + If your life is a hard drive, | 13 Roberts Road + Christ can be your backup. | Newtown Square, Pennsylvania 19073
Re: [HACKERS] pg_restore and create FK without verification check
On Thu, Nov 27, 2003 at 12:40:28AM +0100, Andreas Pflug wrote: A common mistake, can't count how often I created this one... And not easy to find, because EXPLAIN won't explain triggers. That's a pity. And the lack of EXPLAINing function execution, too. Maybe it's not that hard to do? -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) Just treat us the way you want to be treated + some extra allowance for ignorance. (Michael Brusser)
Re: [HACKERS] 7.5 Plans
On Thu, Nov 27, 2003 at 10:27:22AM +0800, Christopher Kings-Lynne wrote: What's everyone else wanting to work on? I want to get the nested transaction patch thingie sorted out. I feel it's not that far away. After that, maybe:

- try using a pg_shareddepend shared catalog to check user dependencies
- get VACUUM FULL to REINDEX instead of munging individual index tuples
- try to come up with something to enable aggregate optimization by using an aggregate-specific function to check the plan tree

Anyway, if I get nested xacts into 7.5 I'll be more than happy ... (I will probably be doing lots of translation work too, or maybe enable someone else to do it ...) -- Alvaro Herrera (alvherre[a]dcc.uchile.cl) God made Adam, but it was Eve who made him a man.
Re: [HACKERS] pg_restore and create FK without verification check
Kevin Brown [EMAIL PROTECTED] writes: Tom Lane wrote: You don't. As I said, any physical backup is going to be all-or-nothing. These techniques are not a replacement for pg_dump. But this is just an artifact of the fact that the WAL is a single instance-wide entity, rather than a per-database entity. But since databases are completely separate entities that cannot be simultaneously accessed by any query (corrections welcome), there isn't any reason in principle that the WAL files cannot also be created on a per-database basis. WAL is not the bottleneck ... as I already mentioned today, pg_clog (and more specifically the meaning of transaction IDs) is what really makes a cluster an indivisible whole at the physical level. If you want to do separate physical dumps/restores, the answer is to set up separate clusters (separate postmasters). Not so hard, is it? regards, tom lane
Re: [HACKERS] pg_restore and create FK without verification check
Alvaro Herrera [EMAIL PROTECTED] writes: That's a pity. And the lack of EXPLAINing function execution, too. Maybe it's not that hard to do? Not sure if it's hard or not, but it'd sure be a nice thing to have. regards, tom lane
Re: [HACKERS] pg_restore and create FK without verification check
Tom Lane wrote: Kevin Brown [EMAIL PROTECTED] writes: Tom Lane wrote: You don't. As I said, any physical backup is going to be all-or-nothing. These techniques are not a replacement for pg_dump. But this is just an artifact of the fact that the WAL is a single instance-wide entity, rather than a per-database entity. But since databases are completely separate entities that cannot be simultaneously accessed by any query (corrections welcome), there isn't any reason in principle that the WAL files cannot also be created on a per-database basis. WAL is not the bottleneck ... as I already mentioned today, pg_clog (and more specifically the meaning of transaction IDs) is what really makes a cluster an indivisible whole at the physical level. If you want to do separate physical dumps/restores, the answer is to set up separate clusters (separate postmasters). Not so hard, is it? Well, aside from the fact that separate clusters have completely separate user databases, listen on different ports, will compete with other clusters on the same system for resources that would be better managed by a single cluster, and generally have to be maintained as completely separate entities from start to finish, no, it's not that hard. ;-) The ability to restore a single large database quickly is, I think, a reasonable request; it's just that right now it's difficult (perhaps impossible) to satisfy. It's probably something we'll have to deal with if we want PG to be useful to people managing really large databases on really, really big iron, though. -- Kevin Brown [EMAIL PROTECTED]
Re: [HACKERS] 7.5 Plans
On Thu, 27 Nov 2003, Christopher Kings-Lynne wrote: Changing all our $Id$ tags to $Postgres$ would make a separate CVS a lot easier, hint hint :) Why? From the info pages:

`$Header$' A standard header containing the full pathname of the RCS file, the revision number, the date (UTC), the author, the state, and the locker (if locked). Files will normally never be locked when you use CVS.

`$Id$' Same as `$Header$', except that the RCS filename is without a path.

So, what does changing it to $PostgreSQL$ do? Or am I reading the wrong part of the manual? Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664
Re: [HACKERS] 7.5 Plans
So, what does changing it to $PostgreSQL$ do? Or am I reading the wrong part of the manual? After applying the patch in -patches to CVSROOT and running the update script... It will allow Chris and others to import the PostgreSQL source into their own CVS tree without having to do a ton of diff editing to reimport/export changes between the primary tree and their development tree. Forking the main PostgreSQL tree 30 times for the various groups to do development is neither appropriate nor necessary when we can just change the ID that PostgreSQL uses to something different. The BSDs went through similar measures to ensure they could maintain multiple CVS sources without diff/patch going nuts.
Re: [HACKERS] 7.5 Plans
On Wed, 26 Nov 2003, Rod Taylor wrote: So, what does changing it to $PostgreSQL$ do? Or am I reading the wrong part of the manual? After applying the patch in -patches to CVSROOT and running the update script... It will allow Chris and others to import the PostgreSQL source into their own CVS tree without having to do a ton of diff editing to reimport/export changes between the primary tree and their development tree. 'k, but why can't that be accomplished with $Id$? Are there any caveats to the change? Ie. if Tom has a checkout and this change is made, will he have problems committing?
Re: [HACKERS] 7.5 Plans
Rod Taylor [EMAIL PROTECTED] writes: So, what does changing it to $PostgreSQL$ do? Or am I reading the wrong part of the manual? The BSDs went through similar measures to ensure they could maintain multiple CVS sources without diff/patch going nuts. Yeah, I have gotten similar requests from Red Hat's internal development group. $PostgreSQL$ doesn't do anything for *us* ... but it makes life easier for other people trying to import PG sources into their own CVS trees. In light of the Grand Scheme for World Domination, that seems like a worthwhile improvement to me ;-) regards, tom lane
Re: [HACKERS] 7.5 Plans
Marc G. Fournier [EMAIL PROTECTED] writes: 'k, but why can't that be accomplished with $Id$? $Id$ isn't much better than $Header$ --- the point is to avoid keywords that downstream people's CVS repositories will want to replace. Are there any caveats to the change? Ie. if Tom has a checkout and this change is made, will he have problems committing? Making the change will mean a big cvs update pass for everyone, but no worse than what happens every time Bruce hits the copyright dates, for example. It's a one-line change in every file AFAICS. I'd recommend giving committers a day or two's notice of the edit, same as Bruce typically does for pgindent runs, but there's no reason to consider it any worse than that. regards, tom lane
Re: [HACKERS] 7.5 Plans
On Wed, 2003-11-26 at 23:32, Marc G. Fournier wrote: On Wed, 26 Nov 2003, Rod Taylor wrote: So, what does changing it to $PostgreSQL$ do? Or am I reading the wrong part of the manual? After applying the patch in -patches to CVSROOT and running the update script... It will allow Chris and others to import the PostgreSQL source into their own CVS tree without having to do a ton of diff editing to reimport/export changes between the primary tree and their development tree. 'k, but why can't that be accomplished with $Id$? When you import the files into the other CVS system, the version and file information $Id$ represents will be replaced by the other system. So, when you diff against the original (primary repository), EVERY file will show that $Id$ has changed. Are there any caveats to the change? Ie. if Tom has a checkout and this change is made, will he have problems committing? $VARIABLE$ is used for informational purposes only AFAIK. There would be no change that I know of. At work I use $InQuent$; FreeBSD of course uses $FreeBSD$, and the other BSDs follow suit with $NetBSD$ and $OpenBSD$. I've not heard of any issues with 3rd-party CVS clients and any of those repositories in this regard.
[HACKERS] $Id$ - $PostgreSQL$ Change
Based on discussions on -hackers, and barring any objections between now and then, I'm going to go through all files in CVS and change: $Id$ - $PostgreSQL$ I will do this the evening of Friday, November 29th ... Marc G. Fournier Hub.Org Networking Services (http://www.hub.org) Email: [EMAIL PROTECTED] Yahoo!: yscrappy ICQ: 7615664
Re: [HACKERS] pg_restore and create FK without verification check
--- Alvaro Herrera [EMAIL PROTECTED] wrote: On Thu, Nov 27, 2003 at 12:40:28AM +0100, Andreas Pflug wrote: A common mistake, can't count how often I created this one... And not easy to find, because EXPLAIN won't explain triggers. That's a pity. And the lack of EXPLAINing function execution, too. Maybe it's not that hard to do? I'd like to emphasize again that NOT having an index on the FK column is a perfectly valid approach, despite some opinions to the contrary. In fact, requiring an index on FK column(s) when it is not required by the application's logic IS a mistake, since it slows down inserts/deletes/updates/vacuum/reindex/etc. on the respective table and wastes disk space (which could be a considerable amount on large tables). Also, an FK column index DOES NOT, in general, solve performance issues with the FK verification check. Someone may (and, I'm sure, will) simply have more data or more constraints. The only solution here appears to be the --disable-triggers option as it was suggested by Robert Treat. If it works then I'm fine; somehow I did not see that option in the beginning. Thanks
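For context, the --disable-triggers route works by suppressing the RI enforcement triggers around the data load rather than by making the constraint check cheaper. On servers that support ALTER TABLE ... DISABLE TRIGGER, roughly the same effect can be sketched by hand (the table name and file path here are hypothetical; this needs superuser privileges, and note the caveat in the comments):

```sql
-- Sketch of what a triggers-disabled load amounts to.
-- WARNING: rows loaded this way are never verified against the FK;
-- re-enabling the triggers does NOT re-check existing rows.
BEGIN;
ALTER TABLE orders DISABLE TRIGGER ALL;   -- includes RI enforcement triggers
COPY orders FROM '/path/to/orders.copy';  -- bulk load, no per-row FK checks
ALTER TABLE orders ENABLE TRIGGER ALL;
COMMIT;
```

This is exactly the trade-off debated in this thread: the restore is fast because the verification is skipped, so it is only safe when the dump is known to be internally consistent.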
Re: [HACKERS] 7.5 Plans
Christopher Kings-Lynne wrote: Hi everyone, I'm just interested in what everyone's personal plans for 7.5 development are? Shridhar, Gavin and myself are trying to get the tablespaces stuff off the ground. Hopefully we'll have a CVS set up for us to work in at some point (we didn't think getting a branch and commit privs was likely). Changing all our $Id$ tags to $Postgres$ would make a separate CVS a lot easier, hint hint :) What's everyone else wanting to work on? By the time the tablespaces CVS goes live, I plan to study PostgreSQL buffer management and push mmap (see other thread). Hopefully my home internet connection will be up soon so that I can work on weekends. Shridhar
Re: [HACKERS] 7.5 Plans
Rod Taylor [EMAIL PROTECTED] writes: On Wed, 2003-11-26 at 23:32, Marc G. Fournier wrote: 'k, but why can't that be accomplished with $Id$? When you import the files into the other CVS system the version and file information $Id$ represents will be replaced by the other system. So, when you diff the original (primary repository) EVERY file will show $Id$ has changed. Right. I can write a long sob story about the pain this has caused within Red Hat ... likely other people can say the same ... but the executive summary is:

1. You can't easily generate a clean diff of your local version against the original imported from postgresql.org. The changes you actually made get buried in a mass of useless $Foo$ diff lines. Stripping those out is possible in theory but painful.

2. You can't easily see which postgresql.org original version your local copy was actually derived from, because guess what, your local CVS server is going out of its way to hide the evidence.

These effects not only cause pain for the downstream users, but interfere with them contributing their changes back upstream (by making it many orders of magnitude harder to generate a clean diff to send us). So we should fix it. regards, tom lane
Re: [HACKERS] Providing anonymous mmap as an option of sharing memory
Tom Lane wrote: Shridhar Daithankar [EMAIL PROTECTED] writes: I covered only the first point in my post. IMO it is not such an unsolvable problem. If a postmaster crashes hard but leaves a backend running, would it clean the pid file etc.? I don't think so. So if a postmaster can start in a 'pid-clean' state, then it is guaranteed that there are no children left around. And that helps how? The problem is to detect whether there are any children left from the old postmaster, when what you have to work from is the pid-file it left behind. Fine. We need shared memory for that. How about using one 8K page just for detecting that? We don't need to base the shared memory model on that, right? Maybe we can put clog in a shared memory segment which would serve as a process counter, and move shared buffers to mmap? In any case, you're still handwaving away the very real portability issues around mmap. Linux is not the universe, and Linux+BSD isn't either. From the machines I can access here, the following have anon and shared mmap:

[ost] ~ uname -a
SunOS host 5.8 Generic_108528-21 sun4u sparc SUNW,Sun-Fire-880 (Solaris)
[host] ~ uname -a
AIX host 1 5 0001A5CA4C00
[/home/user] uname -a
HP-UX host B.11.00 A 9000/785 2005950738 two-user license

Is that enough support? We might still have considered it, despite the negatives, if anyone had been able to point to any actual *advantages* of mmap. There are none. Yes, the SysV shmem API is old and ugly and crufty, but it does what we need it to do.

1) Per-database buffers. Postgresql does not perform well with a large number of buffers. Say an installation is configured for 100K buffers and has 5 databases. Now what would happen if each of these databases got its own 100K buffers? mmap can not expand shared memory without a server restart; the current implementation of shared memory behaves the same way. Rather than moving it to use shared memory as and when required, we could push per-database buffers to improve scalability. I am thinking of this:

1. Introduce parameter columns in pg_database, for shared memory size (to start with) and the number of live connections to that database. Maybe a callback to the daemon postmaster to reread the configuration, if possible (in shared memory, maybe?).
2. Provide start and stop server commands which essentially either let a connection happen or not.

Now if somebody modifies the buffer parameters for a database (say, via ALTER DATABASE), all it has to do is disconnect and reconnect. If this is the first connection to the database, the parent postmaster should reread the per-database parameters and enforce them. The same can happen with the start/stop commands.

2) No more kernel mucking required. Recent Linux installations provide a sane enough default for SHMMAX, but I am sure plenty of folks would be glad to see that dependency go. I also want to talk about mmap for file IO, but not in this thread. Shridhar
Re: [HACKERS] 7.5 Plans
Tom Lane wrote: 1. You can't easily generate a clean diff of your local version against the original imported from postgresql.org. The changes you actually made get buried in a mass of useless $Foo$ diff lines. Stripping those out is possible in theory but painful. Is that the reason linux does not use CVS? I thought so at least. Shridhar ---(end of broadcast)--- TIP 7: don't forget to increase your free space map settings