Re: [HACKERS] Vista/IPv6
On Wed, Apr 11, 2007 at 11:06:12PM -0600, Warren Turkal wrote:
> On Wednesday 11 April 2007 12:24, Andrew Dunstan wrote:
> > If we could use configure for MSVC this would have Just Happened (tm).
> > I wonder how many other little bits we miss out on? CMake anyone?
>
> IIRC, cmake is a replacement for make, not for configure. Or did I miss
> something?

And it would require rewriting all the unix makefiles in cmake format, and it was one of the major requirements for this undertaking that this would not be required.

//Magnus

---(end of broadcast)---
TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Vista/IPv6
On Thu, Apr 12, 2007 at 12:24:58AM +0200, Peter Eisentraut wrote:
> Magnus Hagander wrote:
> > (FWIW, I had ipv6 on my list of things to make happen, but I didn't
> > realise it would cause this issue on a machine with ipv6 on it, since
> > I don't have one)
>
> The IPv6 support is finely tuned to deal with all kinds of combinations
> of API support, library support, and kernel support, or lack thereof,
> on the build system and on the host system. I suggest that you just use
> the logic that we have or emulate it, respectively. If it doesn't work,
> we will find out soon enough.

The point is still what happens when you distribute a binary built on a system with ipv6 to a system that doesn't have it.

Anyway. It seems reasonably safe, since I can build with ipv6 on my machine without the ipv6 driver, and it still works. Dave will have to test if it actually works on a machine that *has* ipv6 :-)

So. Changes applied.

//Magnus
[HACKERS] A Survey on Defect Management Practices in Free/Open Source Software
Dear PostgreSQL Contributors,

I seek help from designers, developers, testers, defect fixers, project managers, and anyone playing any other key role in Free/Open Source software development or maintenance, in carrying out a study on the practices and problems of defect management in various Free/Open Source Software projects. The insights gained from the study can further help us to extract publicly accessible defect data and determine the impact of defect management practices on software quality.

Please spend a few minutes of your precious time to fill out the Questionnaire. Most of the questions are multiple choice and are quite easy to answer. To access the Online Questionnaire, please visit: http://anu.puchd.ac.in/phpESP/public/survey.php?name=FOSS_Defect_Survey

I hope you will find all the questions interesting and thought-provoking. Your answers will be kept anonymous. The data thus collected will only be used for research purposes. It would be nice if you could also forward this mail to others actively engaged in Free/Open Source Software development. If you have any queries or suggestions, feel free to contact me.

Thank You

With regards,
Anu Gupta
Senior Lecturer
Department of Computer Science and Applications, Panjab University, Chandigarh. INDIA

In case of any problem in accessing/using the above mentioned link please contact:
E-mail: [EMAIL PROTECTED] [EMAIL PROTECTED]
Re: [HACKERS] Vista/IPv6
Magnus Hagander wrote:
> On Thu, Apr 12, 2007 at 12:24:58AM +0200, Peter Eisentraut wrote:
> > Magnus Hagander wrote:
> > > (FWIW, I had ipv6 on my list of things to make happen, but I didn't
> > > realise it would cause this issue on a machine with ipv6 on it,
> > > since I don't have one)
> >
> > The IPv6 support is finely tuned to deal with all kinds of
> > combinations of API support, library support, and kernel support, or
> > lack thereof, on the build system and on the host system. I suggest
> > that you just use the logic that we have or emulate it, respectively.
> > If it doesn't work, we will find out soon enough.
>
> The point is still what happens when you distribute a binary built on a
> system with ipv6 to a system that doesn't have it.
>
> Anyway. It seems reasonably safe, since I can build with ipv6 on my
> machine without the ipv6 driver, and it still works. Dave will have to
> test if it actually works on a machine that *has* ipv6 :-)
>
> So. Changes applied.

Thanks - I can test a Vista build on a non-ipv6 machine when I'm back in the office - possibly tomorrow, certainly Monday.

Regards, Dave
Re: [HACKERS] Last chance to object to MVCC-safe CLUSTER
On Sat, 2007-04-07 at 18:09, Tom Lane wrote:
> Awhile back Csaba Nagy [EMAIL PROTECTED] wrote:
> > Making cluster MVCC-safe will kill my back-door of clustering a hot
> > table while I run a full DB backup.
>
> Are we agreed that the TRUNCATE-based workaround shown here
> http://archives.postgresql.org/pgsql-hackers/2007-03/msg00606.php
> is an adequate response to this objection?

That workaround should actually work. It is more work, but the desired goal is achieved.

Cheers, Csaba.
Re: [HACKERS] Vista/IPv6
Hi.

From: Dave Page [EMAIL PROTECTED]
> So. Changes applied.

Umm, I think this also needs a corresponding change; it seems to have been left out. I remember it:

src/include/pg_config.h.win32

  /* Define to 1 if you have support for IPv6. */
  // #define HAVE_IPV6 1

What do you think?

Regards, Hiroshi Saito

> Thanks - I can test a Vista build on a non-ipv6 machine when I'm back
> in the office - possibly tomorrow, certainly Monday.
>
> Regards, Dave
Re: [HACKERS] Vista/IPv6
On Thursday, 12 April 2007 09:04, Magnus Hagander wrote:
> The point is still what happens when you distribute a binary built on a
> system with ipv6 to a system that doesn't have it.

I think the problem is that you appear to have an ambiguous and overly coarse definition of "a system with ipv6".

-- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Vista/IPv6
On Thursday, 12 April 2007 08:56, Magnus Hagander wrote:
> IIRC, cmake is a replacement for make, not for configure. Or did I miss
> something?

CMake might be considered a replacement for Automake.

-- Peter Eisentraut http://developer.postgresql.org/~petere/
Re: [HACKERS] Eliminating unnecessary left joins
Jim,

Maybe odd, but simpler to optimize this way. Your idea would also be a very good optimization; there was already a discussion about that here: http://archives.postgresql.org/pgsql-performance/2006-01/msg00151.php, but at that time Tom rejected it because it was too expensive and rare. Maybe now he has a different opinion. However, the left join optimization is a lot simpler and cheaper, and can be useful not only for O/R mappers, but also for efficient vertical partitioning, as Simon mentioned.

Best regards,
Otto

2007/4/12, Jim Nasby [EMAIL PROTECTED]:
> I agree with others that the way that query is constructed is a bit
> odd, but it does bring another optimization to mind: when doing an
> inner join between a parent and child table when RI is defined between
> them, if the query only refers to the child table you can drop the
> parent table from the join, because each row in the child table must
> have one and only one row in the parent.
>
> Use-case: I'll often use views to make it easier to query several
> related tables, but not all queries against the views need to hit
> every table. IE: if a table has several status fields that have RI to
> parent tables that describe what each status is, you sometimes will
> query for the status description, sometimes not.
>
> I suspect that checking to see if tables have the right unique keys or
> RI would add a noticeable amount of extra work to query planning, so
> we might want a GUC to disable it.
>
> On Apr 7, 2007, at 12:45 PM, Ottó Havasvölgyi wrote:
> > Sorry, I have left out the PK requirement. What Nicolas wrote is
> > right, I also use an O/R mapper and inheritance is solved with
> > vertical partitioning. The tables are connected to each other with
> > the PK. And the mapper defines views for each class with left joins.
> > The mapper generates queries based on these views. A high fraction
> > of the joins would be eliminated in almost every query.
> >
> > My simple example. Class hierarchy and fields:
> >
> >   Shape (ID, X, Y)
> >   |
> >   +- Circle (ID, Radius)
> >   |
> >   +- Rectangle (ID, Width, Height)
> >
> > The mapper creates 3 tables with the columns next to the class name.
> > And it creates 3 views. One of them:
> >
> >   RectangleView:
> >   SELECT r.ID as ID, s.X as X, s.Y as Y, r.Width as Width,
> >          r.Height as Height
> >   FROM Rectangle r LEFT JOIN Shape s ON (r.ID = s.ID)
> >
> > Now if I query Rectangle object IDs whose Width is greater than 5,
> > it will generate this:
> >
> >   SELECT ID FROM RectangleView WHERE Width > 5
> >
> > In this case I don't need to left join the Shape table, because the
> > X and Y columns are not used.
> >
> > The other typical situation is when I execute more complex,
> > not-O/R-mapper-generated SQL commands based on these views for
> > reporting. For example the average width of rectangles whose height
> > is greater than 10. This optimization should also be applied to
> > subqueries.
> >
> > Is this optimization relatively easy to introduce? I would gladly
> > work on this, but unfortunately I don't know the codebase at all. I
> > would really appreciate it if someone competent implemented this
> > feature in 8.4.
> >
> > Thank you in advance,
> > Otto
>
> --
> Jim Nasby [EMAIL PROTECTED]
> EnterpriseDB http://enterprisedb.com
> 512.569.9461 (cell)
Re: [HACKERS] Eliminating unnecessary left joins
> Maybe odd, but simpler to optimize this way. Your idea would also be a
> very good optimization; there was already a discussion about that here:
> http://archives.postgresql.org/pgsql-performance/2006-01/msg00151.php,
> but at that time Tom rejected it because it was too expensive and rare.
> Maybe now he has a different opinion. However, the left join
> optimization is a lot simpler and cheaper, and can be useful not only
> for O/R mappers, but also for efficient vertical partitioning, as Simon
> mentioned.

For the views use case there is a simple solution without the expensive optimization: if you have a PK-FK relationship, simply rewrite the view to use a left join instead of an inner join. Since there is always exactly one row on the outer (PK) side, it makes no difference to the result set. And then the left join optimization can be used.

Andreas
Re: [HACKERS] Vista/IPv6
On Thu, Apr 12, 2007 at 10:58:26AM +0200, Peter Eisentraut wrote:
> On Thursday, 12 April 2007 09:04, Magnus Hagander wrote:
> > The point is still what happens when you distribute a binary built on
> > a system with ipv6 to a system that doesn't have it.
>
> I think the problem is that you appear to have an ambiguous and overly
> coarse definition of "a system with ipv6".

Possibly :-) I just want to make sure it doesn't break for our end users, no matter what the definition is.

//Magnus
Re: [HACKERS] Vista/IPv6
On Thu, Apr 12, 2007 at 05:14:06PM +0900, Hiroshi Saito wrote:
> Umm, I think this also needs a corresponding change; it seems to have
> been left out. I remember it:
>
> src/include/pg_config.h.win32
>
>   /* Define to 1 if you have support for IPv6. */
>   // #define HAVE_IPV6 1
>
> What do you think?

It's defined in the msvc build script, see http://archives.postgresql.org/pgsql-committers/2007-04/msg00148.php

//Magnus
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
> I don't fully understand what "transaction log" means. If it means
> archived WAL, the current (8.2) code handles WAL as follows:

Probably we can define "transaction log" to be the part of WAL that is not full pages.

> 1) If full_page_writes=off, then no full page writes will be written to
> WAL, except for those during online backup (between pg_start_backup and
> pg_stop_backup). The WAL size will be considerably smaller, but it
> cannot recover from a partial/inconsistent write to the database files.
> We have to go back to the online backup and apply all the archived log.
>
> 2) If full_page_writes=on, then full page writes will be written at the
> first update of a page after each checkpoint, plus the full page writes
> from 1).
>
> Because we have no means (in 8.2) to optimize the WAL so far, what we
> can do is to copy WAL or gzip it at archive time. If we'd like to keep
> a good chance of recovery after a crash, 8.2 provides only method 2),
> leaving the archived log size considerably large.
>
> My proposal maintains the chance of crash recovery the same as in the
> case of full_page_writes=on and reduces the size of the archived log as
> in the case of full_page_writes=off.

Yup, this is a good summary.

You say you need to remove the optimization that avoids the logging of a new tuple because the full page image exists. I think we must already have the info in WAL about which tuple inside the full page image is new (the one for which we avoided the WAL entry). How about this: leave the current WAL as it is and only add the "not removable" flag to full pages. pg_compresslog then replaces the full page image with a record for the one tuple that is changed. I tend to think it is not worth the increased complexity only to save bytes in the uncompressed WAL, though.

Another point, about pg_decompresslog: why do you need a pg_decompresslog at all? Imho pg_compresslog should already do the replacing of the full page with the dummy entry. Then pg_decompresslog could be a simple gunzip, or whatever compression was used, but no logic.

Andreas
Re: [HACKERS] Vista/IPv6
> > src/include/pg_config.h.win32
> >
> >   /* Define to 1 if you have support for IPv6. */
> >   // #define HAVE_IPV6 1
> >
> > What do you think?
>
> It's defined in the msvc build script, see
> http://archives.postgresql.org/pgsql-committers/2007-04/msg00148.php

I mean win32.mak, which is outside of the msvc tool.

Regards, Hiroshi Saito
Re: [HACKERS] elog(FATAL) vs shared memory
Tom Lane wrote:
> (2) if a SIGTERM happens to arrive while btbulkdelete is running, the
> next CHECK_FOR_INTERRUPTS will do elog(FATAL), causing elog.c to do
> proc_exit(0), leaving the vacuum still recorded as active in the shared
> memory array maintained by _bt_start_vacuum/_bt_end_vacuum. The PG_TRY
> block in btbulkdelete doesn't get a chance to clean up.

I skimmed through all users of PG_TRY/CATCH in the backend to check if there are other problems like that looming. There's one that looks dangerous in pg_start_backup() in xlog.c: the forcePageWrites flag in shared memory is cleared in a PG_CATCH block. It's not as severe, though, as it can be cleared manually by calling pg_stop_backup(), and only leads to degraded performance.

> (3) eventually, either we try to re-vacuum the same index or
> accumulation of bogus active entries overflows the array. Either way,
> _bt_start_vacuum throws an error, which btbulkdelete PG_CATCHes,
> leading to _bt_end_vacuum trying to re-acquire the LWLock already taken
> by _bt_start_vacuum, meaning that the process hangs up. And then so
> does anything else that needs to take that LWLock...

I also looked for other occurrences of point (3), but couldn't find any, so I guess we're now safe from it.

> Point (3) is already fixed in CVS, but point (2) is a lot nastier. What
> it essentially says is that trying to clean up shared-memory state in a
> PG_TRY block is unsafe: you can't be certain you'll get to do it. Now
> this is not a big deal during normal SIGTERM or SIGQUIT database
> shutdown, because we're going to abandon the shared memory segment
> anyway. However, if we ever want to support individual session kill via
> SIGTERM, it's a problem. Even if we were not interested in someday
> considering that a supported feature, it seems that dealing with random
> SIGTERMs is needed for robustness in at least some environments.

Agreed. We should do our best to be safe from SIGTERMs, even if we don't consider it supported.

> AFAICS, there are basically two ways we might try to approach this:
>
> Plan A: establish the rule that you mustn't try to clean up shared
> memory state in a PG_CATCH block. Anything you need to do like that has
> to be handled by an on_shmem_exit hook function, so it will be called
> during a FATAL exit. (Or maybe you can do it in PG_CATCH for normal
> ERROR cases, but you need a backing on_shmem_exit hook to clean up for
> FATAL.)
>
> Plan B: change the handling of FATAL errors so that they are thrown
> like normal errors, and the proc_exit call happens only when we get out
> to the outermost control level in postgres.c. This would mean that
> PG_CATCH blocks get a chance to clean up before the FATAL exit happens.
> The problem with that is that a non-cooperative PG_CATCH block might
> think it could recover from the error, and then the exit does not
> happen at all. We'd need a coding rule that PG_CATCH blocks *must*
> re-throw FATAL errors, which seems at least as ugly as Plan A. In
> particular, all three of the external-interpreter PLs are willing to
> return errors into the external interpreter, and AFAICS we'd be
> entirely at the mercy of the user-written Perl or Python or Tcl code
> whether it re-throws the error or not. So Plan B seems unacceptably
> fragile.
>
> Does anyone see a way to fix it, or perhaps a Plan C with a totally
> different idea? Plan A seems pretty ugly but it's the best I can come
> up with.

Yeah, plan A seems like the way to go.

-- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
[HACKERS] Bug about column references within subqueries used in selects
Hi,

Shouldn't the final command below cause a 'column b does not exist' error?

create table update_test (a int, b int);
create table supdate_test (x int, y int);
insert into update_test values (20, 30);
insert into supdate_test values (40, 50);

select a, (select b from supdate_test) from update_test;

 a  | ?column?
----+----------
 20 |       30

Is the problem with the code in colNameToVar, or maybe we should add checks in transformSubLink?

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Bug about column references within subqueries used in selects
On 4/12/07, NikhilS [EMAIL PROTECTED] wrote:
> Shouldn't the final command below cause a 'column b does not exist'
> error?
>
>   create table update_test (a int, b int);
>   create table supdate_test (x int, y int);
>   insert into update_test values (20, 30);
>   insert into supdate_test values (40, 50);
>
>   select a, (select b from supdate_test) from update_test;
>
>    a  | ?column?
>   ----+----------
>    20 |       30
>
> Is the problem with the code in colNameToVar or maybe we should add
> checks in transformSubLink?

I don't think so... the columns of update_test are visible to the scalar subquery... that way you can use fields from 'a' to filter the subquery:

select a, (select y from supdate_test where x = a) from update_test;

merlin
Re: [HACKERS] elog(FATAL) vs shared memory
On Apr 11, 2007, at 6:23 PM, Jim Nasby wrote:
> FWIW, you might want to put some safeguards in there so that you don't
> try to inadvertently kill the backend that's running that function...
> unfortunately I don't think there's a built-in function to tell you the
> PID of the backend you're connected to; if you're connecting via TCP
> you could use inet_client_addr() and inet_client_port(), but that won't
> work if you're using the socket to connect.

*wipes egg off face* There is a pg_backend_pid() function, even if it's not documented with the other functions (it's in the stats function stuff for some reason).

--
Jim Nasby [EMAIL PROTECTED]
EnterpriseDB http://enterprisedb.com
512.569.9461 (cell)
Re: [PATCHES] [HACKERS] [Fwd: Index Advisor]
On 4/12/07, Bruce Momjian [EMAIL PROTECTED] wrote:
> Gurjeet Singh wrote:
> > The interface etc. may not be beautiful, but it isn't ugly either! It
> > is a lot better than manually creating pg_index records and inserting
> > them into cache; we use the index_create() API to create the index
> > (build is deferred), and then 'rollback to savepoint' to undo those
> > changes when the advisor is done. index_create() causes pg_depends
> > entries too, so a 'RB to SP' is far safer than going and deleting
> > cache records manually.
>
> My complaint was not that the API used in the code was non-optimal
> (which I think was Tom's issue), but that the _user_ API was not very
> clean. Not sure what to recommend, but I will think about it later.

That can be fixed/improved with minimal effort, but if it is the internal API usage or the architecture we're bothered about, then IMO just an overhaul of the code will not be sufficient; rather, it will require a rework from scratch.

Best regards,

--
[EMAIL PROTECTED] [EMAIL PROTECTED] gmail | hotmail | yahoo }.com
17°29'34.37N 78°30'59.76E - Hyderabad
18°32'57.25N 73°56'25.42E - Pune
Re: [HACKERS] Bug about column references within subqueries used in selects
Hi,

On 4/12/07, Merlin Moncure [EMAIL PROTECTED] wrote:
> On 4/12/07, NikhilS [EMAIL PROTECTED] wrote:
> > Shouldn't the final command below cause a 'column b does not exist'
> > error?
> >
> >   create table update_test (a int, b int);
> >   create table supdate_test (x int, y int);
> >   insert into update_test values (20, 30);
> >   insert into supdate_test values (40, 50);
> >
> >   select a, (select b from supdate_test) from update_test;
>
> I don't think so... the columns of update_test are visible to the
> scalar subquery... that way you can use fields from 'a' to filter the
> subquery:
>
> select a, (select y from supdate_test where x = a) from update_test;

Yes, this is fine, but what I am objecting to is that "select columnname from tablename" can use column references from the other table involved.

Regards,
Nikhils
--
EnterpriseDB http://www.enterprisedb.com
[HACKERS] What tools do people use to hack on PostgreSQL?
Hi,

I'm very excited that my project for implementing read-only queries on PITR slaves was accepted for GSoC, and I'm now trying to work out what tools I'll use for that job. I'd like to be able to create some sort of branches and tags for my own work (only inside my local repository, of course).

I've considered using git, but I couldn't make the cvs-git gateway work - neither using the postgresql CVS repository directly, nor with a private copy obtained with CVSup. There is also svk, but I think I'd need a svn repo that mirrors the postgresql CVS for that to work. I think Joshua Drake created one once, but I don't know if it is kept up-to-date.

What do you guys use for your development work?

greetings, Florian Pflug
[HACKERS] Benchmarking tools for the Postgres, EDB and Oracle Database
Hi All,

I am working on a project for testing the performance of Oracle, EDB, and Postgres, and I am looking for an OLTP benchmarking tool which can benchmark all these databases. Can anyone please help me with this? Thanks in advance to all for your kind co-operation.

Regards,
Ranjan
Re: [HACKERS] Bug about column references within subqueries used in selects
> > I don't think so... the columns of update_test are visible to the
> > scalar subquery... that way you can use fields from 'a' to filter the
> > subquery:
> >
> > select a, (select y from supdate_test where x = a) from update_test;
>
> Yes, this is fine, but what I am objecting to is that "select
> columnname from tablename" can use column references from the other
> table involved.

There's nothing here to object to; the system is acting correctly. Your column name b is ambiguous, and the system takes the column b that exists, rather than returning an error on a column that doesn't exist. If you were explicit in your column name, you would get an error:

=# select a, (select supdate_test.b from supdate_test) from update_test;
ERROR: No such attribute supdate_test.b

Regards, Paul
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Florian G. Pflug wrote:
> I'm very excited that my project for implementing read-only queries on
> PITR slaves was accepted for GSoC,

Congratulations.

> and I'm now trying to work out what tools I'll use for that job. I'd
> like to be able to create some sort of branches and tags for my own
> work (only inside my local repository of course). I've considered using
> git, but I couldn't make the cvs-git gateway work - neither using the
> postgresql CVS repository directly, nor with a private copy obtained
> with CVSup. There is also svk, but I think I'd need a svn repo that
> mirrors the postgresql CVS for that to work. I think Joshua Drake
> created one once, but I don't know if it is kept up-to-date.
>
> What do you guys use for your development work?

You can create a CVS mirror very easily (some info on how is below, copied from the buildfarm HOWTO). The problem will be keeping things in sync with upstream if you commit changes. I'd be tempted to use some other system (svn would work fine, I think) to do your personal checkpointing, if that's what you want to do, so your checkpoint procedure would be something like:

  cvs update
  resolve merge problems
  svn ci

A lower-tech model might just cut a nightly CVS diff and archive it. Normally I don't bother with any of this; I just keep a separate tree for each development, which I periodically update but don't do anything else with until I'm ready to commit or cut a patch. It depends what you're comfortable with. I haven't very often wished I could roll back.

cheers

andrew

--

11. Almost all the bandwidth issues disappear if you use a local CVS repository instead of the one at postgresql.org. The way to do this (or at least the way I did it) is using CVSup. Since building CVSup is non-trivial, the best way to start is to get a binary package for some system it will run on. In my case this was a Linux system running Fedora Core 1.
After a few false starts, I got it working, replicating the entire repo at postgresql.org, including the CVSROOT directory. Then I commented out the entries in CVSROOT/loginfo and CVSROOT/commitinfo, and set up the LockDir directive as I wanted it in CVSROOT/config. Then I checked out the CVSROOT module and did that all over again, and checked the module back in. Then, to make sure CVSup didn't overwrite those files, I made entries for them in mirror-home/sup/repository/refuse. With that done, I was able to change the build config on that machine so that the config variable cvsrepo was just the name of the mirror root directory. Everything worked fine. After that I set up an anonymous cvs pserver against the mirror, so that my other machine could also get the source from there instead of from postgresql.org. I did a cvs login, changed the cvsrepo config variable on that machine, and it worked happily too. Finally, I set up a cron job on the mirror machine to update the mirror. The anonymous repository is only updated from the master once every hour, so there is no point in running the cron job more often than that. This should not be too big a deal, as CVSup is extremely efficient, and even doing this so frequently should not incur a lot of bandwidth use.

12. CVSup is not universally available. For example, it does not seem to be available any longer in Fedora Extras, and there are platforms for which it has never been available. However, a similar procedure to the above can be done with rsync, which is pretty universally available. Here is what I did.
First I made a repo location and got an initial repo copy:

  mkdir -p /home/cvsmirror/pg
  rsync -avzCH --delete anoncvs.postgresql.org::pgsql-cvs /home/cvsmirror/pg

Then remove the sup directory and set up an rsync exclude file:

  rm -rf /home/cvsmirror/pg/sup
  echo '/sup/' > /home/cvsmirror/pg-exclude
  echo '/CVSROOT/loginfo*' >> /home/cvsmirror/pg-exclude
  echo '/CVSROOT/commitinfo*' >> /home/cvsmirror/pg-exclude
  echo '/CVSROOT/config*' >> /home/cvsmirror/pg-exclude

Then edit the CVSROOT as in step 11. Then add a cron job something like this:

  43 * * * * rsync -avzCH --delete --exclude-from=/home/cvsmirror/pg-exclude anoncvs.postgresql.org::pgsql-cvs /home/cvsmirror/pg

Finally, add a pserver if other local buildfarm member machines need access.
Re: [HACKERS] Bug about column references within subqueries used in selects
NikhilS [EMAIL PROTECTED] writes:
> Yes, this is fine, but what I am objecting to is that "select
> columnname from tablename" can use column references from the other
> table involved.

You can object till you're blue in the face, but this behavior is not changing, because it's *required by spec*. Outer references are a standard and indeed essential part of SQL.

regards, tom lane
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Hi,

Florian G. Pflug wrote:
> I'm very excited that my project for implementing read-only queries on
> PITR slaves was accepted for GSoC, and I'm now trying to work out what
> tools I'll use for that job. I'd like to be able to create some sort of
> branches and tags for my own work (only inside my local repository of
> course). I've considered using git, but I couldn't make the cvs-git
> gateway work - neither using the postgresql CVS repository directly,
> nor with a private copy obtained with CVSup. There is also svk, but I
> think I'd need a svn repo that mirrors the postgresql CVS for that to
> work. I think Joshua Drake created one once, but I don't know if it is
> kept up-to-date.

Yes, it is (the latest visible commit was made 6 hours ago). You can browse the sources at:

  http://projects.commandprompt.com/public/pgsql/browser

or do an anonymous checkout with:

  svn co http://projects.commandprompt.com/public/pgsql/repo/

Regards,
--
Alexey Klyukin [EMAIL PROTECTED]
Re: [HACKERS] Benchmarking tools for the Postgres, EDB and Oracle Database
On 4/12/07, Ranjan Sahoo [EMAIL PROTECTED] wrote:
> I am working on a project for testing the performance of Oracle, EDB,
> and postgres and looking for an OLTP benchmarking tool which can do the
> benchmarking on all these databases. Can anyone please help me on this?

To test all three, you'd have to use one of the following:

- BenchmarkSQL (http://pgfoundry.org/projects/benchmarksql/)
- OpenLink's ODBCBench or JBench (www.openlinksw.com)
- JDBCBench (http://developer.mimer.com/features/feature_16.htm)
- jTPCC (http://jtpcc.sourceforge.net/)

Just to make sure you know, both Oracle and EnterpriseDB restrict public disclosure of benchmark results.

--
Jonah H. Harris, Software Architect | phone: 732.331.1324
EnterpriseDB Corporation            | fax: 732.331.1301
33 Wood Ave S, 3rd Floor            | [EMAIL PROTECTED]
Iselin, New Jersey 08830            | http://www.enterprisedb.com/
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Alexey Klyukin wrote:
> Yes, it is (the latest visible commit was made 6 hours ago). You can
> browse the sources at:
>
>   http://projects.commandprompt.com/public/pgsql/browser
>
> or do an anonymous checkout with:
>
>   svn co http://projects.commandprompt.com/public/pgsql/repo/

But if you have a checked-out tree, does it work to do an update after the tree has been regenerated? As far as I know, the repo is regenerated completely every few hours, so it wouldn't surprise me if the checked-out copy is not compatible with the new repo. I admit I haven't tried.

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] elog(FATAL) vs shared memory
Heikki Linnakangas [EMAIL PROTECTED] writes: Tom Lane wrote: AFAICS, there are basically two ways we might try to approach this: Plan A: establish the rule that you mustn't try to clean up shared memory state in a PG_CATCH block. Anything you need to do like that has to be handled by an on_shmem_exit hook function, so it will be called during a FATAL exit. (Or maybe you can do it in PG_CATCH for normal ERROR cases, but you need a backing on_shmem_exit hook to clean up for FATAL.) ... So Plan B seems unacceptably fragile. Does anyone see a way to fix it, or perhaps a Plan C with a totally different idea? Plan A seems pretty ugly but it's the best I can come up with. Yeah, plan A seems like the way to go. The alternative is that instead of a general purpose shmem hook you note the pid of the process that is expecting to handle the cleanup. So for instance something like pg_start_backup instead of setting a flag would store its pid. Then someone else who comes along and finds the field set has to double check if the pid is actually still around and if not it has to clean it up itself. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
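Gregory's pid-tracking alternative (his "Plan C" sketch) can be illustrated outside PostgreSQL. The following is a hypothetical Python sketch, not PostgreSQL code — `BackupState` and its methods are invented names — using the POSIX convention that `kill(pid, 0)` delivers no signal but still reports whether the process exists:

```python
import os
import subprocess
import sys

def pid_is_alive(pid):
    """POSIX liveness probe: signal 0 delivers nothing, but the
    existence/permission checks still run."""
    try:
        os.kill(pid, 0)
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # process exists but belongs to another user
    return True

class BackupState:
    """Stand-in for the shared-memory flag: instead of a boolean
    'backup in progress', remember which pid set it."""
    def __init__(self):
        self.backup_pid = None

    def start_backup(self):
        self.backup_pid = os.getpid()

    def ensure_clean(self):
        # Someone finding the field set double-checks that the owner
        # is still around, and reclaims the state if it is not.
        if self.backup_pid is not None and not pid_is_alive(self.backup_pid):
            self.backup_pid = None

state = BackupState()
state.start_backup()
state.ensure_clean()                  # owner (this process) alive: kept
assert state.backup_pid == os.getpid()

# Simulate an owner that died (FATAL exit) without cleaning up.
child = subprocess.Popen([sys.executable, "-c", "pass"])
child.wait()                          # child reaped: its pid is now stale
state.backup_pid = child.pid
state.ensure_clean()                  # stale owner detected, reclaimed
assert state.backup_pid is None
```

The weakness Gregory implies is visible here too: the check is lazy, so stale state persists until the next interested process happens to look, and pid reuse can in principle fool the probe.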
Re: [HACKERS] Makefile patch to make gcov work on Postgres contrib modules
Tom Lane [EMAIL PROTECTED] writes: Gregory Stark [EMAIL PROTECTED] writes: %.so: %.o ! $(CC) -shared -o $@ $< sqlmansect = 7 --- 11,16 endif %.so: %.o ! $(CC) $(CFLAGS) -shared -o $@ $< Surely CFLAGS should be irrelevant at link time. Maybe LDFLAGS? Does LDFLAGS contain flags that can be passed to $(CC) or are they expecting to be passed to $(LD)? It would be less convenient for the user who would have to do CFLAGS='-fprofile-arcs -ftest-coverage' LDFLAGS='-fprofile-arcs -ftest-coverage' ./configure Unless there's a makefile variable that is included in both CFLAGS and LDFLAGS that the user could use instead? But then that wouldn't work on architectures where ld is used instead of gcc for linking. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] Eliminating unnecessary left joins
On Wed, 11 Apr 2007, Jim Nasby wrote: I agree with others that the way that query is constructed is a bit odd, but it does bring another optimization to mind: when doing an inner-join between a parent and child table when RI is defined between them, if the query only refers to the child table you can drop the parent table from the join, because each row in the child table must have one and only one row in the parent. I don't think that's quite true without qualifications. First, I think it needs to be an immediate constraint (and I don't remember how we handle set constraints inside functions that might be called from a statement, so it might need to be not deferrable). Second, I think you also need to take care of NULLs since child rows with NULLs in the key pass the constraint but have no rows in the parent and would get culled by the inner join. Also, there's a possible issue that constraints do not actually guarantee that they always hold true, merely that they hold true at particular times. I don't know if it's possible to get a statement executed such that it would see the table state between the action and constraint check or if such is allowed by spec. ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Makefile patch to make gcov work on Postgres contrib modules
Am Donnerstag, 12. April 2007 17:08 schrieb Gregory Stark: Unless there's a makefile variable that is included in both CFLAGS and LDFLAGS that the user could use instead? But then that wouldn't work on architectures where ld is used instead of gcc for linking. Perhaps you should start by defining which situation you want to achieve. -- Peter Eisentraut http://developer.postgresql.org/~petere/ ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [COMMITTERS] pgsql: RESET SESSION, plus related new DDL commands.
Neil Conway wrote: Log Message: --- RESET SESSION, plus related new DDL commands. This phrase is missing a verb: The default value is defined as the value that the parameter would have had, if no <command>SET</command> ever been issued for it in the current session. /pgsql/doc/src/sgml/ref/reset.sgml I find this markup strange: <synopsis> SET SESSION AUTHORIZATION DEFAULT; RESET ALL; DEALLOCATE ALL; CLOSE ALL; UNLISTEN *; RESET PLANS; RESET TEMP; </synopsis> shouldn't it be using <programlisting>? In ResetTempTableNamespace(void), shouldn't it be using myTempNamespace instead of the SysCache lookup? -- Alvaro Herrera http://www.CommandPrompt.com/ The PostgreSQL Company - Command Prompt, Inc. ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Benchmarking tools for the Postgres, EDB and Oracle Database
Jonah H. Harris wrote: On 4/12/07, Ranjan Sahoo [EMAIL PROTECTED] wrote: I am working on a project for testing the performance of Oracle, EDB, and postgres and looking for an OLTP benchmarking tool which can do the benchmarking on all these databases. Can anyone please help me on this? To test all three, you'd have to use one of the following: - BenchmarkSQL (http://pgfoundry.org/projects/benchmarksql/) - OpenLink's ODBCBench or JBench (www.openlinksw.com) - JDBCBench (http://developer.mimer.com/features/feature_16.htm) - jTPCC (http://jtpcc.sourceforge.net/) Just to make sure you know, both Oracle and EnterpriseDB restrict public disclosure of benchmark results. You are kidding right? EDB wouldn't be so arrogant as to state that they are faster than PostgreSQL and then not allow public testing of such an argument? Sincerely, Joshua D. Drake -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] TOASTing smaller things
Luke Lonergan wrote: Hi Bruce, How about these: - Allow specification of TOAST size threshold in bytes on a per column basis - Enable storage of columns in separate TOAST tables - Enable use of multi-row compression method(s) for TOAST tables At this point I would be happy just to set the TOAST threshold to a value defined as optimal, rather than as the most minimal use of TOAST possible. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] [EMAIL PROTECTED]: Re: Anyone interested in improving postgresql scaling?]
Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] writes: On Wed, Apr 11, 2007 at 01:03:50AM -0400, Tom Lane wrote: Well, the thing is, we've pretty much had it handed to us that current-command indicators that aren't up to date are not very useful. So rate-limited updates strike me as a useless compromise. I don't get your argument - ps auxww is never going to be 100% up-to-date because during the time the command is running the status may change. Of course. But we have already done the update-once-every-half-second bit --- that was how pg_stat_activity used to work --- and our users made clear that it's not good enough. So I don't see us expending significant effort to convert the setproctitle code path to that approach. The clear way of the future for expensive-setproctitle platforms is just to turn it off entirely and rely on the new pg_stat_activity implementation. 8.3 will modify less memory to update the process title than happened in the past --- perhaps that will reduce the overhead, but I doubt it. You can test CVS HEAD to check it. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://www.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Alvaro Herrera wrote: But if you have a checked out tree, does it work to do an update after the tree has been regenerated? As far as I know, the repo is generated completely every few hours, so it wouldn't surprise me that the checked out copy is not compatible with the new repo. I admit I haven't tried. I have tried and svn up worked without issues. Regards, -- Alexey Klyukin [EMAIL PROTECTED] ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Alexey Klyukin wrote: Alvaro Herrera wrote: But if you have a checked out tree, does it work to do an update after the tree has been regenerated? As far as I know, the repo is generated completely every few hours, so it wouldn't surprise me that the checked out copy is not compatible with the new repo. I admit I haven't tried. I have tried and svn up worked without issues. As a note we will be updating this to subversion 1.4 shortly so people can do svnsync too. J Regards, -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] Anyone interested in improving postgresql scaling?
Mark Kirkwood wrote: Kris Kennaway wrote: If so, then your task is the following: Make SYSV semaphores less dumb about process wakeups. Currently whenever the semaphore state changes, all processes sleeping on the semaphore are woken, even if we only have released enough resources for one waiting process to claim. i.e. there is a thundering herd wakeup situation which destroys performance at high loads. Fixing this will involve replacing the wakeup() calls with appropriate amounts of wakeup_one(). I'm forwarding this to the pgsql-hackers list so that folks more qualified than I can comment, but as I understand the way postgres implements locking each process has its *own* semaphore it waits on - and who is waiting for what is controlled by an in (shared) memory hash of lock structs (access to these is controlled via platform-dependent spinlock code). So a given semaphore state change should only involve one process wakeup. Yes but there are still a lot of wakeups to be avoided in the current System V semaphore code. More specifically, not only do we wake up all the processes waiting on a single semaphore every time something changes, but we also wake up all processes waiting on *any* of the semaphores in the semaphore *set*, whatever the reason we're sleeping. I came up with a quick patch so that Kris could do some testing with it, and it appears to have helped, but only very slightly; apparently some contention within the netisr code caused problems, so that in some cases the patch helped slightly, and in others it didn't. The semaphore code needs a clean rewrite and I hope to take care of this soon, as time permits, since we are heavy consumers of PostgreSQL under FreeBSD at my company. Cheers, Maxime ---(end of broadcast)--- TIP 6: explain analyze is your friend
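The cost difference Kris describes can be put in numbers with a toy model (pure illustration, not kernel code): wakeup() schedules every sleeper on each state change and all but the winners immediately go back to sleep, while wakeup_one() schedules only as many sleepers as there are released resources.

```python
def thundering_herd(waiters, resources):
    """wakeup(): every state change wakes all sleepers; each woken
    process rechecks the condition, and the losers sleep again."""
    woken = waiters                      # everyone gets scheduled
    useless = woken - min(resources, waiters)
    return useless

def wakeup_one(waiters, resources):
    """wakeup_one(): wake exactly one sleeper per released resource;
    nobody is scheduled just to go back to sleep."""
    woken = min(resources, waiters)
    return woken - min(resources, waiters)   # always 0 useless wakeups

# 50 backends blocked on a lock, one holder releases it:
assert thundering_herd(50, 1) == 49   # 49 pointless context switches
assert wakeup_one(50, 1) == 0
```

The useless-wakeup count grows linearly with the number of waiters, which is why the effect only destroys performance at high load.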
Re: [HACKERS] Eliminating unnecessary left joins
I have this exact problem a lot. There are actually cases where you can eliminate regular joins, not just left joins. For example:

CREATE TABLE partner (
    id serial,
    name varchar(40) not null,
    primary key (id)
);

CREATE TABLE project (
    id serial,
    name varchar(40) not null,
    partner_id integer not null references partner (id)
);

CREATE VIEW project_view AS
    SELECT p.id, p.name, p.partner_id, pp.name AS partner
    FROM project p, partner pp
    WHERE p.partner_id = pp.id;

If someone does a select from project_view and does not select the partner column, the join can be eliminated, because the not null and foreign key constraints on the partner_id column guarantee that there will always be exactly one matching row in the partner table. If you didn't have the NOT NULL constraint on the partner_id column, you'd have to write the view this way, as described in the original email:

CREATE VIEW project_view AS
    SELECT p.id, p.name, p.partner_id, pp.name AS partner
    FROM project p LEFT JOIN partner pp ON p.partner_id = pp.id;

In this example, I just have one join, so the benefit to eliminating it is minimal (unless the tables are very large). But in the real application, project_view joins the project table against six other tables using inner joins (all against the primary keys of those other tables) and four additional tables using left joins (also against the primary keys of those other tables). Most queries only use a subset of these columns - a typical query requires evaluating only about three of the ten joins. ...Robert ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
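Both the elimination claim and the NULL caveat raised earlier in the thread can be checked with a throwaway script. This uses sqlite3 purely as a convenient stand-in (table names mirror the example; the FK is deliberately left NULLable to show the culling effect):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE partner (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE project (
    id INTEGER PRIMARY KEY,
    name TEXT NOT NULL,
    partner_id INTEGER REFERENCES partner(id)   -- NULLable on purpose
);
INSERT INTO partner VALUES (1, 'acme');
INSERT INTO project VALUES (1, 'alpha', 1), (2, 'orphan', NULL);
""")

# A query on the view's FROM list that selects no partner columns:
inner = conn.execute("""
    SELECT p.id FROM project p, partner pp
    WHERE p.partner_id = pp.id ORDER BY p.id
""").fetchall()
left = conn.execute("""
    SELECT p.id FROM project p
    LEFT JOIN partner pp ON p.partner_id = pp.id ORDER BY p.id
""").fetchall()
bare = conn.execute("SELECT id FROM project ORDER BY id").fetchall()

# With a NULL partner_id the inner join culls a row, so eliminating
# it would change the answer; the left join matches the bare scan and
# is safe to drop.  With partner_id NOT NULL, all three would agree.
assert inner == [(1,)]
assert left == bare == [(1,), (2,)]
```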
Re: [HACKERS] [EMAIL PROTECTED]: Re: Anyone interested in improving postgresql scaling?]
Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] forwards: Yes but there are still a lot of wakeups to be avoided in the current System V semaphore code. More specifically, not only do we wake up all the processes waiting on a single semaphore every time something changes, but we also wake up all processes waiting on *any* of the semaphores in the semaphore *set*, whatever the reason we're sleeping. Thanks for forwarding my mail, Kris! To Tom: if you can get my mails to reach pgsql-hackers@ somehow that would be just great :-). Oh ... *that's* the problem. Ugh. Although we have a separate semaphore for each PG backend, they're grouped into semaphore sets (I think 16 active semaphores per set). So a wakeup intended for one process would uselessly send up to 15 others through the semop code. Yes. The only thing we could do to fix that from our end would be to use a smaller sema-set size on *BSD platforms. Is the overhead per sema set small enough to make this a sane thing to do? Will we be likely to run into system limits on the number of sets? I'm not familiar enough with the PostgreSQL code to know what impact such a change could have, but since the problem is clearly on our side here, I would advise against doing changes in PostgreSQL that are likely to complicate the code for little gain. We still didn't even fully measure how much the useless wakeups cost us since we're running into other contention problems with my patch that removes those. And, as you point out, there are complications ensuing with respect to system limits (we already ask users to bump them when they install PostgreSQL). I'm looking forward to fixing/rewriting all of the FreeBSD SysV semaphore code and am just waiting for a green light from my boss before doing so. Maybe someone will beat me to it, since it isn't such a big change. I think the high number of setproctitle() calls is more problematic for us at the moment; Kris can comment on that. 
Cheers, Maxime ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Query
Hi, As a part of my university project, I am trying to modify the postgres code to support a parallel system. As the first step I have partitioned the data on different processors. And have kept a master node to process all the query requests. Whenever my master node is queried I need to push my query onto the backend processors. For this I need a way to connect to different backends via my master node's backend. I have tried different functions like: do_connect SendQuery PQconnectdb etc. But all of them give the same compilation error of not being able to find the reference to the functions. Now if I include command.o and common.o present in src/bin/psql, I get the error of not referencing other functions. Is there a way of accessing other backends through the master backend? Thanks Regards, Shaunak Godbole dblink is written specially for remote database access and is efficiently written for postgresql and also has async query capabilities; you may check dblink_connect, dblink_fetch, dblink_exec, dblink_send_query, etc. More than that, it also has a replication mechanism which you can achieve using dblink_build_sql_insert, dblink_build_sql_delete, dblink_build_sql_update, which I think would be of utmost help if there is some replication requirement in your parallel database system regards nags IIT Bombay -- View this message in context: http://www.nabble.com/Query-tf3547482.html#a9954956 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com. ---(end of broadcast)--- TIP 6: explain analyze is your friend
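At the client level, the master-node setup described above is a scatter-gather loop: one connection per backend partition, the same query pushed to each, and the results merged at the master. A minimal sketch of the pattern, with sqlite3 in-memory databases standing in for the remote backends (with PostgreSQL proper, dblink's dblink_send_query/dblink_get_result pair gives the asynchronous version of the same loop):

```python
import sqlite3

# Two "backend" partitions, each holding a slice of one table.
backends = [sqlite3.connect(":memory:") for _ in range(2)]
for conn in backends:
    conn.execute("CREATE TABLE t (id INTEGER, val TEXT)")

rows = [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')]
for rid, val in rows:
    # Trivial partitioning rule: route each row by id modulo the
    # number of backends.
    backends[rid % 2].execute("INSERT INTO t VALUES (?, ?)", (rid, val))

def scatter_gather(sql):
    """Master node: push the same query to every backend and merge
    the partial results (sorted here for a deterministic order)."""
    out = []
    for conn in backends:
        out.extend(conn.execute(sql).fetchall())
    return sorted(out)

assert scatter_gather("SELECT id, val FROM t") == sorted(rows)
```

Note that a plain concatenation only works for simple scans; aggregates, sorts, and joins across partitions need an explicit merge step at the master, which is where most of the real work in a parallel executor lives.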
Re: [HACKERS] Why need XLogReadBuffer have the paramter init?
Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Jacky Leng [EMAIL PROTECTED] writes: Can't we remove this param? No. We can rewrite like this: 1. XLogReadBuffer: * remove init; * every time we can't read a block, just log_invalid_page it, and return InvalidBuffer; Your proposal degrades the robustness of the system by turning non-error cases into errors. If the caller is able to rewrite the page fully, we should not report an error when it's not available to read. Oh, I see, but how about my second question, is it possible? If it happens: 1. the second rel's pages' lsn surely is larger than the xlog records of the first rel; 2. so it's possible some xlog records are not redone; 3. but those pages that can be rewritten fully are rewritten unconditionally. If I do a PITR recovery now, is there any trouble? The file contains both the old rels' data and the new rel's. Am I wrong? ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
Re: [HACKERS] Why need XLogReadBuffer have the paramter init?
Oh, I am wrong! Jacky Leng [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Tom Lane [EMAIL PROTECTED] wrote in message news:[EMAIL PROTECTED] Jacky Leng [EMAIL PROTECTED] writes: Can't we remove this param? No. We can rewrite like this: 1. XLogReadBuffer: * remove init; * every time we can't read a block, just log_invalid_page it, and return InvalidBuffer; Your proposal degrades the robustness of the system by turning non-error cases into errors. If the caller is able to rewrite the page fully, we should not report an error when it's not available to read. Oh, I see, but how about my second question, is it possible? If it happens: 1. the second rel's pages' lsn surely is larger than the xlog records of the first rel; 2. so it's possible some xlog records are not redone; 3. but those pages that can be rewritten fully are rewritten unconditionally. If I do a PITR recovery now, is there any trouble? The file contains both the old rels' data and the new rel's. Am I wrong? ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] Anyone interested in improving postgresql scaling?
Mark Kirkwood wrote: Kris Kennaway wrote: If so, then your task is the following: Make SYSV semaphores less dumb about process wakeups. Currently whenever the semaphore state changes, all processes sleeping on the semaphore are woken, even if we only have released enough resources for one waiting process to claim. i.e. there is a thundering herd wakeup situation which destroys performance at high loads. Fixing this will involve replacing the wakeup() calls with appropriate amounts of wakeup_one(). I'm forwarding this to the pgsql-hackers list so that folks more qualified than I can comment, but as I understand the way postgres implements locking each process has its *own* semaphore it waits on - and who is waiting for what is controlled by an in (shared) memory hash of lock structs (access to these is controlled via platform-dependent spinlock code). So a given semaphore state change should only involve one process wakeup. [mail resent, it seems it got eaten by pgsql-hackers@ MTA somehow] Yes but there are still a lot of wakeups to be avoided in the current System V semaphore code. More specifically, not only do we wake up all the processes waiting on a single semaphore every time something changes, but we also wake up all processes waiting on *any* of the semaphores in the semaphore *set*, whatever the reason we're sleeping. I came up with a quick patch so that Kris could do some testing with it, and it appears to have helped, but only very slightly; apparently some contention within the netisr code caused problems, so that in some cases the patch helped slightly, and in others it didn't. The semaphore code needs a clean rewrite and I hope to take care of this soon, as time permits, since we are heavy consumers of PostgreSQL under FreeBSD at my company. Cheers, Maxime ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] [EMAIL PROTECTED]: Re: Anyone interested in improving postgresql scaling?]
On Thu, Apr 12, 2007 at 12:57:32PM -0400, Bruce Momjian wrote: Tom Lane wrote: Kris Kennaway [EMAIL PROTECTED] writes: On Wed, Apr 11, 2007 at 01:03:50AM -0400, Tom Lane wrote: Well, the thing is, we've pretty much had it handed to us that current-command indicators that aren't up to date are not very useful. So rate-limited updates strike me as a useless compromise. I don't get your argument - ps auxww is never going to be 100% up-to-date because during the time the command is running the status may change. Of course. But we have already done the update-once-every-half-second bit --- that was how pg_stat_activity used to work --- and our users made clear that it's not good enough. So I don't see us expending significant effort to convert the setproctitle code path to that approach. The clear way of the future for expensive-setproctitle platforms is just to turn it off entirely and rely on the new pg_stat_activity implementation. 8.3 will modify less memory to update the process title than happened in the past --- perhaps that will reduce the overhead, but I doubt it. You can test CVS HEAD to check it. Yeah, this is not relevant for BSD, it uses a syscall to set it (which is why it has high overhead) instead of just modifying user memory. Kris
Re: [HACKERS] elog(FATAL) vs shared memory
Jim Nasby wrote: On Apr 11, 2007, at 6:23 PM, Jim Nasby wrote: FWIW, you might want to put some safeguards in there so that you don't try to inadvertently kill the backend that's running that function... unfortunately I don't think there's a built-in function to tell you the PID of the backend you're connected to; if you're connecting via TCP you could use inet_client_addr() and inet_client_port(), but that won't work if you're using the socket to connect. *wipes egg off face* There is a pg_backend_pid() function, even if it's not documented with the other functions (it's in the stats function stuff for some reason). eh. No worries - my safeguard is just a comment saying 'don't connect to the same database you are killing the connections of' :-) -- Stuart Bishop [EMAIL PROTECTED] http://www.canonical.com/ Canonical Ltd. http://www.ubuntu.com/
[HACKERS] Do we still need log_invalid_page?
As the README for xlog says: There're two kinds of WAL records: * WAL record that contains enough information to re-generate the entire contents of a page; during recovery of these records, blocks are read with: buffer = XLogReadBuffer(reln, blkno, true); so it can be sure that the block will be read successfully, i.e. buffer won't be InvalidBuffer * WAL record provides only enough information to incrementally update the page; As all blocks related to these xlog records will be backed up in the xlog during the first write after checkpoint, and RestoreBkpBlocks will also read it in with: buffer = XLogReadBuffer(reln, blkno, true); so it also can be sure that these blocks will be read successfully: because the first read of it must be in RestoreBkpBlocks, which will reconstruct the block validly, and there will be no problems on later reads. Then if we come to the path log_invalid_page, can I say there must be something wrong, and we should PANIC? ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Reviewers Guide to Deferred Transactions/Transaction Guarantee
Simon Riggs [EMAIL PROTECTED] writes: transaction_guarantee.v11.patch I can't help feeling that this is enormously overcomplicated. The DFC in particular seems to not be worth its overhead. Why wouldn't we simply track the newest commit record at all times, and then whenever the wal writer wakes up, it would write/fsync that far (or write/fsync all completed WAL pages, if there's no new commit record to worry about)? I see the concern about not letting clog pages go to disk before the corresponding WAL data is flushed, but that could be handled much more simply: just force a flush through the newest commit record before any write of a clog page. Those writes are infrequent enough (every 32K transactions or one checkpoint) that this seems not a serious problem. The other interesting issue is not letting hint-bit updates get to disk in advance of the WAL flush, but I don't see a need to track those at a per-transaction level: just advance page LSN to latest commit record any time a hint bit is updated. The commit will likely be flushed before we'd be interested in writing the buffer out anyway. Moreover, the way you are doing it creates a conflict in that the DFC has to guarantee to remember every unflushed transaction, whereas it really needs to be just an approximate cache for its performance to be good. AFAIK there is no need to associate any forced flush with multixacts; there is no state saved across crashes for those anyway. I don't see a point in allowing the WAL writer to be disabled --- I believe it will be a performance win just like the bgwriter, independently of whether transaction_guarantee is used or not, by helping to keep down the number of dirty WAL buffers. That in turn allows some other simplifications, like not needing an assign hook for transaction_guarantee. I disagree with your desire to remove the fsync parameter. It may have less use than before with this feature, but that doesn't mean it has none. 3. 
Should the WALWriter also do the wal_buffers half-full write at the start of XLogInsert() ? That should go away entirely; to me the main point of the separate wal-writer process is to take over responsibility for not letting too many dirty wal buffers accumulate. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
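Tom's simpler scheme — remember only the newest commit record's LSN, have the WAL writer flush that far on each wakeup, and force a flush through that LSN before any clog page write — can be modeled in a few lines. This is a toy model, not the patch under review; all names are made up:

```python
class WalModel:
    """Toy WAL: an append-only list of records.  The LSN of a record
    is its index + 1; flushed_to marks how far fsync has progressed."""
    def __init__(self):
        self.records = []
        self.flushed_to = 0      # everything before this LSN is durable
        self.newest_commit = 0   # LSN just past the newest commit record

    def insert(self, rec, is_commit=False):
        self.records.append(rec)
        if is_commit:
            self.newest_commit = len(self.records)

    def walwriter_wakeup(self):
        # Write/fsync through the newest commit record if there is one;
        # otherwise just push out completed pages (here: everything).
        target = self.newest_commit or len(self.records)
        self.flushed_to = max(self.flushed_to, target)

    def write_clog_page(self):
        # Never let a clog page reach disk ahead of the WAL covering
        # its transactions: force a flush through the newest commit.
        self.flushed_to = max(self.flushed_to, self.newest_commit)

wal = WalModel()
wal.insert("heap update")
wal.insert("commit xid 100", is_commit=True)
wal.insert("heap update")          # not yet needed on disk
wal.write_clog_page()
assert wal.flushed_to >= wal.newest_commit   # commit durable before clog
```

The point of the single `newest_commit` watermark is that one flush covers every earlier unflushed commit at once, which is what makes the per-transaction tracking (the DFC) unnecessary in this scheme.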
Re: [HACKERS] Do we still need log_invalid_page?
Jacky Leng [EMAIL PROTECTED] writes: Then if we come to the path log_invalid_page, can I say there must be sth wrong, and we should PANIC? No; you forgot about full_page_writes = off. regards, tom lane ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
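The reason the answer is no becomes visible in a toy replay loop: with full_page_writes = off there may be no full-page image to initialize a block from, so an incremental record against an unreadable block is only fatal if the complaint is still outstanding at the end of recovery (the block may be legitimately gone because the relation was dropped or truncated later in the WAL). A sketch of that bookkeeping, with made-up record shapes rather than real xlog records:

```python
# Toy redo loop: pages live in a dict; a missing key models a block
# that cannot be read back from disk during recovery.
pages = {}
invalid_pages = set()   # stand-in for log_invalid_page's hash table

def redo(blkno, kind, payload):
    if kind == "full_page":
        # Enough information to regenerate the page: init unconditionally.
        pages[blkno] = payload
    else:  # incremental update
        if blkno not in pages:
            # Not necessarily an error yet: with full_page_writes = off
            # the page may be dropped later in the WAL stream.
            invalid_pages.add(blkno)
            return
        pages[blkno] += payload

wal = [
    (1, "full_page", "A"),
    (1, "incremental", "B"),
    (2, "incremental", "X"),   # no prior image: full_page_writes off
    (2, "drop", None),         # the relation holding block 2 goes away
]
for blkno, kind, payload in wal:
    if kind == "drop":
        pages.pop(blkno, None)
        invalid_pages.discard(blkno)   # complaint forgiven
    else:
        redo(blkno, kind, payload)

# Only complaints still outstanding at end of recovery are fatal.
assert pages == {1: "AB"}
assert not invalid_pages
```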
Re: [HACKERS] TOASTing smaller things
Hi Bruce, On 4/12/07 9:24 AM, Bruce Momjian [EMAIL PROTECTED] wrote: Luke Lonergan wrote: Hi Bruce, How about these: - Allow specification of TOAST size threshold in bytes on a per column basis - Enable storage of columns in separate TOAST tables - Enable use of multi-row compression method(s) for TOAST tables At this point I would be happy just to set the TOAST threshold to a value defined as optimal, rather than as the most minimal use of TOAST possible. I agree that's a good starting point, I guess I was thinking that was already included in the work that Tom has been doing. If not, we can add a TODO like this as a precursor to the ones above: - Allow specification of TOAST size threshold (in bytes) on a per table basis - Luke ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] TOASTing smaller things
Luke Lonergan [EMAIL PROTECTED] writes: On 4/12/07 9:24 AM, Bruce Momjian [EMAIL PROTECTED] wrote: At this point I would be happy just to set the TOAST threshold to a value defined as optimal, rather than as the most minimal use of TOAST possible. I agree that's a good starting point, I guess I was thinking that was already included in the work that Tom has been doing. No. I put in the code needed to decouple toast tuple size from toasting threshold, but I don't have the time or interest to run performance tests to see whether there are better default values than the historical quarter-page values. Someone should do that before 8.3 beta ... If not, we can add a TODO like this as a precursor to the ones above: - Allow specification of TOAST size threshold (in bytes) on a per table basis I would suggest that *all* of those TODOs are premature in the absence of experimental evidence about the effect of varying the parameters. If we end up finding out that the existing settings are about right anyway across a range of test cases, who needs more knobs? We've got too many mostly-useless knobs already. regards, tom lane ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] TOASTing smaller things
Tom Lane [EMAIL PROTECTED] writes: I would suggest that *all* of those TODOs are premature in the absence of experimental evidence about the effect of varying the parameters. If we end up finding out that the existing settings are about right anyway across a range of test cases, who needs more knobs? We've got too many mostly-useless knobs already. Isn't it obvious that the right value is going to depend extraordinarily heavily on the precise usage pattern though? A typical table with 100-byte columns which are normally read with the rest of the columns, sometimes in sequential scans or updates, will find TOASTing them counter-productive as it turns those all into additional random access i/o and extra inserts and deletes. Another system with a dozen 100-byte columns that are virtually never accessed and a handful of heavily-updated integer columns will benefit heavily from TOAST as it changes the table from an awfully performing 5-6 tuple/page table into a tremendously performing 100+ tuple/page table. We're not ever going to be able to predict when data is being stored what future usage pattern to expect, at least not realistically. The best we can hope for is to find the tipping point at which the cost if we guess wrong is some tolerable level of pain, set that to be the default, and give the tools to the user to obtain the benefit in the cases where he knows it'll help. Or perhaps TOAST is the wrong kind of vertical partitioning for this. Perhaps we should be using TOAST to deal with the large datum problem and have a different looking tool entirely for the vertical partitioning rarely used columns problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
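The tuple/page arithmetic behind those figures is easy to reproduce. The numbers here are rough assumptions, not measurements: 8 KB pages, ~28 bytes of per-tuple overhead, ~18-byte TOAST pointers, ignoring line pointers and alignment:

```python
PAGE = 8192    # default BLCKSZ
HEADER = 28    # rough per-tuple overhead (assumption)

def tuples_per_page(row_bytes):
    return PAGE // (row_bytes + HEADER)

# A dozen rarely-read 100-byte columns plus four integer columns:
assert tuples_per_page(12 * 100 + 4 * 4) == 6     # the "5-6 tuple/page" table

# TOASTing the wide columns leaves ~18-byte pointers in the heap tuple:
assert tuples_per_page(12 * 18 + 4 * 4) == 31

# True vertical partitioning (wide columns in a separate table) leaves
# only the integers inline:
assert tuples_per_page(4 * 4) == 186
```

With these rough numbers, TOAST pointers alone get ~31 tuples/page; the "100+" figure corresponds to the limit where essentially nothing but the integer columns stays inline, i.e. the separate-table kind of vertical partitioning Gregory distinguishes from TOAST at the end.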
Re: [HACKERS] TOASTing smaller things
Gregory Stark [EMAIL PROTECTED] writes: Tom Lane [EMAIL PROTECTED] writes: I would suggest that *all* of those TODOs are premature in the absence of experimental evidence about the effect of varying the parameters. Isn't it obvious that the right value is going to depend extraordinarily heavily on the precise usage pattern though? It's not yet obvious that there's any win to be had at all. AFAIK no one has done any playing around with alternative TOAST settings. It could be that the mechanism is simply not very sensitive to those values. Or perhaps TOAST is the wrong kind of vertical partitioning for this. Exactly my point --- quoting anecdotes about wins from someone else's vertical partitioning implementation doesn't really prove a darn thing about how TOAST will behave. It's suggestive, but I'd like to see some actual experimental evidence before we start constructing a lot of infrastructure. Infrastructure first, learn how to use it later is the philosophy that's given us nigh-useless stuff like commit_delay. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
[HACKERS] temporal variants of generate_series()
I've written the following function definitions to extend generate_series to support some temporal types (timestamptz, date and time). Please include them if there's sufficient perceived need or value.

-- timestamptz version
CREATE OR REPLACE FUNCTION generate_series (
    start_ts timestamptz
  , end_ts timestamptz
  , step interval
) RETURNS SETOF timestamptz AS $$
DECLARE
    current_ts timestamptz := start_ts;
BEGIN
    IF start_ts < end_ts AND step > INTERVAL '0 seconds' THEN
        LOOP
            IF current_ts > end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    ELSIF end_ts < start_ts AND step < INTERVAL '0 seconds' THEN
        LOOP
            IF current_ts < end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- date version
CREATE OR REPLACE FUNCTION generate_series (
    start_ts date
  , end_ts date
  , step interval
) RETURNS SETOF date AS $$
DECLARE
    current_ts date := start_ts;
BEGIN
    IF start_ts < end_ts AND step > INTERVAL '0 seconds' THEN
        LOOP
            IF current_ts > end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    ELSIF end_ts < start_ts AND step < INTERVAL '0 seconds' THEN
        LOOP
            IF current_ts < end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

-- time version
CREATE OR REPLACE FUNCTION generate_series (
    start_ts time
  , end_ts time
  , step interval
) RETURNS SETOF time AS $$
DECLARE
    current_ts time := start_ts;
BEGIN
    IF step > INTERVAL '0 seconds' THEN
        LOOP  -- handle wraparound first
            IF current_ts < end_ts THEN
                EXIT;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
        LOOP
            IF current_ts > end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    ELSIF step < INTERVAL '0 seconds' THEN
        LOOP  -- handle wraparound first
            IF current_ts > end_ts THEN
                EXIT;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
        LOOP
            IF current_ts < end_ts THEN
                RETURN;
            END IF;
            RETURN NEXT current_ts;
            current_ts := current_ts + step;
        END LOOP;
    END IF;
END;
$$ LANGUAGE plpgsql IMMUTABLE;

---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
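The "handle wraparound first" logic in the time variant is the subtle part, since time values wrap at midnight. As a quick sanity check of that two-phase control flow, here is a small Python model of it (a sketch, not part of the patch: times of day are represented as timedeltas since midnight, and the modulo stands in for time's wrap at 24:00):

```python
from datetime import timedelta

DAY = timedelta(hours=24)
ZERO = timedelta(0)

def time_series(start, end, step):
    """Times of day from start to end by step, wrapping at midnight,
    mirroring the two-loop structure of the plpgsql time variant
    (including its edge cases)."""
    out = []
    current = start
    if step > ZERO:
        # Phase 1, wraparound: emit values until we wrap below end.
        while current >= end:
            out.append(current)
            current = (current + step) % DAY
        # Phase 2: ordinary ascent up to end.
        while current <= end:
            out.append(current)
            current = (current + step) % DAY
    elif step < ZERO:
        # Phase 1, wraparound: emit values until we wrap above end.
        while current <= end:
            out.append(current)
            current = (current + step) % DAY
        # Phase 2: ordinary descent down to end.
        while current >= end:
            out.append(current)
            current = (current + step) % DAY
    return out

# 22:00 to 02:00 stepping +2h crosses midnight: 22:00, 00:00, 02:00
print(time_series(timedelta(hours=22), timedelta(hours=2),
                  timedelta(hours=2)))
```

With no wraparound (start before end, positive step) phase 1 exits immediately and the function degenerates to the same loop as the timestamptz variant.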
Re: [HACKERS] [COMMITTERS] pgsql: RESET SESSION, plus related new DDL commands.
On Thu, 2007-04-12 at 11:45 -0400, Alvaro Herrera wrote: This phrase is missing a verb: [...] I find this markup strange: [...] In ResetTempTableNamespace(void), shouldn't it be using myTempNamespace instead of the SysCache lookup? All fair points: I've applied the attached patch. Thanks for the review. -Neil

Index: doc/src/sgml/ref/reset.sgml
===================================================================
RCS file: /home/neilc/postgres/cvs_root/pgsql/doc/src/sgml/ref/reset.sgml,v
retrieving revision 1.33
diff -c -p -r1.33 reset.sgml
*** doc/src/sgml/ref/reset.sgml	12 Apr 2007 06:53:46 -	1.33
--- doc/src/sgml/ref/reset.sgml	12 Apr 2007 22:28:39 -
*************** SET <replaceable class="parameter">confi
*** 42,48 ****
   <para>
    The default value is defined as the value that the parameter would
!   have had, if no <command>SET</> ever been issued for it in the
    current session.  The actual source of this value might be a
    compiled-in default, the configuration file, command-line options,
    or per-database or per-user default settings.  See <xref
--- 42,48 ----
   <para>
    The default value is defined as the value that the parameter would
!   have had, if no <command>SET</> had ever been issued for it in the
    current session.  The actual source of this value might be a
    compiled-in default, the configuration file, command-line options,
    or per-database or per-user default settings.  See <xref
*************** SET <replaceable class="parameter">confi
*** 112,118 ****
      Releases all temporary resources associated with the current
      session.  This has the same effect as executing the following
      command sequence:
! <synopsis>
  SET SESSION AUTHORIZATION DEFAULT;
  RESET ALL;
  DEALLOCATE ALL;
--- 112,118 ----
      Releases all temporary resources associated with the current
      session.  This has the same effect as executing the following
      command sequence:
! <programlisting>
  SET SESSION AUTHORIZATION DEFAULT;
  RESET ALL;
  DEALLOCATE ALL;
*************** CLOSE ALL;
*** 120,126 ****
  UNLISTEN *;
  RESET PLANS;
  RESET TEMP;
! </synopsis>
   </para>
  </listitem>
 </varlistentry>
--- 120,126 ----
  UNLISTEN *;
  RESET PLANS;
  RESET TEMP;
! </programlisting>
   </para>
  </listitem>
 </varlistentry>
Index: src/backend/catalog/namespace.c
===================================================================
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/backend/catalog/namespace.c,v
retrieving revision 1.94
diff -c -p -r1.94 namespace.c
*** src/backend/catalog/namespace.c	12 Apr 2007 06:53:46 -	1.94
--- src/backend/catalog/namespace.c	12 Apr 2007 22:24:18 -
*************** InitTempTableNamespace(void)
*** 1946,1963 ****
  void
  ResetTempTableNamespace(void)
  {
! 	char	namespaceName[NAMEDATALEN];
! 	Oid		namespaceId;
!
! 	/* find oid */
! 	snprintf(namespaceName, sizeof(namespaceName), "pg_temp_%d", MyBackendId);
! 	namespaceId = GetSysCacheOid(NAMESPACENAME,
! 								 CStringGetDatum(namespaceName),
! 								 0, 0, 0);
!
! 	/* clean if exists */
! 	if (OidIsValid(namespaceId))
! 		RemoveTempRelations(namespaceId);
  }

  /*
--- 1946,1953 ----
  void
  ResetTempTableNamespace(void)
  {
! 	if (OidIsValid(myTempNamespace))
! 		RemoveTempRelations(myTempNamespace);
  }

  /*
Index: src/include/catalog/namespace.h
===================================================================
RCS file: /home/neilc/postgres/cvs_root/pgsql/src/include/catalog/namespace.h,v
retrieving revision 1.46
diff -c -p -r1.46 namespace.h
*** src/include/catalog/namespace.h	12 Apr 2007 06:53:48 -	1.46
--- src/include/catalog/namespace.h	12 Apr 2007 22:34:08 -
*************** extern char *NameListToQuotedString(List
*** 81,86 ****
--- 81,87 ----
  extern bool isTempNamespace(Oid namespaceId);
  extern bool isAnyTempNamespace(Oid namespaceId);
  extern bool isOtherTempNamespace(Oid namespaceId);
+ extern void ResetTempTableNamespace(void);

  extern OverrideSearchPath *GetOverrideSearchPath(MemoryContext context);
  extern void PushOverrideSearchPath(OverrideSearchPath *newpath);
*************** extern char *namespace_search_path;
*** 100,105 ****
  extern List *fetch_search_path(bool includeImplicit);

- extern void ResetTempTableNamespace(void);
-
  #endif   /* NAMESPACE_H */
--- 101,104 ----

---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [PATCHES] [HACKERS] Full page writes improvement, code update
Hi, Sorry, inline reply. Zeugswetter Andreas ADI SD wrote: Yup, this is a good summary. You say you need to remove the optimization that avoids the logging of a new tuple because the full page image exists. I think we must already have the info in WAL about which tuple inside the full page image is new (the one for which we avoided the WAL entry). How about this: Leave current WAL as it is and only add the not-removable flag to full pages. pg_compresslog then replaces the full page image with a record for the one tuple that is changed. I tend to think it is not worth the increased complexity only to save bytes in the uncompressed WAL though. It is essentially what my patch proposes. My patch adds a flag to full page writes marking those which can be removed. Another point about pg_decompresslog: Why do you need a pg_decompresslog? Imho pg_compresslog should already do the replacing of the full_page with the dummy entry. Then pg_decompresslog could be a simple gunzip, or whatever compression was used, but no logic. Just removing full page writes does not work. If we shift the rest of the WAL, then LSNs become inconsistent in the compressed archive logs which pg_compresslog produces. For recovery, we have to restore LSNs as in the original WAL. Pg_decompresslog restores removed full page writes as dummy records so that recovery redo functions won't be confused. Regards; Andreas -- - Koichi Suzuki ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
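The LSN argument is easy to see with a toy model (all names below are invented for illustration; this is not the actual pg_compresslog code). Treat a record's LSN as its byte offset in the WAL stream: simply deleting full-page images would shift every later offset, while replacing them with a short marker that pg_decompresslog re-expands to a dummy record of the original length keeps every surviving record at its original LSN:

```python
FULL_PAGE = "full_page"

def lsns(wal):
    """Byte offset (the 'LSN') of each record in a WAL modeled as a
    list of (kind, length) pairs."""
    offsets, pos = [], 0
    for _kind, length in wal:
        offsets.append(pos)
        pos += length
    return offsets

def compress(wal):
    """Archive form: each full-page image becomes a short marker that
    remembers the original record length."""
    return [("marker", length) if kind == FULL_PAGE else (kind, length)
            for kind, length in wal]

def archived_size(compressed, marker_bytes=16):
    """Bytes actually stored in the archive; markers are tiny."""
    return sum(marker_bytes if kind == "marker" else length
               for kind, length in compressed)

def decompress(compressed):
    """Restore form: each marker is expanded back into a dummy record of
    the original length, so later records regain their original LSNs."""
    return [("dummy", length) if kind == "marker" else (kind, length)
            for kind, length in compressed]

wal = [("insert", 100), (FULL_PAGE, 8192), ("commit", 50)]
print(lsns(wal), lsns(decompress(compress(wal))))
```

In this model the archive stores 166 bytes instead of 8342, yet the commit record's LSN is 8292 both before compression and after decompression, which is exactly why the dummy records are needed on the restore side only.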
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Joshua D. Drake wrote: Alexey Klyukin wrote: Alvaro Herrera wrote: But if you have a checked out tree, does it work to do an update after the tree has been regenerated? As far as I know, the repo is generated completely every few hours, so it wouldn't surprise me that the checked-out copy is not compatible with the new repo. I admit I haven't tried. I have tried and svn up worked without issues. As a note, we will be updating this to Subversion 1.4 shortly so people can do svnsync too. Do I read this correctly as "This repository will stay around for a while, and isn't just an experiment that might be stopped tomorrow"? If so, I'll try using it - and lots of thanks for providing it! greetings, Florian Pflug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org
Re: [HACKERS] What tools do people use to hack on PostgreSQL?
Florian G. Pflug wrote: Joshua D. Drake wrote: Alexey Klyukin wrote: Alvaro Herrera wrote: But if you have a checked out tree, does it work to do an update after the tree has been regenerated? As far as I know, the repo is generated completely every few hours, so it wouldn't surprise me that the checked-out copy is not compatible with the new repo. I admit I haven't tried. I have tried and svn up worked without issues. As a note, we will be updating this to Subversion 1.4 shortly so people can do svnsync too. Do I read this correctly as "This repository will stay around for a while, and isn't just an experiment that might be stopped tomorrow"? If so, I'll try using it - and lots of thanks for providing it! The subversion repository has been around *a long* time. The only thing that is relatively new is the anonymous checkout, and that is just because we were lazy. So yes, you can use it :) Joshua D. Drake greetings, Florian Pflug ---(end of broadcast)--- TIP 4: Have you searched our list archives? http://archives.postgresql.org -- === The PostgreSQL Company: Command Prompt, Inc. === Sales/Support: +1.503.667.4564 || 24x7/Emergency: +1.800.492.2240 Providing the most comprehensive PostgreSQL solutions since 1997 http://www.commandprompt.com/ Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate PostgreSQL Replication: http://www.commandprompt.com/products/ ---(end of broadcast)--- TIP 2: Don't 'kill -9' the postmaster
Re: [HACKERS] Idle idea for a feature
On Apr 10, 2007, at 9:48 AM, Stephen Frost wrote:

Referenced by: loc_base_clin_loc_base_id_fkey FOREIGN KEY (loc_base_id) BY wdm_networx.loc_base_clin(loc_base_id)
                              /|\                                  /|\
       Referenced column(s) in *this* table                         |
                                                                    |--- column(s) in referencing table

+1, and I also like Nikhils' idea of \d reporting if a table inherits or is inherited from. Bruce, can we get a TODO? -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Minor changes to Recovery related code
On Mar 30, 2007, at 5:51 PM, Florian G. Pflug wrote: In reality, however, I fear that most people will just create a script that does 'echo select pg_stop_backup | psql' or something similar. If they're a bit more careful, they will enable ON_ERROR_STOP and check the return value of psql. I believe that those are the people who would really benefit from a pg_stop_backup() that waits for archiving to complete. But they probably won't check for WARNINGs. Maybe doing it the other way round would be an option? pg_stop_backup() could wait for the archiver to complete forever, but spit out a warning every 60 seconds or so: WARNING: Still waiting for wal archiving of wal ??? to complete. If someone really wants a 60-second timeout, he can just use statement_timeout. I agree; people are far more likely to take the lazy way out, so it'd be good to protect against that. Esp. since statement_timeout provides a means to get the other behavior. -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
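A minimal sketch of the statement_timeout approach suggested above, for a backup script that wants its own bound on the wait rather than a built-in one (value given in milliseconds):

```sql
-- Give up waiting for archiving after 60 seconds instead of forever:
SET statement_timeout = 60000;
SELECT pg_stop_backup();
RESET statement_timeout;
```

With ON_ERROR_STOP enabled, psql then exits non-zero if the timeout fires, which is exactly the failure signal the lazy script would otherwise miss.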
Re: [HACKERS] Benchmarking tools for the Postgres, EDB and Oracle Database
On Thu, 12 Apr 2007, Jonah H. Harris wrote: - JDBCBench (http://developer.mimer.com/features/feature_16.htm) I wouldn't recommend this one unless you've got plenty of time to debug it and validate the results. There are multiple errors in the random number generation code, some other bugs I forget the details of right now, and some of the later optimizations they throw in are coded so badly that they end up degrading performance with PostgreSQL. You can find a better version of the random ID functions in the version of JDBCBench included in HSQL, http://hsqldb.org/ That's missing some useful features Mimer added though, and the HSQL version comes with a funky license. Their version is by no means bug-free either. -- * Greg Smith [EMAIL PROTECTED] http://www.gregsmith.com Baltimore, MD ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] So are we calling it: Feature Freeze?
On Apr 2, 2007, at 12:40 PM, Joshua D. Drake wrote: Should we announce? There is some web work etc. to be done. Is that work documented anywhere? Seems it would be a GoodThing if it was... -- Jim Nasby [EMAIL PROTECTED] EnterpriseDB http://enterprisedb.com 512.569.9461 (cell) ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] Benchmarking tools for the Postgres, EDB and Oracle Database
All, I would recommend the new semi-free version of SpecJAppserver, called EAStress: http://www.spec.org/jAppServer2004/ --Josh ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq