Re: [pgtranslation-translators] [HACKERS] Opinions about wording of error messages for bug #3883?
Alvaro Herrera wrote: Tom Lane wrote: Alvaro Herrera [EMAIL PROTECTED] writes: I suggest cannot execute "%s" on "%s" because ... Hmm, why not just cannot execute %s "%s" because ... ? Hmm, yeah, that seems fine too. Thinking more about it, from the POV of the translator probably the three forms are the same because he has all the elements to construct the phrase however he sees fit. Alvaro's sentence seems better to me. Anyway, I have no problem with such a change this close to a release. As Alvaro said, if the translation of this sentence is not available for 8.3, it can be for 8.3.1. That's not such a big deal. And thanks for asking the translators' opinion on this; I really appreciate it. Regards. -- Guillaume. http://www.postgresqlfr.org http://dalibo.com ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
On Tue, 2008-01-29 at 22:12 -0800, Dann Corbit wrote: http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html I assume you mean to have interesting CPU-intensive tasks offloaded to the GPU, rather than a full port... I looked into this and it seems like an innovative plan technically, but most servers running PostgreSQL don't have a GPU. So that makes it more of a personal computer opportunity than a business server one, no? Plus I wouldn't look at it ahead of a debugger being available. If you wrote a CUDA function to perform a useful operation, that might be a good starting place toward a general understanding of how we might use this in the future. PostgreSQL is extensible, so an add-in function might be a useful module on pgFoundry. Maybe we could build in a hook to allow an external-function library to provide sort capability... but I don't think anyone is going to take it seriously for the database core without a greater than normal amount of test results and prototypes. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
The plural seems better to me; there's no such thing as a solitary synchronized scan, no? The whole point of the feature is to affect the behavior of multiple scans. +1. The plural is important IMHO. ok, good. As I stated earlier, I don't really like this argument (we already broke badly designed applications a few times in the past), but we really need a way to guarantee that the execution of a query is stable and doesn't depend on external factors. And the original problem was to guarantee that pg_dump builds a dump as identical as possible to the existing data by ignoring external factors. That's now the case with your patch. The fact that it allows us not to break existing applications relying too much on physical ordering is a nice side effect though :). One more question. It would be possible for a session that turned off synchronized_seqscans to still be a pack leader for other, later sessions. Do/should we consider that? The procedure would be: start from page 0 iff no other pack is present; fill the current scan position for others. Andreas
Re: [HACKERS] MSVC Build error
On Sun, Jan 27, 2008 at 10:04:38PM +0100, Gevik Babakhani wrote: Do you have the dumpbin command available in the path? //Magnus :) yes. This is why I do not understand why the command does not run correctly! Strange indeed. I recognise this problem from earlier, but I thought it had been fixed a long time ago. Can you confirm that you are building 8.3-current and not 8.2 here? Also, looking at the original one it seems that it has dumped *some* files, but not all. Could you modify gendef.pl line 22 to include the filename that it's failing on, to see if that gets you further? //Magnus
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Wed, Jan 30, 2008 at 10:56:47AM +0100, Zeugswetter Andreas ADI SD wrote: The plural seems better to me; there's no such thing as a solitary synchronized scan, no? The whole point of the feature is to affect the behavior of multiple scans. +1. The plural is important IMHO. ok, good. As I stated earlier, I don't really like this argument (we already broke badly designed applications a few times in the past), but we really need a way to guarantee that the execution of a query is stable and doesn't depend on external factors. And the original problem was to guarantee that pg_dump builds a dump as identical as possible to the existing data by ignoring external factors. That's now the case with your patch. The fact that it allows us not to break existing applications relying too much on physical ordering is a nice side effect though :). One more question. It would be possible for a session that turned off synchronized_seqscans to still be a pack leader for other, later sessions. Do/should we consider that? The procedure would be: start from page 0 iff no other pack is present; fill the current scan position for others. I think that allowing other scans to use the scan started by a query that disabled the sync scans would have value. It would prevent these types of queries from completely tanking the I/O. +1 Ken
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Zeugswetter Andreas ADI SD escribió: The plural seems better to me; there's no such thing as a solitary synchronized scan, no? The whole point of the feature is to affect the behavior of multiple scans. +1. The plural is important IMHO. ok, good. Hmm, if you guys are going to add another GUC variable, please hurry because we have to translate the description text. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Mon, 2008-01-28 at 16:21 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Rather than having a boolean GUC, we should have a number and make the parameter synchronised_scan_threshold. This would open up a can of worms I'd prefer not to touch, having to do with whether the buffer-access-strategy behavior should track that or not. As the note in heapam.c says, * If the table is large relative to NBuffers, use a bulk-read access * strategy and enable synchronized scanning (see syncscan.c). Although * the thresholds for these features could be different, we make them the * same so that there are only two behaviors to tune rather than four. It's a bit late in the cycle to be revisiting that choice. Now we do already have three behaviors to worry about (BAS on and syncscan off) but throwing in a randomly settable knob will take it back to four, and we have no idea how that fourth case will behave. The other tack we could take (having the one GUC variable control both thresholds) is not good since it will result in pg_dump trashing the buffer cache. I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. We're trying to guess which data is in memory and which is on disk and then act accordingly. That question cannot be answered solely by how big shared_buffers is. It really ought to be a combination of (at least) shared_buffers and total database size. I think we must either put some more intelligence into the setting of the threshold, or give it to the user as a parameter, possibly as a parameter not mentioned in the sample .conf. If we set the threshold unintelligently or in a way that cannot be overridden, we will still get weird bug reports from people who set shared_buffers higher and got a performance drop.
We need to make a final decision on this quickly, so I'll say no more on this for 8.3 to help that process. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] Truncate Triggers
On Fri, 2008-01-25 at 10:44 -0500, Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: Notes: As the syntax shows, these would be statement-level triggers (only). Requesting row-level triggers will cause an error. [As Chris Browne explained, if people really want, they can use these facilities to create a Before Statement trigger that executes a DELETE, which then fires row-level calls.] Is there a way for a BS trigger to return a flag to skip the statement, as there is for BR? I've got a working version of truncate triggers now; it will be posted to -patches shortly. Answering the above question is the last point of the implementation. ISTM it would be best to think of it as a separate and not-very-related feature, implemented as a separate patch, if we decide we really do want that. It doesn't seem important to do that for replication, which was the main use case for truncate triggers. Currently, BS trigger functions return NULL. This is handled in various ways within each PL and is specifically tested for within the main trigger exec code. Returning different information in some form or other would be required to signal "skip the main statement". FOR EACH ROW triggers return NULL when they want to skip the change for that row, so the current implementation is the wrong way round for BS triggers. I'm not sure how to handle that in a way that makes obvious sense for future trigger developers, so suggestions welcome. So allowing us to skip commands as a result of statement-level triggers is as much work for INSERT, UPDATE, DELETE and TRUNCATE together as it is just for TRUNCATE. I also think that if we did do that for TRUNCATE it would be useful to do for the other commands anyway. The SQL standard doesn't say we *can't* do this. Having said that, some PLs simply ignore the return value from BS triggers. So interpreting return values in new ways might make existing trigger code break or behave differently.
So if we did BS trigger skipping for all statement types then we would need to introduce that concept slowly over a couple of releases, with a non-default, then default, trigger-behaviour parameter. I've written the truncate trigger handling in such a way that it would be straightforward to extend this to include statement skipping, should we do it in the future. Can we just skip statement skipping? -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] autonomous transactions
All, Added to TODO: * Add autonomous transactions http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php IMHO, autonomous transactions should be part of a package with a spec-compliant CREATE PROCEDURE statement. That is, the difference between PROCEDURES and FUNCTIONS would be that: -- PROCs have autonomous transactions -- PROCs have to be executed with CALL, and can't go in a query -- PROCs don't necessarily return a result --Josh Berkus
Re: [HACKERS] autonomous transactions
Josh Berkus escribió: All, Added to TODO: * Add autonomous transactions http://archives.postgresql.org/pgsql-hackers/2008-01/msg00893.php IMHO, autonomous transactions should be part of a package with a spec-compliant CREATE PROCEDURE statement. IMHO we should try to get both things separately, otherwise we will never get either. -- Alvaro Herrera http://www.CommandPrompt.com/ PostgreSQL Replication, Consulting, Custom Development, 24x7 support
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Alvaro Herrera [EMAIL PROTECTED] writes: Hmm, if you guys are going to add another GUC variable, please hurry because we have to translate the description text. Yeah, I'm going to put it in today --- just the on/off switch. Any discussions of exposing threshold parameters will have to wait for 8.4. regards, tom lane
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Simon Riggs [EMAIL PROTECTED] writes: I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. That argument leads immediately to the conclusion that you need per-table control over the behavior. Which maybe you do, but it's far too late to be proposing it for 8.3. We should put this whole area of more-control-over-BAS-and-syncscan on the TODO agenda. regards, tom lane
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
Dann Corbit [EMAIL PROTECTED] writes: http://www.nvidia.com/object/cuda_get.html The license terms here seem to be sufficient reason why not. regards, tom lane
[HACKERS] Will PostgreSQL get ported to CUDA?
2008/1/30 Dann Corbit [EMAIL PROTECTED]: http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html Someone at CMU has tried this, somewhat fruitfully. http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf This was based on GPUSort: http://gamma.cs.unc.edu/GPUSORT/ Unfortunately, the licensing of GPUSort is, if anything, more awful than that for CUDA. http://gamma.cs.unc.edu/GPUSORT/terms.html This would need to get pretty totally reimplemented to be useful with PostgreSQL. Happily, we actually have some evidence that the exercise would be of some value. Further, it looks to me like the implementation that was done was done in a pretty naive way. Something done more seriously would likely be radically better... -- http://linuxfinances.info/info/linuxdistributions.html The definition of insanity is doing the same thing over and over and expecting different results. -- assortedly attributed to Albert Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: One more question. It would be possible for a session that turned off synchronized_seqscans to still be a pack leader for other, later sessions. Do/should we consider that? Seems like a reasonable thing to consider ... for 8.4. I'm not willing to go poking the syncscan code that much at this late point in the 8.3 cycle. (I'm not sure if it's been mentioned yet on -hackers, but the current plan is to freeze 8.3.0 tomorrow evening.) regards, tom lane
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Wed, 2008-01-30 at 13:07 -0500, Tom Lane wrote: Zeugswetter Andreas ADI SD [EMAIL PROTECTED] writes: One more question. It would be possible for a session that turned off synchronized_seqscans to still be a pack leader for other, later sessions. Do/should we consider that? Seems like a reasonable thing to consider ... for 8.4. Definitely. I thought about this the other day and decided it had some strange behaviour in some circumstances, so wouldn't be desirable overall. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. That argument leads immediately to the conclusion that you need per-table control over the behavior. It's even worse than that. Elsewhere in this thread Simon mentioned a partitioned table, where each partition on its own is smaller than the threshold, but you're seq scanning several partitions and the total size of the seq scans is larger than memory size. In that scenario, you would want BAS and synchronized scans, but even a per-table setting wouldn't cut it. One idea would be to look at the access plan and estimate the total size of all scans in the plan. Another idea would be to switch to a more seq scan resistant cache replacement algorithm, and forget about the threshold. For synchronized scans to help in the partitioned situation, I guess you'd want to synchronize across partitions. If someone is already scanning partition 5, you'd want to start from that partition and join the pack, instead of starting from partition 1. I think we'll be wiser after we see some real world use of what we have there now. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
On Wed, 2008-01-30 at 18:42 +0000, Heikki Linnakangas wrote: Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. That argument leads immediately to the conclusion that you need per-table control over the behavior. It's even worse than that. Elsewhere in this thread Simon mentioned a partitioned table, where each partition on its own is smaller than the threshold, but you're seq scanning several partitions and the total size of the seq scans is larger than memory size. In that scenario, you would want BAS and synchronized scans, but even a per-table setting wouldn't cut it. For synchronized scans to help in the partitioned situation, I guess you'd want to synchronize across partitions. If someone is already scanning partition 5, you'd want to start from that partition and join the pack, instead of starting from partition 1. You're right, but in practice it's not quite that bad with the multi-table route. When you have partitions you generally exclude most of them, with typically 1-2 per query, usually different ones. If you were scanning lots of partitions in sequence so frequently that you'd get benefit from synch scans then your partitioning scheme isn't working for you - and that is the worst problem by far. But yes, it does need to be addressed. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
Christopher Browne [EMAIL PROTECTED] writes: This was based on GPUSort: http://gamma.cs.unc.edu/GPUSORT/ I looked briefly at GPUSort a while back. I couldn't see how to shoehorn into Postgres the assumption that you're always sorting floating point numbers. You would have to add some property of some data types that mapped every value to a floating point value. I suppose we could add that as a btree proc entry for specific data types but it would be a pretty radical change. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's Slony Replication support!
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Simon Riggs wrote: On Wed, 2008-01-30 at 18:42 +0000, Heikki Linnakangas wrote: It's even worse than that. Elsewhere in this thread Simon mentioned a partitioned table, where each partition on its own is smaller than the threshold, but you're seq scanning several partitions and the total size of the seq scans is larger than memory size. In that scenario, you would want BAS and synchronized scans, but even a per-table setting wouldn't cut it. For synchronized scans to help in the partitioned situation, I guess you'd want to synchronize across partitions. If someone is already scanning partition 5, you'd want to start from that partition and join the pack, instead of starting from partition 1. You're right, but in practice it's not quite that bad with the multi-table route. When you have partitions you generally exclude most of them, with typically 1-2 per query, usually different ones. Yep. And in that case, you *don't* want BAS or sync scans to kick in, because you're only accessing a relatively small chunk of data, and it's worthwhile to cache it. -- Heikki Linnakangas EnterpriseDB http://www.enterprisedb.com
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
-Original Message- From: [EMAIL PROTECTED] [mailto:pgsql-hackers- [EMAIL PROTECTED] On Behalf Of Christopher Browne Sent: Wednesday, January 30, 2008 9:56 AM To: pgsql-hackers@postgresql.org Subject: [HACKERS] Will PostgreSQL get ported to CUDA? 2008/1/30 Dann Corbit [EMAIL PROTECTED]: http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html Someone at CMU has tried this, somewhat fruitfully. http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf This was based on GPUSort: http://gamma.cs.unc.edu/GPUSORT/ Unfortunately, the licensing of GPUSort is, if anything, more awful than that for CUDA. http://gamma.cs.unc.edu/GPUSORT/terms.html This would need to get pretty totally reimplemented to be useful with PostgreSQL. Happily, we actually have some evidence that the exercise would be of some value. Further, it looks to me like the implementation that was done was done in a pretty naive way. Something done more seriously would likely be radically better... It's too bad that they have a restrictive license. Perhaps there is an opportunity to create an information appliance that contains a special build of PostgreSQL, a nice heap of super-speedy disk, and a big pile of GPUs for sort and merge type operations. The thing that seems nice to me about this idea is that you would have a very stable test platform (all hardware and software combinations would be known and thoroughly tested) and you might also get some extreme performance. I guess that a better sort than GPUSort could be written from scratch, but legal entanglements with the use of the graphics cards may make the whole concept DOA.
Re: [HACKERS] 8.3RC1 on windows missing descriptive Event handle names
Stephen Denne wrote: I said... On Windows XP, using Process Explorer with the lower pane showing Handles, not all postgres.exe processes are including an Event type with a description of what the process is doing. I've had difficulty reproducing this, but I now suspect that it is only happening when running both v8.2 and v8.3rc1 at once, and I think it is the second started that is missing the process descriptions. That makes sense, really - I think you nailed it. We create a global event, and for those that are duplicates, it won't show up in the second process. I think the solution to this is to add the process id to the name of the event. So instead of: pgident: postgres: autovacuum launcher process We'd have pgident(12345): postgres: autovacuum launcher process Seems reasonable? I'll be able to write up and properly test a patch tomorrow. //Magnus
[HACKERS] Oops - BF:Mastodon just died
http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00 /D
Re: [HACKERS] Oops - BF:Mastodon just died
Dave Page wrote: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00 Maybe I shouldn't have had those beers after work today, but that looks like it's for example failing tsearch2, which hasn't been touched for over a month! Any chance there's something dodgy in the build env? (If I'm missing the obvious, I blame the beer!) //Magnus
Re: [HACKERS] Oops - BF:Mastodon just died
Dave Page wrote: On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote: Dave Page wrote: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00 Maybe I shouldn't have had those beers after work today, but that looks like it's for example failing tsearch2, which hasn't been touched for over a month! Any chance there's something dodgy in the build env? I can't remember the last time I logged into that box so if it's something in the buildenv, it's either caused by a Windows update, or some failing hardware. I won't have access to my MSVC box until tomorrow, but unless beaten to it I can dig into it a bit more. I don't see anything obvious in the latest patches though (but again, that could be the beer :-P). Any chance you could just do a forced run on it now to show if it was some kind of transient stuff? //Magnus
Re: [HACKERS] Oops - BF:Mastodon just died
On Jan 30, 2008 9:21 PM, Magnus Hagander [EMAIL PROTECTED] wrote: I won't have access to my MSVC box until tomorrow, but unless beaten to it I can dig into it a bit more. I don't see anything obvious in the latest patches though (but again, that could be the beer :-P). Any chance you could just do a forced run on it now to show if it was some kind of transient stuff? Not from here. :-( /D
Re: [HACKERS] Oops - BF:Mastodon just died
On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote: Dave Page wrote: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00 Maybe I shouldn't have had those beers after work today, but that looks like it's for example failing tsearch2, which hasn't been touched for over a month! Any chance there's something dodgy in the build env? I can't remember the last time I logged into that box so if it's something in the buildenv, it's either caused by a Windows update, or some failing hardware. /D
Re: [HACKERS] Oops - BF:Mastodon just died
Dave Page wrote: On Jan 30, 2008 9:13 PM, Magnus Hagander [EMAIL PROTECTED] wrote: Dave Page wrote: http://www.pgbuildfarm.org/cgi-bin/show_log.pl?nm=mastodon&dt=2008-01-30%2020:00:00 Maybe I shouldn't have had those beers after work today, but that looks like it's for example failing tsearch2, which hasn't been touched for over a month! Any chance there's something dodgy in the build env? I can't remember the last time I logged into that box so if it's something in the buildenv, it's either caused by a Windows update, or some failing hardware. None of the CVS changes in the relevant period seems to have any relation to the errors, so I suspect a local problem. red_bat is due to build in a couple of hours, so we will soon see if it reproduces the error. cheers andrew
Re: [HACKERS] GSSAPI doesn't play nice with non-canonical host names
Magnus Hagander [EMAIL PROTECTED] writes: On Sun, Jan 27, 2008 at 09:32:54PM -0500, Stephen Frost wrote: While I'm complaining: that's got to be one of the least useful error messages I've ever seen, and it's for a case that's surely going to be fairly common in practice. Can't we persuade GSSAPI to produce something more user-friendly? At least convert 7 to Server not found in Kerberos database? I agree, and have found it to be very frustrating while working w/ Kerberos in general. I *think* there's a library which can convert those error-codes (libcomm-err?), but I've not really looked into it yet. AFAIK, that one is for Kerberos only. For GSSAPI, we already use the gss_display_status function to get the error messages. I think the problem here is in the Kerberos library? Yeah, I found it: https://bugzilla.redhat.com/show_bug.cgi?id=430983 The best fix is not entirely clear, but in any case it's not our bug. regards, tom lane
Re: [HACKERS] Will PostgreSQL get ported to CUDA?
On Wed, 2008-01-30 at 17:55 +0000, Christopher Browne wrote: 2008/1/30 Dann Corbit [EMAIL PROTECTED]: http://www.scientificcomputing.com/ShowPR~PUBCODE~030~ACCT~300100~ISSUE~0801~RELTYPE~HPCC~PRODCODE~~PRODLETT~C.html http://www.nvidia.com/object/cuda_learn.html http://www.nvidia.com/object/cuda_get.html Someone at CMU has tried this, somewhat fruitfully. http://www.andrew.cmu.edu/user/ngm/15-823/project/Draft.pdf http://www.andrew.cmu.edu/user/ngm/15-823/project/Final.pdf Well done that man! Excellent piece of research. Clearly GPUsort is cool; is it cool enough? Here's a few thoughts and questions that we still need answers to: The concept of CPU offload can be generalised to any specialised hardware. Can we offload such tasks easily? If so, to what? Should it be a GPU, or just another more general CPU? Is the cost and difficulty of making the GPU work in generalised form better than spending that money on more resources e.g. memory? Can the sorting network really be reused in the general case, or must we realistically recreate it for each new sort set? Can we have multiple concurrent sorts on the GPU, or is it one user at a time? Would we need multiple GPUs? Is such an architecture available? I note that the comparison with HeapSort is worst case, since we sort 1 GB of memory without increasing work_mem beyond 1MB. There doesn't seem to be a discussion of how GPUsort would handle sorts too large to fit within the GPU, so we would need to have an external sort mechanism. So if qsort is better for smaller inputs and external sorts are needed for larger, then there seems to be a narrow-ish middle band of benefit. Another thought would be to replace external heap sort with an external sort based around qsort or GPUsort, which would extend the range of usefulness. But then we're back to redesigning external sorts. -- Simon Riggs 2ndQuadrant http://www.2ndQuadrant.com
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Mon, Jan 28, 2008 at 11:14:05PM +, Christopher Browne wrote: On Dec 6, 2007 6:28 PM, Decibel! [EMAIL PROTECTED] wrote: FWIW, I've never seen anything but a performance increase or no change when going from 10 to 100. In most cases there's a noticeable improvement since it's common to have over 100k rows in a table, and there's just no way to capture any kind of a real picture of that with only 10 buckets. I'd be more inclined to try to do something that was at least somewhat data aware. The interesting theory that I'd like to verify if I had a chance would be to run through a by-column tuning using a set of heuristics. My first order approximation would be: - If a column defines a unique key, then we know there will be no clustering of values, so no need to increase the count... - If a column contains a datestamp, then the distribution of values is likely to be temporal, so no need to increase the count... - If a column has a highly constricted set of values (e.g. - boolean), then we might *decrease* the count. - We might run a query that runs across the table, looking at frequencies of values, and if it finds a lot of repeated values, we'd increase the count. That's a bit hand-wavy, but that could lead to both increases and decreases in the histogram sizes. Given that, we can expect the overall stat sizes to not forcibly need to grow *enormously*, because we can hope for there to be cases of shrinkage. I think that before doing any of that you'd be much better off investigating how much performance penalty there is for maxing out default_statistics_target. If, as I suspect, it's essentially 0 on modern hardware, then I don't think it's worth any more effort. BTW, that investigation wouldn't just be academic either; if we could convince ourselves that there normally wasn't any cost associated with a high default_statistics_target, we could increase the default, which would reduce the amount of traffic we'd see on -performance about bad query plans. 
-- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 pgpJmmXmUl3KN.pgp Description: PGP signature
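Browne's four heuristics could be prototyped outside the backend before touching ANALYZE. The sketch below is hypothetical: the inputs are assumed to come from catalog lookups plus a frequency-sampling query, and every threshold is invented for illustration.

```python
def suggest_stats_target(is_unique, is_timestamp, n_distinct, default=10):
    """Toy version of the per-column heuristics above.

    n_distinct is assumed to come from a sampling query; the thresholds
    (2, 100, the 10x bump) are invented for illustration only.
    """
    if is_unique:
        return default                 # unique key: no clustering of values
    if is_timestamp:
        return default                 # temporal distribution: default suffices
    if n_distinct is not None and n_distinct <= 2:
        return max(2, default // 2)    # near-boolean: *decrease* the count
    if n_distinct is not None and n_distinct < 100:
        return default * 10            # lots of repeated values: increase it
    return default
```

Because some columns shrink while others grow, the overall statistics size need not balloon, which is the point of the hand-wavy argument above.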
Re: [HACKERS] [PATCHES] Better default_statistics_target
Decibel! [EMAIL PROTECTED] writes: I think that before doing any of that you'd be much better off investigating how much performance penalty there is for maxing out default_statistics_target. If, as I suspect, it's essentially 0 on modern hardware, then I don't think it's worth any more effort. That's not my experience. Even just raising it to 100 multiplies the number of rows ANALYZE has to read by 10. And the arrays for every column become ten times larger. Eventually they start being toasted... BTW, that investigation wouldn't just be academic either; if we could convince ourselves that there normally wasn't any cost associated with a high default_statistics_target, we could increase the default, which would reduce the amount of traffic we'd see on -performance about bad query plans. I suspect we could raise it, we just don't know by how much. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com Ask me about EnterpriseDB's 24x7 Postgres support! ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
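Stark's 10x figure follows from ANALYZE's sampling rule: the row sampler targets roughly 300 × statistics_target rows per table (the multiplier used by PostgreSQL's std_typanalyze), so the sample grows linearly with the target:

```python
def analyze_sample_rows(stats_target):
    # ANALYZE samples roughly 300 * statistics_target rows per table.
    return 300 * stats_target
```

So a target of 10 means about 3,000 sampled rows and a target of 100 about 30,000 — the tenfold increase above — with per-column MCV lists and histograms growing in proportion until they start being toasted.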
Re: [HACKERS] Oops - BF:Mastodon just died
Andrew Dunstan [EMAIL PROTECTED] writes: None of the CVS changes in the relevant period seems to have any relation to the errors, so I suspect a local problem. skylark and baiji are now red too, so I guess that theory is dead in the water. Something in today's changes broke the MSVC build, but what? I diffed yesterday's and today's make logs from skylark, and found nothing interesting except this: *** *** 605,611 Generate DEF file^M Generating POSTGRES.DEF from directory Release\postgres^M \ ..\ .^M ! Generated 5208 symbols^M Linking...^M Creating library Release\postgres\postgres.lib and object Release\postgres\postgres.exp^M Embedding manifest...^M --- 605,611 Generate DEF file^M Generating POSTGRES.DEF from directory Release\postgres^M \ ..\ .^M ! Generated 5205 symbols^M Linking...^M Creating library Release\postgres\postgres.lib and object Release\postgres\postgres.exp^M Embedding manifest...^M *** Presumably the three missing symbols include the two that are being complained of later, but what the heck? (Hmm, actually today's commits should have added two global symbols to the backend, so it seems there are five not three symbols to be accounted for.) It is probably significant that both of the known missing symbols come from guc.c, which we added another variable to today. I have a sickening feeling that we have hit some kind of undocumented internal limit in MSVC as to the number of symbols imported/exported by one source file... regards, tom lane ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Jan 31, 2008 12:08 AM, Gregory Stark [EMAIL PROTECTED] wrote: Decibel! [EMAIL PROTECTED] writes: I think that before doing any of that you'd be much better off investigating how much performance penalty there is for maxing out default_statistics_target. If, as I suspect, it's essentially 0 on modern hardware, then I don't think it's worth any more effort. That's not my experience. Even just raising it to 100 multiplies the number of rows ANALYZE has to read by 10. And the arrays for every column become ten times larger. Eventually they start being toasted... +1. From the tests I did on our new server, I set the default_statistics_target to 30. Those tests were mainly based on the ANALYZE time though, not the planner overhead introduced by larger statistics - with higher values, I considered the ANALYZE time too high for the benefits. I set it higher on a per-column basis only if I see it can lead to better stats, but from all the tests I did so far, it was sufficient for our data set. -- Guillaume ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Better default_statistics_target
Guillaume Smet [EMAIL PROTECTED] writes: On Jan 31, 2008 12:08 AM, Gregory Stark [EMAIL PROTECTED] wrote: That's not my experience. Even just raising it to 100 multiplies the number of rows ANALYZE has to read by 10. And the arrays for every column become ten times larger. Eventually they start being toasted... +1. From the tests I did on our new server, I set the default_statistics_target to 30. Those tests were mainly based on the ANALYZE time though, not the planner overhead introduced by larger statistics - with higher values, I considered the ANALYZE time too high for the benefits. eqjoinsel(), for one, is O(N^2) in the number of MCV values kept. Possibly this could be improved, but in general I'd be real wary of pushing the default to the moon without some explicit testing of the impact on planning time. regards, tom lane ---(end of broadcast)--- TIP 7: You can help support the PostgreSQL project by donating at http://www.postgresql.org/about/donate
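The O(N^2) cost Tom mentions comes from eqjoinsel comparing every most-common value (MCV) of one join side against every MCV of the other. A much-simplified sketch — the real function also handles nulls and the non-MCV remainder, both omitted here:

```python
def mcv_join_selectivity(mcv1, mcv2):
    """Simplified MCV-vs-MCV part of equality-join selectivity.

    mcv1/mcv2: lists of (value, frequency) pairs for each join column.
    The nested loop is the O(N^2) cost in the number of MCVs kept.
    """
    matched = 0.0
    comparisons = 0
    for v1, f1 in mcv1:
        for v2, f2 in mcv2:
            comparisons += 1
            if v1 == v2:
                matched += f1 * f2   # joint frequency of matching values
    return matched, comparisons
```

With statistics_target = 1000 on both columns that is up to a million equality comparisons per join clause, which is why planning time needs explicit testing before the default is pushed up.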
Re: [HACKERS] Oops - BF:Mastodon just died
I wrote: I diffed yesterday's and today's make logs from skylark, and found nothing interesting except this: *** *** 605,611 Generating POSTGRES.DEF from directory Release\postgres^M ! Generated 5208 symbols^M Linking...^M --- 605,611 Generating POSTGRES.DEF from directory Release\postgres^M ! Generated 5205 symbols^M Linking...^M *** Looking at this a bit closer, I realize that it's coming from gendef.pl's dumpbin usage of recent infamy. So there are a couple of ideas that come to mind: * Has the buildfarm script changed recently in a way that might change the execution PATH and thereby suck in a different version of dumpbin? (Or even a different version of Perl?) * Is it conceivable that dumpbin's output format has changed in a way that confuses the bit of Perl code that's parsing it? One idea that comes to mind is that it contains a timestamp that just got wider --- I remember seeing some bugs like that when the value of Unix time_t reached 1 billion and became 9 instead of 8 digits. Neither of these sound very plausible, but it seems the next step for investigation is to look closely at what's happening in gendef.pl. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Oops - BF:Mastodon just died
Dave Page [EMAIL PROTECTED] writes: I can't remember the last time I logged into that box so if it's something in the buildenv, it's either caused by a Windows update, Re-reading the thread ... could that last point be significant? Are all four of these boxen set to auto-accept updates from Redmond? regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Oops - BF:Mastodon just died
Tom Lane wrote: * Has the buildfarm script changed recently in a way that might change the execution PATH and thereby suck in a different version of dumpbin? (Or even a different version of Perl?) No. In at least the case of red_bat nothing has changed for months. * Is it conceivable that dumpbin's output format has changed in a way that confuses the bit of Perl code that's parsing it? One idea that comes to mind is that it contains a timestamp that just got wider --- I remember seeing some bugs like that when the value of Unix time_t reached 1 billion and became 9 instead of 8 digits. Neither of these sound very plausible, but it seems the next step for investigation is to look closely at what's happening in gendef.pl. Right. I agree that your diff makes gendef.pl the prime suspect. You also just said: Dave Page [EMAIL PROTECTED] writes: I can't remember the last time I logged into that box so if it's something in the buildenv, it's either caused by a Windows update, Re-reading the thread ... could that last point be significant? Are all four of these boxen set to auto-accept updates from Redmond? No. red_bat does not auto-accept anything. cheers andrew ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. That argument leads immediately to the conclusion that you need per-table control over the behavior. Which maybe you do, but it's far too late to be proposing it for 8.3. We should put this whole area of more-control-over-BAS-and-syncscan on the TODO agenda. Another question --- why don't we just turn off synchronized_seqscans when we do COPY TO? That would fix pg_dump and be transparent. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Bruce Momjian wrote: Tom Lane wrote: Simon Riggs [EMAIL PROTECTED] writes: I'm still not very happy with any of the options here. BAS is great if you didn't want to trash the cache, but it's also annoying to people that really did want to load a large table into cache. However we set it, we're going to have problems because not everybody has the same database. That argument leads immediately to the conclusion that you need per-table control over the behavior. Which maybe you do, but it's far too late to be proposing it for 8.3. We should put this whole area of more-control-over-BAS-and-syncscan on the TODO agenda. Another question --- why don't we just turn off synchronized_seqscans when we do COPY TO? That would fix pg_dump and be transparent. Sorry, I was unclear. I meant don't have a GUC at all but just set an internal variable to turn off synchronized sequential scans when we do COPY TO. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 5: don't forget to increase your free space map settings
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Bruce Momjian [EMAIL PROTECTED] writes: Another question --- why don't we just turn off synchronized_seqscans when we do COPY TO? That would fix pg_dump and be transparent. Enforcing this from the server side seems a pretty bad idea. Note that there were squawks about having pg_dump behave this way at all; if the control is on the pg_dump side then at least we have the chance to make it a user option later. Also, you forgot about pg_dump -d. regards, tom lane ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Jan 30, 2008 5:58 PM, Decibel! [EMAIL PROTECTED] wrote: On Mon, Jan 28, 2008 at 11:14:05PM +, Christopher Browne wrote: On Dec 6, 2007 6:28 PM, Decibel! [EMAIL PROTECTED] wrote: FWIW, I've never seen anything but a performance increase or no change when going from 10 to 100. In most cases there's a noticeable improvement since it's common to have over 100k rows in a table, and there's just no way to capture any kind of a real picture of that with only 10 buckets. I'd be more inclined to try to do something that was at least somewhat data aware. The interesting theory that I'd like to verify if I had a chance would be to run through a by-column tuning using a set of heuristics. My first order approximation would be: - If a column defines a unique key, then we know there will be no clustering of values, so no need to increase the count... - If a column contains a datestamp, then the distribution of values is likely to be temporal, so no need to increase the count... - If a column has a highly constricted set of values (e.g. - boolean), then we might *decrease* the count. - We might run a query that runs across the table, looking at frequencies of values, and if it finds a lot of repeated values, we'd increase the count. That's a bit hand-wavy, but that could lead to both increases and decreases in the histogram sizes. Given that, we can expect the overall stat sizes to not forcibly need to grow *enormously*, because we can hope for there to be cases of shrinkage. I think that before doing any of that you'd be much better off investigating how much performance penalty there is for maxing out default_statistics_target. If, as I suspect, it's essentially 0 on modern hardware, then I don't think it's worth any more effort. 
BTW, that investigation wouldn't just be academic either; if we could convince ourselves that there normally wasn't any cost associated with a high default_statistics_target, we could increase the default, which would reduce the amount of traffic we'd see on -performance about bad query plans. There seems to be *plenty* of evidence out there that the performance penalty would NOT be essentially zero. Tom points out: eqjoinsel(), for one, is O(N^2) in the number of MCV values kept. It seems to me that there are cases where we can *REDUCE* the histogram width, and if we do that, and then pick and choose the columns where the width increases, the performance penalty may be yea, verily *actually* 0. This fits somewhat with Simon Riggs' discussion earlier in the month about Segment Exclusion; these both represent cases where it is quite likely that there is emergent data in our tables that can help us to better optimize our queries. -- http://linuxfinances.info/info/linuxdistributions.html The definition of insanity is doing the same thing over and over and expecting different results. -- assortedly attributed to Albert Einstein, Benjamin Franklin, Rita Mae Brown, and Rudyard Kipling ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Tom Lane wrote: Bruce Momjian [EMAIL PROTECTED] writes: Another question --- why don't we just turn off synchronized_seqscans when we do COPY TO? That would fix pg_dump and be transparent. Enforcing this from the server side seems a pretty bad idea. Note that there were squawks about having pg_dump behave this way at all; if the control is on the pg_dump side then at least we have the chance to make it a user option later. Also, you forgot about pg_dump -d. OK, but keep in mind if we use synchronized_seqscans in pg_dump we will have to recognize that GUC forever. -- Bruce Momjian [EMAIL PROTECTED] http://momjian.us EnterpriseDB http://postgres.enterprisedb.com + If your life is a hard drive, Christ can be your backup. + ---(end of broadcast)--- TIP 1: if posting/reading through Usenet, please send an appropriate subscribe-nomail command to [EMAIL PROTECTED] so that your message can get through to the mailing list cleanly
Re: [HACKERS] Oops - BF:Mastodon just died
Tom Lane wrote: Neither of these sound very plausible, but it seems the next step for investigation is to look closely at what's happening in gendef.pl. Yes, I have found the problem. It is this line, which I am amazed hasn't bitten us before: next unless /^\d/; The first field in the dumpbin output looks like a 3-digit hex number. The line on my system for GetConfigOptionByName starts with 'A02', which of course fails the test above. For now I'm going to try to fix it by changing it to: next unless $pieces[0] =~ /^[A-F0-9]{3}$/; I also propose to have the gendef.pl script save the dumpbin output so this sort of problem will be easier to debug. cheers andrew ---(end of broadcast)--- TIP 9: In versions below 8.0, the planner will ignore your desire to choose an index scan if your joining column's datatypes do not match
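The failure mode is easy to reproduce outside MSVC. This reconstruction (in Python for illustration — gendef.pl itself is Perl) shows why the old digit test silently drops symbols once dumpbin's three-digit hex index crosses from 9FF into A00:

```python
import re

def keep_symbol_old(first_field):
    # gendef.pl's original filter: keep lines whose first field starts
    # with a decimal digit -- fine for 000..9FF, wrong from A00 on.
    return bool(re.match(r"^\d", first_field))

def keep_symbol_new(first_field):
    # Andrew's fix: accept any 3-digit uppercase-hex index.
    return bool(re.match(r"^[A-F0-9]{3}$", first_field))
```

'9FF' passes both filters, but 'A02' (the index for GetConfigOptionByName above) fails the old one — consistent with the symbols that went missing from POSTGRES.DEF.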
Re: [HACKERS] Oops - BF:Mastodon just died
Andrew Dunstan [EMAIL PROTECTED] writes: Yes, I have found the problem. It is this line, which I am amazed hasn't bitten us before: next unless /^\d/; The first field in the dumpbin output looks like a 3-digit hex number. Argh, so it was crossing a power-of-2 boundary that got us. Good catch. For now I'm going to try to fix it by changing it to: next unless $pieces[0] =~ /^[A-F0-9]{3}$/; Check. I also propose to have the gendef.pl script save the dumpbin output so this sort of problem will be easier to debug. Agreed, but I suggest waiting till 8.4 is branched unless you are really sure about this addition. We freeze for 8.3.0 in less than 24 hours. regards, tom lane ---(end of broadcast)--- TIP 3: Have you checked our extensive FAQ? http://www.postgresql.org/docs/faq
Re: [HACKERS] [PATCHES] Proposed patch: synchronized_scanning GUC variable
Bruce Momjian [EMAIL PROTECTED] writes: OK, but keep in mind if we use synchronized_seqscans in pg_dump we will have to recognize that GUC forever. No, because it's being used on the query side, not in the emitted dump. We have *never* promised that pg_dump version N could dump from server version N+1; in fact, personally I'd like to make that case be a hard error, rather than something people could override with -i. regards, tom lane ---(end of broadcast)--- TIP 6: explain analyze is your friend
Re: [HACKERS] Truncate Triggers
On Mon, Jan 28, 2008 at 09:09:13PM -0300, Alvaro Herrera wrote: Decibel! wrote: On Fri, Jan 25, 2008 at 11:40:19AM +, Simon Riggs wrote: (for 8.4 ...) I'd like to introduce triggers that fire when we issue a truncate: Rather than focusing exclusively on TRUNCATE, how about triggers that fire whenever any kind of DDL operation is performed? (Ok, truncate is more DML than DDL, but still). I don't think it makes sense in general. For example, would we fire triggers on CLUSTER? Or on ALTER TABLE / SET STATISTICS? CLUSTER isn't DDL. Most forms of ALTER TABLE are. And CREATE blah, etc. My point is that people have been asking for triggers that fire when specific commands are executed for a long time; it would be short-sighted to come up with a solution that only works for TRUNCATE if we could instead come up with a more generic solution that works for a broader class of (or perhaps all) commands. -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 pgph70AwSi4cQ.pgp Description: PGP signature
Re: [HACKERS] [PATCHES] Better default_statistics_target
On Wed, Jan 30, 2008 at 09:13:37PM -0500, Christopher Browne wrote: There seems to be *plenty* of evidence out there that the performance penalty would NOT be essentially zero. Tom points out: eqjoinsel(), for one, is O(N^2) in the number of MCV values kept. It seems to me that there are cases where we can *REDUCE* the histogram width, and if we do that, and then pick and choose the columns where the width increases, the performance penalty may be yea, verily *actually* 0. This fits somewhat with Simon Riggs' discussion earlier in the month about Segment Exclusion; these both represent cases where it is quite likely that there is emergent data in our tables that can help us to better optimize our queries. This is all still hand-waving until someone actually measures what the impact of the stats target is on planner time. I would suggest actually measuring that before trying to invent more machinery. Besides, I think you'll need that data for the machinery to make an intelligent decision anyway... BTW, with autovacuum I don't really see why we should care about how long analyze takes, though perhaps it should have a throttle a la vacuum_cost_delay. -- Decibel!, aka Jim C. Nasby, Database Architect [EMAIL PROTECTED] Give your computer some brain candy! www.distributed.net Team #1828 pgpbkSuadbMUY.pgp Description: PGP signature
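The planner-time measurement asked for above doesn't require new machinery: a small timing harness around bare EXPLAIN at each candidate target would settle it. A sketch — run_sql here is a hypothetical stand-in for whatever executes SQL against the test database (e.g. a psycopg cursor's execute):

```python
import time

def time_planning(run_sql, query, targets, trials=50):
    """For each statistics target: re-ANALYZE, then average the wall time
    of plan-only EXPLAIN runs of the query.  run_sql is any callable
    that executes one SQL string against the test database."""
    results = {}
    for target in targets:
        run_sql(f"SET default_statistics_target = {target}")
        run_sql("ANALYZE")  # rebuild stats at the new target
        start = time.perf_counter()
        for _ in range(trials):
            run_sql(f"EXPLAIN {query}")  # plans but never executes
        results[target] = (time.perf_counter() - start) / trials
    return results
```

Comparing the per-EXPLAIN averages at, say, targets 10, 100, and 1000 on a many-join query would give exactly the data this thread keeps asking for.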