Re: [HACKERS] Makefiles don't seem to remember to rebuild everything anymore
On Sun, Dec 16, 2012 at 12:23 AM, Pavan Deolasee pavan.deola...@gmail.com wrote:

"When a file is secondary, make will not create the file merely because it does not already exist, but make does not automatically delete the file." (ftp://ftp.gnu.org/old-gnu/Manuals/make-3.79.1/html_chapter/make_10.html#SEC97)

And here is another relevant bit from http://www.gnu.org/software/make/manual/html_node/Special-Targets.html:

".SECONDARY with no prerequisites causes all targets to be treated as secondary (i.e., no target is removed because it is considered intermediate)."

Reading that along with the other comment explains what we are seeing with a .SECONDARY without any prerequisites.

BTW, I also tried with a very simple Makefile to rule out any role that implicit rules might be playing. If I create a Makefile like the one below:

final: sfinal
	cp sfinal final
sfinal: qfinal
	cp qfinal sfinal
qfinal:
	touch qfinal

If I build fully and then remove the file sfinal, a subsequent make will recreate that file. But if I add a .SECONDARY: target without any prerequisites, sfinal will not be recreated.

Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
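The behaviour reported above is easy to reproduce mechanically. Here is a small Python driver around GNU make (the helper name `rebuilds_sfinal` is mine, not from the thread) that builds Pavan's three-target chain in a temporary directory, deletes the intermediate sfinal, reruns make, and reports whether make recreated it:

```python
import os
import subprocess
import tempfile

# Pavan's Makefile; recipe lines must start with a tab.
MAKEFILE = (
    "final: sfinal\n"
    "\tcp sfinal final\n"
    "sfinal: qfinal\n"
    "\tcp qfinal sfinal\n"
    "qfinal:\n"
    "\ttouch qfinal\n"
)

def rebuilds_sfinal(use_secondary):
    """Build fully, delete the intermediate 'sfinal', rerun make, and
    report whether make recreated it."""
    with tempfile.TemporaryDirectory() as d:
        prefix = ".SECONDARY:\n" if use_secondary else ""
        with open(os.path.join(d, "Makefile"), "w") as f:
            f.write(prefix + MAKEFILE)
        subprocess.run(["make"], cwd=d, check=True, capture_output=True)
        os.remove(os.path.join(d, "sfinal"))
        subprocess.run(["make"], cwd=d, check=True, capture_output=True)
        return os.path.exists(os.path.join(d, "sfinal"))
```

Without the .SECONDARY: line, the second make recreates sfinal; with it, make treats the missing sfinal as an intermediate file, finds final up to date relative to qfinal, and leaves sfinal absent, exactly as described above.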
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
Hello

I understand the motivation, but the proposed syntax is not very intuitive or robust. Can you do it in one function and run the imports only on the first call?

Regards Pavel

2012/12/16 Peter Eisentraut pete...@gmx.net:

I'm going to use PL/Python as an example, but I would also like to know if this could be applicable to other languages. When you do

CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ source code here $$;

it internally creates a source file that contains

---
def __plpython_procedure_foo_12345():
    source code here
---

It would be useful to be able to do something like this instead:

---
some code here

def __plpython_procedure_foo_12345():
    some more code here
---

This would especially be useful for placing imports into the first part. While you can have them in the function definition, that means they are executed every time the function is called, which makes it much slower. Also, future imports are not possible this way.

CREATE FUNCTION already supports multiple AS items. Currently, multiple AS items are rejected for all languages but C. I'd imagine lifting that restriction and leaving it up to the validator to check it. Then any language can accept two AS items if it wants and paste them together in whichever way it needs. (The probin/prosrc naming will then become more obsolete, but it's perhaps not worth changing anything about that.)

So in practice this might look like this:

CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$
import x
import y
$$, $$
real code here
$$;

Comments?
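To make the pasting concrete: a rough Python sketch of what a two-AS-item compiler could do — paste the first item in front of the generated function definition, so anything in the prelude (imports included) runs once at compile time rather than on every call. The helper name and the wrapper function name are illustrative only, not PostgreSQL's actual internals:

```python
import textwrap

def compile_two_part_function(prelude, body,
                              funcname="__plpython_procedure_foo_12345"):
    """Paste a module-level prelude in front of the generated function
    definition, loosely mimicking the PL/Python wrapping described above."""
    source = (
        prelude.rstrip() + "\n"
        + "def " + funcname + "():\n"
        + textwrap.indent(body.rstrip() + "\n", "    ")
    )
    namespace = {}
    # The prelude executes exactly once, here -- not on each call.
    exec(compile(source, "<plpython>", "exec"), namespace)
    return namespace[funcname]

# The prelude's import is visible inside the function body:
sqrt16 = compile_two_part_function("import math", "return math.sqrt(16)")
```

Calling `sqrt16()` repeatedly never re-executes the `import math` line, which is the per-call cost the proposal is trying to avoid.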
Re: [HACKERS] Set visibility map bit after HOT prune
On 16 December 2012 07:53, Pavan Deolasee pavan.deola...@gmail.com wrote:

On Sun, Dec 16, 2012 at 3:18 AM, Tom Lane t...@sss.pgh.pa.us wrote:

Simon Riggs si...@2ndquadrant.com writes: Doing that only makes sense when we're running a SELECT. Setting the all-visible bit immediately prior to an UPDATE that clears it again is pointless effort, generating extra work for no reason.

On the other hand, the HOT prune operation itself is worthless when we're running a SELECT. The only reason we do it that way is that we have to prune before the query starts to use the page, else pruning might invalidate pointers-to-tuples that are being held within the query plan tree.

Right. HOT prune may, and often would, be called in the SELECT path or in a heap/index scan leading to UPDATE/DELETE. But whenever it's called, it looks like a good idea to set the visibility map bit.

As explained above, I disagree that it looks like a good idea, and you've shown no evidence it would be or is true. I agree with Tom that cleaning on SELECT can be worthless and I'd definitely like to be able to turn it off conditionally or permanently.

There is some additional overhead to check if a LIVE tuple is all-visible or not, but that doesn't look like much. I did run some pgbench tests for fully cached tables and did not see any difference in tps. Another idea could have been to NOT clear the visibility bit when a HOT update happens. Such a tuple can get pruned by HOT prune, so we don't need vacuum per se, and index-only scans are not affected because the update was a HOT update, so the index keys did not change either. So index-only scans would continue to return the same result. Don't know if this would work with hot standby, probably not.

Maybe it's time to look at what it'd take for the low-level scan operations to know whether they're scanning the target relation of an UPDATE query, so that we could skip pruning altogether except when a HOT update could conceivably ensue.

Yeah, we could do that.
We may not be able to predict whether the coming update is HOT or not, but I don't think that matters.

We can predict that an update is HOT in advance, if none of the index columns are touched in the UPDATE. Yes, there are some cases where it might not be, but we could probably store that in the statement cache. Making that check at run time must cost some block contention, so it would be good to remove it from every update.

OTOH it's probably not too bad to prune in any scan (like we do today) because there is a fairly high chance that the page will be dirtied for hint bit updates. So maybe it's better to just prune as well. We have already put in several optimisations to do so only when required and without any unnecessary contention. Of course, benchmarks can prove me wrong.

I think we could use some measurement/stats there so we can check. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 12/14/2012 3:20 PM, Robert Haas wrote:

On Fri, Dec 14, 2012 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: ... In more than ten years of working with PostgreSQL, I've never encountered a case where the restriction at issue here prevented a bug. It's only annoyed me and broken my application code (when moving from PostgreSQL 8.2 to PostgreSQL 8.3, never mind any other database!).

There are quite a few examples in our archives of application bugs that would have been prevented, or later were prevented, by the 8.3 changes that reduced the system's willingness to apply implicit casts to text. I recall for instance cases where people got wrong/unexpected answers because the system was sorting what-they-thought-were-timestamp values textually. So I find such sweeping claims to be demonstrably false, and I'm suspicious of behavioral changes that are proposed with such arguments backing them.

I think you're mixing apples and oranges. The whole point of the patch on the table - which, if you recall, was designed originally by you and merely implemented by me - was to change the behavior only in the cases where there's only one function with the appropriate name and argument count. The ambiguous cases that 8.3+ helpfully prevent are those where overloading is in use and the choice of which function to call is somewhat arbitrary and perhaps incorrectly foreseen by the user. Those changes also have the side effect of preventing a straightforward function call from working without casts even in cases where no overloading is in use - and making that case work is completely different from making the ambiguous case arbitrarily pick one of the available answers.

FWIW, I for one thought that casting more liberally in the case at hand, where there is only one function with that name and number of arguments, would be a good thing.
My only concern with the patch presented was that changing make_fn_arguments() in that way may have some unwanted side effects, because it is called from different locations and the usage of COERCION_IMPLICIT may actually guard against something that we don't want to allow. I don't have any evidence that it does, just the concern that it may. Perhaps make_fn_arguments() needs to receive that coercion context as an argument, and the caller hands in COERCION_ASSIGNMENT only in the case at hand?

Jan -- Anyone who trades liberty for security deserves neither liberty nor security. -- Benjamin Franklin
Re: [HACKERS] Fix fmgroids.h not regenerated after clean (not clean dist) on Windows
On Wed, Dec 12, 2012 at 8:07 AM, Craig Ringer cr...@2ndquadrant.com wrote:

Hi all

There's an issue with MSVC builds on Windows where clean.bat deletes src\include\utils\fmgroids.h (as it should) but build.pl doesn't re-create it reliably. It's created fine on the first build because Gen_fmgrtab.pl is called if src\backend\utils\fmgrtab.c is missing, which it is on first build. Generating it also generates src\backend\utils\fmgroids.h, which is copied to src\include. However, this copy isn't repeated after clean deletes src\include\utils\fmgroids.h. clean dist is fine, since it deletes fmgrtab.c too, causing the whole thing to be re-generated. The attached patch fixes the issue.

Looks good to me. Applied and backpatched to 9.2 - the logic appears slightly different before that. -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
[HACKERS] XLByte* usage
Hi,

Now that XLogRecPtrs are plain 64-bit integers, what are we supposed to use in code comparing and manipulating them? There are already plenty of examples of both, but I would like new code to go in one direction, not two... I personally find direct comparisons/manipulations far easier to read than the XLByte* equivalents.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] MySQL search query is not executing in Postgres DB
On 27 November 2012 22:41, Tom Lane t...@sss.pgh.pa.us wrote:

Merlin Moncure mmonc...@gmail.com writes: ... I think if you relaxed the function sigs of a few functions on this page (http://www.postgresql.org/docs/9.2/interactive/functions-string.html), most reported problems would go away.

That's an interesting way of approaching it. Do we have any data on exactly which functions people do complain about? One thing that worries me is introducing ambiguous cases where previously there weren't any, though.

Right, but at least we'd be confining the ambiguity to a small number of function names. Tweaking the casting rules could have a lot of unforeseen consequences.

There have been many good points made on this thread. Being sloppy in all cases is a bad thing, we all agree or reluctantly admit; what is needed is the ability for a user to more closely define what they mean by such conversions, so that application SQL can be made to work. It certainly isn't easy to say that COLUMN LIKE '1%' would work in all cases, since the preferred format of that data might be (xxx) xxx-, or $x.xx or €x,xx (i.e. with a comma as the decimal separator). The format comes from the meaning of the data, which we cannot know.

What would be useful is to be able to define default format models for each column. If not defined, there is no implicit cast. If FORMAT is defined then we know to apply it in the absence of a global cast. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Set visibility map bit after HOT prune
On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote:

Another idea could have been to NOT clear the visibility bit when a HOT update happens. Such a tuple can get pruned by HOT prune, so we don't need vacuum per se, and index-only scans are not affected because the update was a HOT update, so the index keys did not change either. So index-only scans would continue to return the same result. Don't know if this would work with hot standby, probably not.

For IOSs that sounds like an interesting and in itself easy-to-implement idea; you basically would only need to add a single !use_hot_update in the if blocks doing the PageClearAllVisible in heap_update. This probably could make IOSs far more likely in some scenarios.

The complicated bit seems to be the heapgetpage() logic around all_visible, because HOT updates are obviously relevant in normal heap scans. It seems to me that would require the vm bit continuing to be set while the page-level bit gets unset. I *think* that's actually OK because whenever we set/clear the visibility map we will still log it properly, so the crash safety guarantees seem to hold true. Obviously we would have to change the escape hatch for exactly that condition in vacuumlazy.c, but that's not a problem...

I don't immediately see a problem with HS, your logic seems to hold equally true there. The replay logic would need to be refined slightly, but it looks possible. That is without having checked the code...

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Set visibility map bit after HOT prune
On 2012-12-15 16:48:08 -0500, Tom Lane wrote:

Simon Riggs si...@2ndquadrant.com writes: Doing that only makes sense when we're running a SELECT. Setting the all-visible bit immediately prior to an UPDATE that clears it again is pointless effort, generating extra work for no reason.

On the other hand, the HOT prune operation itself is worthless when we're running a SELECT. The only reason we do it that way is that we have to prune before the query starts to use the page, else pruning might invalidate pointers-to-tuples that are being held within the query plan tree. Maybe it's time to look at what it'd take for the low-level scan operations to know whether they're scanning the target relation of an UPDATE query, so that we could skip pruning altogether except when a HOT update could conceivably ensue. I think this was discussed back when HOT went in, but nobody wanted to make the patch more invasive than it had to be.

FWIW I think that would be a pretty worthwhile optimization - I have seen workloads where hot pruning led to considerable contention.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical decoding - GetOldestXmin
On 2012-12-15 01:19:26 +0100, Andres Freund wrote:

On 2012-12-14 14:01:30 -0500, Robert Haas wrote:

On Fri, Dec 14, 2012 at 6:46 AM, Andres Freund and...@2ndquadrant.com wrote: Just moving that tidbit inside the lock seems to be the pragmatic choice. GetOldestXmin is called:
* once per checkpoint
* once per index build
* once in analyze
* twice per vacuum
* once for HS feedback messages

None of that occurs frequently enough that 5 instructions will make a difference. I would be happy to go an alternative path, but right now I don't see any nice one. An already_locked parameter to GetOldestXmin seems to be a cure worse than the disease.

I'm not sure that would be so bad, but I guess I question the need to do it this way at all. Most of the time, if you need to advertise your global xmin, you use GetSnapshotData(), not GetOldestXmin(), and I guess I'm not seeing why that wouldn't also work here. Am I dumb?

I wondered upthread whether that would be better: On 2012-12-13 21:03:44 +0100, Andres Freund wrote: Another alternative to this would be to get a snapshot with GetSnapshotData(), copy the xmin to the logical slot, then call ProcArrayEndTransaction(). But that doesn't really seem to be nicer to me.

Not sure why I considered it ugly anymore, but it actually has a noticeable disadvantage. GetOldestXmin is nicer than GetSnapshotData, as the latter sets a fairly new xid as xmin whereas GetOldestXmin returns the actual current xmin horizon. That's preferable because it allows us to start up more quickly. snapbuild.c can only start building a snapshot once it has seen a xl_running_xact with oldestRunningXid = own_xmin. Otherwise we cannot be sure that no relevant catalog tuples have been removed.

Hm. One way that could work with fewer changes is to exploit the fact that a) it seems to be possible to acquire a shared lwlock twice in the same backend and b) both GetOldestXmin and GetSnapshotData acquire only a shared lwlock.
Are we willing to guarantee that recursive acquisition of shared lwlocks continues to work?

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Assert for frontend programs?
On 12/16/2012 01:29 AM, Peter Eisentraut wrote:

On Fri, 2012-12-14 at 17:03 -0500, Tom Lane wrote: Having the layer is a good thing, eg so that USE_ASSERT_CHECKING can control it, or so that somebody can inject a different behavior if they want.

You could also (or at least additionally) map !USE_ASSERT_CHECKING to NDEBUG.

We should probably do that for both frontend and backend code, no? That would get rid of potential problems we already have like inet_net_pton.c that I noted the other day. This would also help with imported code that calls assert() directly.

cheers andrew
Re: [HACKERS] Set visibility map bit after HOT prune
On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com wrote: As explained above, I disagree that it looks like a good idea, and you've shown no evidence it would be or is true.

Let's separate out these two issues. What you are suggesting as a follow-up to Tom's idea, I've no objection to that, and it might be a worthwhile optimisation to try out. But this patch itself does not attempt to deal with that; it's a separate work item and will require invasive changes and tests.

*Whenever* we HOT prune, either in the SELECT path or the UPDATE path, what I'm suggesting is let's try to set the visibility map bit if the conditions are favorable. The only extra work that we are doing (as in the submitted patch) is to check a few additional things for LIVE tuples, such as whether xmin precedes OldestXmin or not. That itself does not seem too costly. What we gain is: 1. the next vacuum may skip that page because it's marked all-visible, and 2. an index-only scan will not visit the heap page. These two improvements will avoid a useless heap page IO and may justify a little more work in HOT prune.

Thanks, Pavan -- Pavan Deolasee http://www.linkedin.com/in/pavandeolasee
Re: [HACKERS] Set visibility map bit after HOT prune
On 16 December 2012 14:41, Andres Freund and...@2ndquadrant.com wrote:

On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote: Another idea could have been to NOT clear the visibility bit when a HOT update happens. Such a tuple can get pruned by HOT prune, so we don't need vacuum per se, and index-only scans are not affected because the update was a HOT update, so the index keys did not change either. So index-only scans would continue to return the same result. Don't know if this would work with hot standby, probably not.

For IOSs that sounds like an interesting and in itself easy-to-implement idea; you basically would only need to add a single !use_hot_update in the if blocks doing the PageClearAllVisible in heap_update. This probably could make IOSs far more likely in some scenarios.

Doing that would completely change the meaning of the visibility map from a heap visibility map into an index-only map. Index-only scans would still work, but nothing else ever would, and it would be hard to confirm the validity of the vm. -1 -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Serious problem: media recovery fails after system or PostgreSQL crash
On 8.12.2012 03:08, Jeff Janes wrote:

On Thu, Dec 6, 2012 at 3:52 PM, Tomas Vondra t...@fuzzy.cz wrote: Hi, On 6.12.2012 23:45, MauMau wrote: From: Tom Lane t...@sss.pgh.pa.us Well, that's unfortunate, but it's not clear that automatic recovery is possible. The only way out of it would be if an undamaged copy of the segment was in pg_xlog/ ... but if I recall the logic correctly, we'd not even be trying to fetch from the archive if we had a local copy.

No, PG will try to fetch the WAL file from pg_xlog when it cannot get it from the archive. XLogFileReadAnyTLI() does that. Also, the PG manual contains the following description: http://www.postgresql.org/docs/9.1/static/continuous-archiving.html#BACKUP-ARCHIVING-WAL "WAL segments that cannot be found in the archive will be sought in pg_xlog/; this allows use of recent un-archived segments. However, segments that are available from the archive will be used in preference to files in pg_xlog/."

So why don't you use an archive command that does not create such incomplete files? I mean something like this:

archive_command = 'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f'

Until the file is renamed, it's considered 'incomplete'.

Wouldn't having the incomplete file be preferable over having none of it at all? It seems to me you need considerable expertise to figure out how to do optimal recovery (i.e. losing the fewest transactions) in this situation, and that that expertise cannot be automated. Do you trust a partial file from a good hard drive, or a complete file from a partially melted pg_xlog?

It clearly is a rather complex issue, no doubt about that. And yes, the reliability of the devices holding pg_xlog is an important detail. Although if the WAL is not written in a reliable way, you're hosed anyway, I guess. The recommended archive command is based on the assumption that the local pg_xlog is intact (e.g. because it's located on a reliable RAID1 array), which seems to be the assumption of the OP too.
In my opinion, encountering an incomplete copy of a WAL segment in the archive is more likely than a corrupted local WAL. And if it really is corrupted, it would be identified during replay.

Tomas
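The copy-to-a-temporary-name-then-rename pattern behind the recommended archive_command is easy to sketch outside of shell as well. Here is a minimal Python equivalent (the function name is mine, for illustration); it relies on rename() being atomic within a single filesystem, so a reader scanning the archive never sees a half-written segment under its final name:

```python
import os
import shutil

def archive_wal(src_path, archive_dir):
    """Copy a WAL segment into the archive under a temporary name,
    then rename it into place, mirroring the
    'cp %p /arch/%f.tmp && mv /arch/%f.tmp /arch/%f' recipe above."""
    fname = os.path.basename(src_path)
    tmp = os.path.join(archive_dir, fname + ".tmp")
    final = os.path.join(archive_dir, fname)
    shutil.copyfile(src_path, tmp)  # incomplete data only ever exists as *.tmp
    os.rename(tmp, final)           # atomic within one filesystem
    return final
```

A restore_command can then simply ignore *.tmp files, which is exactly the property the thread is after: the archive either has a complete segment or, recognizably, none at all.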
Re: [HACKERS] Set visibility map bit after HOT prune
On 2012-12-16 16:25:03 +0000, Simon Riggs wrote:

On 16 December 2012 14:41, Andres Freund and...@2ndquadrant.com wrote: On 2012-12-16 13:23:56 +0530, Pavan Deolasee wrote: Another idea could have been to NOT clear the visibility bit when a HOT update happens. Such a tuple can get pruned by HOT prune, so we don't need vacuum per se, and index-only scans are not affected because the update was a HOT update, so the index keys did not change either. So index-only scans would continue to return the same result. Don't know if this would work with hot standby, probably not.

For IOSs that sounds like an interesting and in itself easy-to-implement idea; you basically would only need to add a single !use_hot_update in the if blocks doing the PageClearAllVisible in heap_update. This probably could make IOSs far more likely in some scenarios.

Doing that would completely change the meaning of the visibility map from a heap visibility map into an index-only map. Index-only scans would still work, but nothing else ever would, and it would be hard to confirm the validity of the vm.

I don't think it would change the meaning that much - the visibility map would still work for vacuum, as normal heap updates would still unset the all-visible flag. Vacuum would skip pages that already were all-visible and then only got HOT-updated, true, but that seems like an acceptable tradeoff, as that dead space can be fully cleaned up by HOT pruning.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical decoding - GetOldestXmin
On 13 December 2012 20:03, Andres Freund and...@2ndquadrant.com wrote:

Does anybody have an opinion on the attached patches? Especially 0001, which contains the procarray changes? It moves a computation of this sort:

result -= vacuum_defer_cleanup_age;
if (!TransactionIdIsNormal(result))
    result = FirstNormalTransactionId;

inside ProcArrayLock. But I can't really imagine that to be relevant...

I don't see why this is hard. Just make the lock acquisition/release conditional on another parameter. That way the only thing you'll be moving inside the lock is an if test on a constant boolean. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] logical decoding - GetOldestXmin
On 2012-12-16 16:44:04 +0000, Simon Riggs wrote:

On 13 December 2012 20:03, Andres Freund and...@2ndquadrant.com wrote: Does anybody have an opinion on the attached patches? Especially 0001, which contains the procarray changes? It moves a computation of this sort:

result -= vacuum_defer_cleanup_age;
if (!TransactionIdIsNormal(result))
    result = FirstNormalTransactionId;

inside ProcArrayLock. But I can't really imagine that to be relevant...

I don't see why this is hard. Just make the lock acquisition/release conditional on another parameter. That way the only thing you'll be moving inside the lock is an if test on a constant boolean.

That's not really cheaper. Two branches + an additional parameter passed/pushed vs. one branch, one subtraction, two assignments is a close call. As I don't think either really matters in the GetOldestXmin case, I would be happy with that as well. If people prefer an additional parameter + adjusting the few call sites vs. a separate function, I will go that way.

Greetings, Andres Freund -- Andres Freund http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] small pg_basebackup display bug
On Sat, Dec 15, 2012 at 2:24 PM, Erik Rijkers e...@xs4all.nl wrote:

On Sat, December 15, 2012 14:10, Magnus Hagander wrote:

On Sat, Dec 15, 2012 at 11:39 AM, Erik Rijkers e...@xs4all.nl wrote: from 9.3devel (this morning): The truncated name in parentheses only shows up during the filling of the new PGDATA; when 100% is reached that part of the displayed name is deleted. That's another bug in itself, I think: shouldn't the end state of that line also display the parenthesized name?

Those are both intentional. It may be argued that the intention should be changed :), but it's not a bug in itself - it's acting as intended.

Fair enough. How about just changing the possible length to enable display of somewhat longer names like I used? It truncates to 30 characters; perhaps something like 100 would be more reasonable?

The general idea with the selection was that it shouldn't truncate on a normal terminal. 30 might be a bit on the aggressive side, but 100 would be guaranteed to almost always do it (since there is also other text on the line than just the filename). I'm not sure if we can figure out the size of the screen in a platform-independent way without going through a lot of extra work though - but maybe we can?

That would make such a truncation less frequent, and after all a truncated display is not particularly useful.

Agreed - it's useful during testing, but not in typical production use. It might actually be more useful if it's truncated at the other end (keeping the last 30 instead of the first 30 chars). -- Magnus Hagander Me: http://www.hagander.net/ Work: http://www.redpill-linpro.com/
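For what it's worth, the head-vs-tail truncation being discussed is trivial to express. A sketch of the idea (the function name and the "..." marker are mine; pg_basebackup's actual display code differs), showing why keeping the tail is often the more informative choice for long path names:

```python
def truncate_display_name(name, width=30, keep_tail=False):
    """Fit a file name into `width` characters for a progress line,
    cutting either the head or the tail and marking the cut with '...'."""
    if len(name) <= width:
        return name
    if keep_tail:
        # keep the last characters -- usually the interesting end of a path
        return "..." + name[-(width - 3):]
    return name[:width - 3] + "..."
```

With head truncation, every file in a deep directory shows the same prefix; with `keep_tail=True` the distinctive file name itself survives.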
Re: [HACKERS] Set visibility map bit after HOT prune
Pavan Deolasee pavan.deola...@gmail.com writes:

On Sun, Dec 16, 2012 at 3:10 PM, Simon Riggs si...@2ndquadrant.com wrote: As explained above, I disagree that it looks like a good idea, and you've shown no evidence it would be or is true.

Let's separate out these two issues. What you are suggesting as a follow-up to Tom's idea, I've no objection to that, and it might be a worthwhile optimisation to try out. But this patch itself does not attempt to deal with that; it's a separate work item and will require invasive changes and tests. *Whenever* we HOT prune, either in the SELECT path or the UPDATE path, what I'm suggesting is let's try to set the visibility map bit if the conditions are favorable.

I don't believe it's clear at all that this is a good idea. If we restrict pruning to occur only when there's a fairly good chance of an ensuing HOT update, then Simon's original objection (that we're probably going to have to clear the bit again right away) has considerable force. And I agree with him that your proposed redefinition of the bit's meaning to avoid that is pretty horrid; it's ugly, complicates the invariant quite a lot, and breaks some existing usages of the bit. If we decide that we don't want to restrict pruning like that, then this patch probably has merit. But we can't evaluate the two issues independently.

Another thing that would need to be considered, if we do want to restrict when pruning happens, is whether it is worth introducing some other path altogether for setting the all-visible bit. Or perhaps we should modify autovacuum's rules so that it will fire on relations for which there might be lots of unmarked all-visible pages.

regards, tom lane
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On Sun, 2012-12-16 at 10:20 +0100, Pavel Stehule wrote: Hello, I understand the motivation, but the proposed syntax is not very intuitive or robust. Can you do it in one function and run the imports only on the first call?

Sometimes, but it's even less intuitive and robust.
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
Peter Eisentraut pete...@gmx.net writes: When you do CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ source code here $$; it internally creates a source file that contains --- def __plpython_procedure_foo_12345(): source code here --- It would be useful to be able to do something like this instead: --- some code here def __plpython_procedure_foo_12345(): some more code here --- This would especially be useful for placing imports into the first part. Sure, but wouldn't it be cleaner to do that via some language-specific syntax inside the function string? I'm imagining some syntax like CREATE FUNCTION ... AS $$ global[ some definitions here ] function code here $$; where the PL would be responsible for pulling off the global chunk and structuring what it outputs accordingly. CREATE FUNCTION already supports multiple AS items. Currently, multiple AS items are rejected for all languages but C. I'd imagine lifting that restriction and leaving it up to the validator to check it. Then any language can accept two AS items if it wants and paste them together in whichever way it needs. (The probin/prosrc naming will then become more obsolete, but it's perhaps not worth changing anything about that.) I think doing it this way is a bad idea, mainly because (1) it won't scale to more than two items (at least not without great rearrangement of pg_proc) and (2) having two otherwise-unlabeled AS items isn't at all understandable or readable. For instance, which of the two is the global part, and why? The fact that C functions do it like that is a legacy syntax we're stuck with, not a good model to copy for other languages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
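Tom's sketch could be prototyped by having the PL split the single AS string on a marker line before compiling. A minimal illustration in Python (the `__global__` marker name and the helper are inventions for illustration; the real syntax would be whatever each PL chooses):

```python
def split_global_chunk(src, marker="__global__"):
    """Split a function source string into (global_chunk, body).

    If the first non-blank line is the marker, everything up to the
    next marker line becomes the module-level chunk; the rest is the
    function body. Without a marker, the whole string is the body.
    """
    lines = src.splitlines()
    i = 0
    while i < len(lines) and not lines[i].strip():
        i += 1                      # skip leading blank lines
    if i < len(lines) and lines[i].strip() == marker:
        try:
            j = lines.index(marker, i + 1)
        except ValueError:
            raise SyntaxError("unterminated global chunk")
        return "\n".join(lines[i + 1:j]), "\n".join(lines[j + 1:])
    return "", src

src = "\n__global__\nimport math\n__global__\nreturn math.floor(x)\n"
g, b = split_global_chunk(src)
```

An old-style source without the marker falls through unchanged, which is what makes this kind of scheme backwards compatible.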
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 07:37 AM, Peter Eisentraut wrote: I'm going to use PL/Python as an example, but I would also like to know if this could be applicable to other languages. When you do CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ source code here $$; it internally creates a source file that contains --- def __plpython_procedure_foo_12345(): source code here --- It would be useful to be able to do something like this instead: --- some code here def __plpython_procedure_foo_12345(): some more code here --- This would especially be useful for placing imports into the first part. While you can have them in the function definition, that means they are executed every time the function is called, which makes it much slower. Also, future imports are not possible this way. CREATE FUNCTION already supports multiple AS items. Currently, multiple AS items are rejected for all languages but C. I'd imagine lifting that restriction and leaving it up to the validator to check it. Then any language can accept two AS items if it wants and paste them together in whichever way it needs. (The probin/prosrc naming will then become more obsolete, but it's perhaps not worth changing anything about that.) So in practice this might look like this: CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ import x import y $$, $$ real code here $$; Comments? As an idea it seems quite good, but maybe the run-once part could use its own keyword in the future, something like PREPARE or REQUIRE? Or maybe WITH, to reuse a keyword: CREATE FUNCTION foo(...) ...
LANGUAGE plpythonu WITH -- this part is evaluated only once, in PLy_procedure_create $$ import x import y $$ AS -- this is compiled in the same namespace as above $$ function body here $$; While at it, why not also fix the functions to be real functions _with_ _real_ _arguments_, not arguments-passed-in-as-globals, and at least we could call this function with its real name inside its own module (stored global namespace), so we could easily do recursion: CREATE FUNCTION factorial(n bigint) returns bigint LANGUAGE plpythonu AS $$ if n==0: return 1 return factorial(n-1) * n $$; -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
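For readers following the arguments-passed-in-as-globals complaint: a rough model of the current behaviour (simplified for illustration, not the actual PL/Python implementation) is that the body is compiled once into a generated zero-argument def, and each call injects the argument values into that function's global namespace. That is also why the function cannot refer to itself by its SQL name:

```python
# The generated wrapper takes no parameters; it is compiled once.
body = (
    "def __plpython_procedure_add_12345():\n"
    "    return a + b\n"
)
namespace = {}
exec(body, namespace)
func = namespace["__plpython_procedure_add_12345"]

def call(a, b):
    # Per call, argument values are injected into the wrapper's
    # global namespace rather than passed as real parameters.
    namespace["a"] = a
    namespace["b"] = b
    return func()

result = call(2, 3)
```

With real parameters and the function's real name in its module namespace, `factorial` could simply call `factorial`, as Hannu's example wants.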
Re: [HACKERS] small pg_basebackup display bug
Magnus Hagander mag...@hagander.net writes: On Sat, Dec 15, 2012 at 2:24 PM, Erik Rijkers e...@xs4all.nl wrote: That would make such a truncation less frequent, and after all a truncated display is not particular useful. Agreed - it's useful during testing, but not in a typical production use. It might actually be more useful if it's truncated in in the other end (keeping the last 30 instead of the first 30 chars) +1 for truncating from the left. I think pg_upgrade already does that in its progress messages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
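Truncating from the left as suggested is a one-liner; a sketch of the behaviour (the helper name and widths are illustrative, not the actual pg_basebackup code):

```python
def truncate_left(s, width=30, ellipsis="..."):
    """Keep the last `width` characters of s, marking any cut
    with a leading ellipsis."""
    if len(s) <= width:
        return s
    return ellipsis + s[-(width - len(ellipsis)):]

short = truncate_left("base/16384", 20)          # fits, unchanged
long_name = truncate_left("pg_xlog/000000010000000000000042", 20)
```

Keeping the tail preserves the part of a long path that actually changes as the backup progresses, which is the argument for left truncation.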
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 07:03 PM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: When you do CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ source code here $$; it internally creates a source file that contains --- def __plpython_procedure_foo_12345(): source code here --- It would be useful to be able to do something like this instead: --- some code here def __plpython_procedure_foo_12345(): some more code here --- This would especially be useful for placing imports into the first part. Sure, but wouldn't it be cleaner to do that via some language-specific syntax inside the function string? I'm imagining some syntax like CREATE FUNCTION ... AS $$ global[ some definitions here ] function code here $$; where the PL would be responsible for pulling off the global chunk and structuring what it outputs accordingly. I was going to suggest some special function name to be pulled out of code passed to CREATE FUNCTION in line with CREATE FUNCTION foo(a,b,c) AS $$ import x from __future__ import nex_cool_feature def helper_function(x): ... def __pg_main__(a,b,c): defined function body here $$; so that the whole text gets compiled into module at first call and the __pg_main__ will be the function that gets called as foo(a,b,c) from postgresql but this would not be backwards compatible, at least not in any obvious way. --- Hanniu CREATE FUNCTION already supports multiple AS items. Currently, multiple AS items are rejected for all languages but C. I'd imagine lifting that restriction and leaving it up to the validator to check it. Then any language can accept two AS items if it wants and paste them together in whichever way it needs. (The probin/prosrc naming will then become more obsolete, but it's perhaps not worth changing anything about that.) 
I think doing it this way is a bad idea, mainly because (1) it won't scale to more than two items (at least not without great rearrangement of pg_proc) and (2) having two otherwise-unlabeled AS items isn't at all understandable or readable. For instance, which of the two is the global part, and why? The fact that C functions do it like that is a legacy syntax we're stuck with, not a good model to copy for other languages. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 07:20 PM, Hannu Krosing wrote: On 12/16/2012 07:03 PM, Tom Lane wrote: Peter Eisentraut pete...@gmx.net writes: When you do CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ source code here $$; it internally creates a source file that contains --- def __plpython_procedure_foo_12345(): source code here --- It would be useful to be able to do something like this instead: --- some code here def __plpython_procedure_foo_12345(): some more code here --- This would especially be useful for placing imports into the first part. Sure, but wouldn't it be cleaner to do that via some language-specific syntax inside the function string? I'm imagining some syntax like CREATE FUNCTION ... AS $$ global[ some definitions here ] function code here $$; where the PL would be responsible for pulling off the global chunk and structuring what it outputs accordingly. I was going to suggest some special function name to be pulled out of code passed to CREATE FUNCTION in line with CREATE FUNCTION foo(a,b,c) AS $$ import x from __future__ import nex_cool_feature def helper_function(x): ... def __pg_main__(a,b,c): defined function body here $$; so that the whole text gets compiled into module at first call and the __pg_main__ will be the function that gets called as foo(a,b,c) from postgresql On further thought the function name should just be what it is defined in postgresql, like this CREATE FUNCTION foo(a,b,c) AS $$ import x from __future__ import nex_cool_feature def helper_function(x): ... 
def foo(a,b,c): defined function body here def bar(i,j): function body for bar(i,j) $$ language plpythonu; If the above definition saved the whole compiled unit as module pg_functions.foo, then we could define the postgresql/plpython function bar() by importing the same module: CREATE FUNCTION bar(a,b,c) AS $$ from pg_functions.foo import bar $$ language plpythonu; It is not quite as simple as that, since we still need to find the source for foo in case bar() gets called first and module foo is not yet saved, but this could be one approach to having python modules without introducing extra syntax at the PostgreSQL level. but this would not be backwards compatible, at least not in any obvious way. This is still unfortunately true :( Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
Hannu Krosing ha...@2ndquadrant.com writes: On further thought the function name should just be what it is defined in postgresql, like this CREATE FUNCTION foo(a,b,c) AS $$ import x from __future__ import nex_cool_feature def helper_function(x): ... def foo(a,b,c): defined function body here def bar(i,j): function body for bar(i,j) $$ language plpythonu; but this would not be backwards compatible, at least not in any obvious way. This is still unfortunately true :( Could we say that *if* the function text contains a line beginning def function_name then we interpret it as above, otherwise oldstyle? I'm not sure how big a risk of false positives there'd be. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
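Tom's heuristic could be tried with a single regular-expression test over the function text; a sketch (assuming the new-style marker is an unindented `def funcname(` line):

```python
import re

def looks_new_style(src, funcname):
    """True if src contains an unindented 'def funcname(' line,
    i.e. a top-level definition of the function's own name."""
    pattern = r"(?m)^def\s+%s\s*\(" % re.escape(funcname)
    return re.search(pattern, src) is not None

old_style = "if n == 0:\n    return 1\nreturn n * (n - 1)"
new_style = "import math\ndef factorial(n):\n    return math.factorial(n)"
# an old-style body may legitimately contain the same top-level def
# (the recursion workaround discussed downthread), so this matches too:
workaround = (
    "def factorial(n):\n"
    "    if n == 0:\n"
    "        return 1\n"
    "    return factorial(n - 1) * n\n"
    "return factorial(n)"
)
```

The `workaround` case is exactly the false positive to worry about: the pattern cannot distinguish a new-style definition from an old-style body that defines a same-named inner function and then calls it.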
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
Hannu Krosing ha...@2ndquadrant.com writes: On further thought the function name should just be what it is defined in postgresql, like this CREATE FUNCTION foo(a,b,c) AS $$ def foo(a,b,c): BTW, how well will that play with overloaded function names? I don't particularly care for saying that PL/Python fails if you overload a function name across multiple schemas or argument lists ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 07:44 PM, Tom Lane wrote: Hannu Krosing ha...@2ndquadrant.com writes: On further thought the function name should just be what it is defined in postgresql, like this CREATE FUNCTION foo(a,b,c) AS $$ def foo(a,b,c): BTW, how well will that play with overloaded function names? I don't particularly care for saying that PL/Python fails if you overload a function name across multiple schemas or argument lists ... Currently each pl/python function gets compiled in its own python module namespace, so this should not be a problem. -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 07:37 PM, Tom Lane wrote: Hannu Krosing ha...@2ndquadrant.com writes: On further thought the function name should just be what it is defined in postgresql, like this CREATE FUNCTION foo(a,b,c) AS $$ import x from __future__ import nex_cool_feature def helper_function(x): ... def foo(a,b,c): defined function body here def bar(i,j): function body for bar(i,j) $$ language plpythonu; but this would not be backwards compatible, at least not in any obvious way. This is still unfortunately true :( Could we say that *if* the function text contains a line beginning def function_name then we interpret it as above, otherwise oldstyle? I'm not sure how big a risk of false positives there'd be. You might be inclined to define a recursive function like this under the current pl/python: CREATE FUNCTION factorial(n bigint) returns bigint LANGUAGE plpythonu AS $$ def factorial(n): if n==0: return 1 return factorial(n-1) * n return factorial(n) $$; but at least for functions returning a non-null value, an old-style definition usually ends with a line of the form return something. Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 01:37 AM, Peter Eisentraut wrote: So in practice this might look like this: CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ import x import y $$, $$ real code here $$; Bleah. It seems obscure to say the least. Why not have something along the lines of plperl's on_init setting to load libraries? Among other things that would give you the advantage of being able to preload them, and also of some consistency among PLs. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 08:44 PM, Andrew Dunstan wrote: On 12/16/2012 01:37 AM, Peter Eisentraut wrote: So in practice this might look like this: CREATE FUNCTION foo(...) ... LANGUAGE plpythonu AS $$ import x import y $$, $$ real code here $$; Bleah. It seems obscure to say the least. Why not have something along the lines of plperl's on_init setting to load libraries? Among other things that would give you the advantage of being able to preload them, and also of some consistency among PLs. While plpython.on_init is a much needed feature, it is orthogonal to what is discussed here. AIUI, Peter's proposal aimed at adding per-function preparation / initialisation, not something to be run for initialising the whole interpreter. Also - to make plpython.on_init really useful - there should be some way to have python _modules_ inside postgresql. If we redefined plpython functions to define their own _visible_ modules (currently each has its own module but there is no way to reference it from others) and stored the function in this module, we could not only solve the problem of plpython modules but also allow calling other plpython functions directly from python. For example, if doing CREATE FUNCTION foo(i int) RETURNS int LANGUAGE plpythonu $$ def foo(i): return i+1 $$; would also make this function available as plpy.modules.foo_int.foo (meaning its global namespace would be saved as plpy.modules.foo_int), then other plpy functions could call it directly by doing from plpy.modules.foo_int import foo I'll try to come up with a more detailed proposal along these lines. -- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On Sun, 2012-12-16 at 19:13 +0100, Hannu Krosing wrote: As an idea seems quite good, but maybe the run once part could use its own keyword in the future, something like PREPARE or REQUIRE? Well, either we do it in a language independent way, in which case this would be too prescriptive, or we do it in a Python-specific way (less likely), but prepare or require are not Python concepts. WHile at it, why not also fix the functions to be real function _with_ _real_ _arguments_ , not arguments-passed-in-as-globals and at least we could call this function with its real name inside its own module (stored global namespace) so we could easily do recursion CREATE FUNCTION factorial(n bigint) returns bigint LANGUAGE plpythonu AS $$ if n==0: return 1 return factorial(n-1) * n $$; These are also good things to fix, but are they related? Could they not be fixed independently? -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On Sun, 2012-12-16 at 13:03 -0500, Tom Lane wrote: Sure, but wouldn't it be cleaner to do that via some language-specific syntax inside the function string? I'm imagining some syntax like CREATE FUNCTION ... AS $$ global[ some definitions here ] function code here $$; where the PL would be responsible for pulling off the global chunk and structuring what it outputs accordingly. But then the language text wouldn't be Python anymore. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] WIP: index support for regexp search
On Fri, Dec 14, 2012 at 1:34 AM, Alexander Korotkov aekorot...@gmail.com wrote: On Mon, Dec 3, 2012 at 4:31 PM, Alexander Korotkov aekorot...@gmail.com wrote: Actually, I generally dislike the path matrix for the same reasons. But: 1) Output graphs could contain trigrams which are completely useless for search. For example, for the regex /(abcdefgh)*ijk/ we need only the ijk trigram, while the graph would contain many more. The path matrix is a method to get rid of all of them. 2) If we use color trigrams then we need some criteria for which color trigrams to expand into simple trigrams. Simultaneously, we shouldn't allow a path from the initial state to the final one via unexpanded trigrams. It seems much harder to do with a graph than with a matrix. Now, I have an idea about doing some not comprehensive but simple and fast simplification of the graph. I'm doing experiments now. In case of success we could get rid of the path matrix. The attached patch has the following changes: 1) Postpone expansion of colors. The graph is built on color trigrams. 2) Selective expansion of color trigrams into simple trigrams. All non-expanded color trigrams are removed. Such removal leads to a union of all state pairs connected by the corresponding arcs. Surely, this must not lead to a union of the initial and final states: that would render all the previous work senseless. -- With best regards, Alexander Korotkov. trgm-regexp-0.8.patch.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
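For readers unfamiliar with what is being expanded here: pg_trgm derives trigrams from a word after padding it with blanks. A simplified sketch of that extraction (this mimics pg_trgm's padding convention of two leading and one trailing space, but ignores its word-splitting and normalization details):

```python
def trigrams(word):
    """Trigrams of a single word, pg_trgm style: pad with two
    leading and one trailing blank, then take every 3-char window."""
    padded = "  " + word + " "
    return sorted({padded[i:i + 3] for i in range(len(padded) - 2)})

# the only trigrams the /(abcdefgh)*ijk/ example really needs for search
ijk = trigrams("ijk")
```

This is why the /(abcdefgh)*ijk/ example matters: the graph would also carry every trigram of the optional "abcdefgh" part, none of which are needed to answer the query.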
Re: [HACKERS] Doc patch, further describe and-mask nature of the permission system
On 12/16/2012 12:56:22 AM, Peter Eisentraut wrote: On Mon, 2012-12-10 at 20:48 -0600, Karl O. Pinc wrote: On 11/14/2012 02:35:54 PM, Karl O. Pinc wrote: On 11/13/2012 08:50:55 PM, Peter Eisentraut wrote: On Sat, 2012-09-29 at 01:16 -0500, Karl O. Pinc wrote: This patch makes some sweeping statements. Unfortunately, they are wrong. I will see if anything can be salvaged. Here's another try. (I bundled changes to both paragraphs into a single patch.) grants-of-roles-are-additive_v3.patch I don't get the point of this change, especially why you are trying to liken the roles system to the object hierarchy, when they are clearly different and unrelated. It seems to me that the permission system follows the object system hierarchy in those cases where different levels of the object hierarchy may have identical permissions. The exceptions are permissions like USAGE, which seems to be a convenient common lexical token but means (and needs to mean) something entirely different at each level of the object hierarchy. ALL also confuses the issue, since it means all permissions which work at this level of the object hierarchy and not all permissions; so, say, granting ALL to a database says nothing about INSERT permission. I'm (clearly) not steeped in the pg permission system, but it does seem that where permissions are shared between levels of the object hierarchy there is a consistency in the resulting interaction when granting/revoking at different levels of the object hierarchy. Perhaps this is ipso facto (counterexamples being automatically designated as not shared by nature of the premise :) or perhaps more an artifact of my attention than the result of any sort of design. Anyway, my intent is to point out this consistency. Since the way in which permissions set at different levels of the object hierarchy interact is sometimes useful, I go on to describe how to replicate the behavior and apply it outside the object hierarchy.
In any case I thought the elaboration would be helpful. I had a few minutes and cooked it up. If you don't think it should go in then reject it. As noted already in the docs, permissions are different at different levels of the object hierarchy, but similar enough to describe in one place. I was hoping to provide a possible framework for thinking about permission interactions between object hierarchy levels where such occur. Without any sort of framework everything becomes a special case and it's hard to keep track of. Thanks for spending time on it. If there's anything about it that appeals then I will continue to work under your direction. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multiple CREATE FUNCTION AS items for PLs
On 12/16/2012 10:23 PM, Peter Eisentraut wrote: On Sun, 2012-12-16 at 19:13 +0100, Hannu Krosing wrote: As an idea seems quite good, but maybe the run once part could use its own keyword in the future, something like PREPARE or REQUIRE? Well, either we do it in a language independent way, in which case this would be too prescriptive, or we do it in a Python-specific way (less likely), but prepare or require are not Python concepts. The proposed keyword names are meant to be language-independent and to signify the part that is meant for initialisation or requirements. The multiple AS $$...$$ sections have to mean something to be useful at all. My final choice of WITH seems to both fit the run-once/required/init meaning and is already a keyword. But I'd end up preferring much more the approach of putting a whole module in the function's source code and returning, as the plpython function, the item matching the defined function, which can be a function or any other callable. The main problem is staying backwards compatible with the existing implementation. While at it, why not also fix the functions to be real functions _with_ _real_ _arguments_, not arguments-passed-in-as-globals, and at least we could call this function with its real name inside its own module (stored global namespace) so we could easily do recursion CREATE FUNCTION factorial(n bigint) returns bigint LANGUAGE plpythonu AS $$ if n==0: return 1 return factorial(n-1) * n $$; These are also good things to fix, but are they related? Could they not be fixed independently? They could, but fixing these together will probably result in a cleaner design :) Even with your original multiple-code-strings design you end up manipulating function-global namespaces (which seem really close to modules) to put the first $$...$$ there as run-once, pre-def code. Using the function's real name (instead of _plpython_funcname_oid) in its module namespace is a one-line fix, but to be really useful the mess with arguments-as-globals needs to be rectified. If we move to the function-code-as-module approach we will no longer need to munge the code (add def before it and then \t at the beginning of each line), which makes everything much cleaner. The main thing to solve is the different model for passing function arguments at call time. --- Hannu -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
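The code munging Hannu refers to can be modelled outside the backend. Roughly (a simplified model for illustration, not the actual C implementation): the PL prepends a parameterless def line and tab-indents each body line, then executes the result to obtain the callable, with arguments arriving through the namespace:

```python
def munge(name, body):
    """Model of the munging: prepend a def line (no parameters,
    since arguments arrive as globals) and tab-indent each body line."""
    return ("def %s():\n" % name) + "".join(
        "\t" + line + "\n" for line in body.splitlines())

src = munge("__plpython_procedure_add_1", "x = a + b\nreturn x")
ns = {"a": 1, "b": 2}     # argument values injected as globals
exec(src, ns)
result = ns["__plpython_procedure_add_1"]()
```

With the function-code-as-module approach, the source is already a valid module and this wrapping (and the indentation rewriting that goes with it) disappears entirely.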
Re: [HACKERS] Add big fat caution to pg_restore docs regards partial db restores
On 12/16/2012 12:51:06 AM, Peter Eisentraut wrote: I'm going to set this patch as returned with feedback for now. Ok. At this point I don't have a vision for improving it so it might sit there untouched. Maybe someone else will step forward and make it better. Regards, Karl k...@meme.com Free Software: You don't pay back, you pay forward. -- Robert A. Heinlein -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] PATCH: optimized DROP of multiple tables within a transaction
Hi, I've updated the patch to include the optimization described in the previous post, i.e. if the number of relations is below a certain threshold, use a simple for loop; for large numbers of relations, use bsearch calls. This is done by a new constant BSEARCH_LIMIT, which is set to 10 in the patch. Then I've modified the 'drop-test' script to take yet another argument - the number of indexes for each table. So for example

    ./drop-test.py 1 100 3 'dbname=test'

means create 1 tables, 3 indexes for each of them, and then drop them in batches of 100 tables. Then I've run the test with 0, 1, 2, ... 11 indexes for each table for (a) unpatched HEAD, (b) patch v3.1 (without the optimization), (c) patch v3.3 (with BSEARCH_LIMIT=10), and I got these results:

1) dropping one-by-one
----------------------

This is the really interesting part - the issue with v3.1 is that for a single table it's ~2x slower than unpatched PostgreSQL.

                 0   1   2   3   4   5   6   7    8    9   10   11
    ---------------------------------------------------------------
    unpatched   16  28  40  52  63  75  87  99  110  121  135  147
    v3.1        33  43  46  56  58  60  63  72   75   76   79   80
    v3.3        16  20  23  25  29  33  36  40   44   47   79   82

The values are durations in seconds, rounded to integer values. I've run the test repeatedly and there's very small variance in the numbers. v3.3 improves that, and it's actually even faster than unpatched PostgreSQL. How can this happen? I believe it's because the code is rewritten from

    for each relation (r) in the drop list
        DropRelFileNodeAllBuffers(r)
            for each shared buffer
                check and invalidate

to

    copy the relations from the drop list to array (a)
    DropRelFileNodeAllBuffers(a)
        for each shared buffer
            for each relation (r) in the array
                check and invalidate

At least that's the only explanation I was able to come up with. Yet another interesting observation is that even v3.1 is about as fast as the unpatched code once there are 3 or more indexes (or TOAST tables).

So my opinion is that the optimized patch works as expected, and that even without the optimization the performance would be acceptable for most real-world cases.

2) dropping in a transaction
----------------------------

This is mostly to verify that the code did not break anything, because the optimization should not kick in in this case at all. And that seems to be the case:

                 0   1   2   3   4   5   6   7    8    9   10   11
    ---------------------------------------------------------------
    unpatched   13  24  35  46  58  69  81  92  103  115  127  139
    v3.1         3   5   7   8  10  12  14  15   16   18   20   21
    v3.3         3   4   6   7   8  10  11  13   14   15   18   20

The differences between v3.1 and v3.3 are mostly due to rounding etc.

Attached is the v3.3 patch and the testing script I've been using for the tests above. Feel free to run the tests on your hardware, with your shared buffers size etc. I've run that on a 4-core i5-2500 CPU with 2GB shared buffers.

Tomas

    #!/bin/env python
    import datetime
    import psycopg2
    import sys

    if __name__ == '__main__':

        if len(sys.argv) < 4:
            print "ERROR: not enough parameters"
            print "HINT: drop-test.py num-of-tables drop-num num-of-indexes 'connection string'"
            sys.exit(1)

        ntables = int(sys.argv[1])
        nlimit = int(sys.argv[2])
        nindex = int(sys.argv[3])
        connstr = str(sys.argv[4])
        debug = False

        conn = psycopg2.connect(connstr)
        cur = conn.cursor()

        # print 'creating %s tables' % (ntables,)
        start = datetime.datetime.now()
        for i in range(ntables):
            cur.execute('CREATE TABLE tab_%s (id INT)' % (i,))
            for j in range(nindex):
                cur.execute('CREATE INDEX idx_%s_%s ON tab_%s(id)' % (i,j,i))
            conn.commit();
            if (i % 1000 == 0) and debug:
                print '  tables created: %s' % (i,)
        conn.commit()
        end = datetime.datetime.now()
        # print '  all tables created in %s seconds' % ((end-start).total_seconds(),)

        # set to autocommit mode
        conn.autocommit = True

        start = datetime.datetime.now()
        # print 'dropping %s tables one by one ...' % (ntables,)
        for i in range(ntables):
            cur.execute('DROP TABLE tab_%s' % (i,))
            if (i % 1000 == 0) and debug:
                print '  tables dropped: %s' % (i,)
        end = datetime.datetime.now()
        print 'dropped one-by-one in %s seconds' % ((end-start).total_seconds(),)

        # cancel the autocommit mode
        conn.autocommit = False

        # recreate tables
        # print 'creating %s tables' % (ntables,)
        start = datetime.datetime.now()
        for i in range(ntables):
            cur.execute('CREATE TABLE tab_%s (id INT)' % (i,))
            for j in range(nindex):
                cur.execute('CREATE INDEX idx_%s_%s ON tab_%s(id)' % (i,j,i))
            conn.commit();
            if (i % 1000 == 0) and debug:
                print '  tables created: %s' % (i,)
        conn.commit()
        end = datetime.datetime.now()
        # print '  all tables created in %s seconds' % ((end-start).total_seconds(),)

        # drop the tables in batches
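The BSEARCH_LIMIT idea itself - a linear scan for a handful of relations, binary search above the threshold - can be sketched in a few lines (Python standing in for the C; the helper name and data are illustrative, not the patch's actual code):

```python
import bisect

BSEARCH_LIMIT = 10   # same threshold the patch uses

def node_in_drop_list(node, drop_list):
    """drop_list is kept sorted; below the threshold a plain loop
    wins, above it binary search avoids an O(n) scan per buffer."""
    if len(drop_list) <= BSEARCH_LIMIT:
        return node in drop_list                 # simple for loop
    i = bisect.bisect_left(drop_list, node)      # bsearch path
    return i < len(drop_list) and drop_list[i] == node

drop_list = sorted(range(0, 2000, 2))            # stand-in relfilenodes
hits = [n for n in (2, 3, 1998) if node_in_drop_list(n, drop_list)]
```

The threshold exists because for tiny lists the comparator-call overhead of bsearch outweighs the asymptotic win, which is exactly the single-table regression the v3.1 numbers show.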
Re: [HACKERS] too much pgbench init output
Hi, attached is a new version of the patch that (a) converts the 'log_step_seconds' variable to a constant (and does not allow changing it using a command-line option etc.), (b) keeps the current logging as a default, (c) adds a -q switch that enables the new logging with a 5-second interval. I'm still not convinced there should be yet another knob for tuning the log interval - opinions? Tomas On 11.12.2012 10:23, Jeevan Chalke wrote: On Sun, Dec 9, 2012 at 8:11 AM, Tomas Vondra t...@fuzzy.cz wrote: On 20.11.2012 08:22, Jeevan Chalke wrote: Hi, On Tue, Nov 20, 2012 at 12:08 AM, Tomas Vondra t...@fuzzy.cz wrote: On 19.11.2012 11:59, Jeevan Chalke wrote: Hi, I have gone through the discussion for this patch and here is my review: The main aim of this patch is to reduce the number of log lines. It was also suggested to use an option to provide the interval, but a few of us did not quite agree on it. So the final discussion ended at keeping a 5 sec interval between each log line. However, I see there are two types of users here: 1. Those who like these log lines, so that they can troubleshoot some slowness and all 2. Those who do not like these log lines. Who likes these lines / needs them for something useful? No idea. I fall in the second category. But from the discussion, I believe some people may need detailed (or a lot more) output. I've read the thread again and my impression is that no one really needs or likes those lines, but (1) it's rather pointless to print a message every 100k rows, as it usually fills the console with garbage (2) it's handy to have regular updates of the progress I don't think there're people (in the thread) that require to keep the current amount of log messages. But there might be users who actually use the current logs to do something (although I can't imagine what). If we want to do this in a backwards compatible way, we should probably use a new option (e.g.
-q) to enable the new (less verbose) logging. Do we want to allow both types of logging, or shall we keep only the new one? If both, which one should be the default one? Both the options are fine with me, but the default should be the current behaviour. So keeping these in mind, I rather go for an option which will control this. People falling in category one can set this option to very low where as users falling under second category can keep it high. So what option(s) would you expect? Something that tunes the interval length or something else? Interval length. Well, I can surely imagine something like --interval N. +1 A switch that'd choose between the old and new behavior might be a good idea, but I'd strongly vote against automagic heuristics. It makes the behavior very difficult to predict and I really don't want to force the users to wonder whether the long delay is due to general slowness of the machine or some clever logic that causes long delays between log messages. That's why I choose a very simple approach with constant time interval. It does what I was aiming for (less logs) and it's easy to predict. Sure, we could choose different interval (or make it an option). I am preferring an option for choosing an interval, say from 1 second to 10 seconds. U, why not to allow arbitrary integer? Why saying 1 to 10 seconds? Hmm.. actually, I have no issues with any number there. Just put 1..10 as we hard-coded it 5. No particular reason as such. BTW, what if, we put one log message every 10% (or 5%) with time taken (time taken for last 10% (or 5%) and cumulative) over 5 seconds ? This will have only 10 (or 20) lines per pgbench initialisation. And since we are showing time taken for each block, if any slowness happens, one can easily find a block by looking at the timings and troubleshoot it. Though 10% or 5% is again a debatable number, but keeping it constant will eliminate the requirement of an option. 
That's what I originally proposed in September (see the messages from 17/9), and Alvaro was not relly excited about this. Attached is a patch with fixed whitespace / indentation errors etc. Otherwise it's the same as the previous version. OK. Looks good now. Any other views / suggestions are welcome. Thanks
Re: [HACKERS] Adjusting elog behavior in bootstrap/standalone mode
On Fri, Dec 14, 2012 at 04:20:30PM -0500, Tom Lane wrote:

    /* Determine whether message is enabled for server log output */
    if (IsPostmasterEnvironment)
        output_to_server = is_log_level_output(elevel, log_min_messages);
    else
        /* In bootstrap/standalone case, do not sort LOG out-of-order */
        output_to_server = (elevel >= log_min_messages);

In view of the confusion this caused just now, I wondered if we shouldn't get rid of the special case and always follow the is_log_level_output rule. I tried modifying the code that way, and soon found that it made initdb rather noisy:

I can see why it was done that way, but agreed.

creating configuration files ... ok
creating template1 database in /home/postgres/data/base/1 ... LOG: bogus data in postmaster.pid
LOG: database system was shut down at 2012-12-14 15:55:35 EST
LOG: shutting down
LOG: database system is shut down

One possibility is to tweak the elog call sites for these specific messages so that they are, say, NOTICE not LOG level when not IsPostmasterEnvironment. That seems like a bit of a hack, but I don't see another answer that doesn't involve behind-the-scenes decisions in elog.c ... which is exactly what I want to get rid of.

Your proposed hack seems decent. The second and third lines, in particular, are rather useless here despite their utility under normal logging.

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Unresolved error 0xC0000409 on Windows Server
On Sun, Dec 09, 2012 at 02:09:21PM -0500, Matthew Gerber wrote:

In this situation, st_transform throws an internal_error, which my function catches and returns NULL for. The error / crash is not caused by a NULL argument; rather, it is caused by the final value in the attached script's INSERT statement, which contains a lat/lon pair that is beyond PostGIS's range. I'm not questioning whether this value is actually outside the legal range, but I do not think such an input should cause the server to crash completely.

The server should not crash, no. However, the facts that PostGIS reported an internal error and that the crash is responsive to your choice of geographic inputs increase the chance that the problem lies in PostGIS code, not PostgreSQL core code.

Here are the steps to reproduce the crash:

1) Create a new instance of a 9.2 server (Windows 64-bit), and a new database (call it test) with the PostGIS extension.

2) Run the script: psql -U postgres -d test -f C:\server_crash.sql

You should see the following:

psql:C:/server_crash.sql:31: server closed the connection unexpectedly
This probably means the server terminated abnormally before or while processing the request.
psql:C:/server_crash.sql:31: connection to server was lost

3) Check your log for the error.

I tried this test procedure, but I could not reproduce the crash. PostgreSQL: one-click installer postgresql-9.2.2-1-windows-x64.exe; PostGIS: v2.0.1 from Stack Builder; OS: Windows Server 2008 R2 Datacenter. I needed the workaround in the last comment of this[1] bug report to get a working installation. At that point, your test procedure completes without error. What specific versions are involved in your installation?

Could you try again to get a minidump and stack trace? Connect to your test database with psql; run SELECT pg_backend_pid(); open Visual Studio; select Debug -> Attach to Process...; select the postgres.exe process with the matching ID. Run your test case; when the exception window pops up, select Break. If the stack trace does not contain full symbol information, right-click on some of the incomplete lines and select Load Symbols From -> Symbol Path; navigate to the location of postgres.pdb. You can select Save Dump As... from the Debug menu to create the minidump.

Thanks,
nm

[1] http://trac.osgeo.org/postgis/ticket/1824
Re: [HACKERS] Strange errors from 9.2.1 and 9.2.2 (I hope I'm missing something obvious)
On Dec 11, 2012 9:28 PM, David Gould da...@sonic.net wrote:

Thank you. I got the example via cut-and-paste from email and pasted it into psql on different hosts. od tells me it ends each line with: \n followed by 0xC2 0xA0 and then normal spaces. The C2 A0 thing is apparently NO-BREAK SPACE. Invisible, silent, odorless, but still deadly. Which will teach me not to accept text files from the sort of people who write code in Word, I guess.

It's not just Word... I was bitten by this last week by a WYSIWYG HTML widget I was using to write some documentation. When I copied the examples I had created out of said environment during a final technical accuracy pass and they failed to run in psql, I panicked for a few minutes. I eventually determined that, rather than just wrapping my code in <pre> tags, the widget had created &nbsp; entities that were faithfully converted into Unicode non-breaking spaces in the psql input.