Re: [HACKERS] [COMMITTERS] pgsql: Add restart_after_crash GUC.
On Tue, Jul 20, 2010 at 9:47 AM, Robert Haas rh...@postgresql.org wrote:
> Log Message:
> -----------
> Add restart_after_crash GUC.

In postgresql.conf.sample, on/off is used as a boolean value. But true/false is used for exit_on_error and restart_after_crash. Sorry, I had overlooked that inconsistency when reviewing the original patch. I attached the bug-fix patch.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

[Attachment: true_false_to_on_off_0727.patch]

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Synchronous replication
Fujii Masao wrote:
> On Mon, Jul 26, 2010 at 8:25 PM, Robert Haas robertmh...@gmail.com wrote:
>> On Mon, Jul 26, 2010 at 6:48 AM, Marko Tiikkaja marko.tiikk...@cs.helsinki.fi wrote:
>>> On 7/26/10 1:44 PM +0300, Fujii Masao wrote:
>>>> On Mon, Jul 26, 2010 at 6:36 PM, Yeb Havinga yebhavi...@gmail.com wrote:
>>>>> I wasn't entirely clear. My suggestion was to have only
>>>>> acknowledge_commit = {no|recv|fsync|replay} instead of
>>>>> replication_mode = {async|recv|fsync|replay}
>>>> Okay, I'll change the patch accordingly.
>>> For what it's worth, I think replication_mode is a lot clearer.
>>> Acknowledge_commit sounds like it would do something similar to
>>> asynchronous_commit.
>> I agree.
> As the result of the vote, I'll leave the parameter replication_mode as it is.

I'd like to bring forward another suggestion (please tell me when it is becoming spam). My feeling about replication_mode as it is, is that it says in the same parameter something about async versus sync, as well as, if sync, which method of feedback to the master. On the other hand, having two parameters would need documentation that the feedback method may only be set if replication_mode is sync, as well as checks, so it is actually good to have it all in one parameter. But somehow the shoe pinches, because async feels different from the other three values. There is a way to move async out of the enumeration:

  synchronous_replication_mode = off | recv | fsync | replay

This also looks a bit like the

  synchronous_replication = N    # similar in name to synchronous_commit

that Simon Riggs proposed in http://archives.postgresql.org/pgsql-hackers/2010-05/msg01418.php

regards,
Yeb Havinga

PS: Please bear with me; I thought a bit about the deduction users must make when figuring out whether the replication mode is synchronous. That question might be important when counting 'which servers are the synchronous standbys' to debug quorum settings.

With replication_mode, from the assumptions !async -> sync and !async -> recv|fsync|replay, one infers recv|fsync|replay -> synchronous replication.

With synchronous_replication_mode, from the assumptions !off -> on and !off -> recv|fsync|replay, one infers recv|fsync|replay -> synchronous replication.

I think the latter deduction is more easily made by humans, since everybody will assume !off -> on, but not !async -> sync without having that verified in the documentation.
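For concreteness, the two spellings under discussion would look roughly like this in postgresql.conf. This is an illustrative sketch only; neither syntax was committed at the time of this thread, and the values shown are arbitrary:

```
# Proposal A: async folded into the enumeration
replication_mode = 'fsync'              # async | recv | fsync | replay

# Proposal B: async spelled as 'off', outside the enumeration
synchronous_replication_mode = 'fsync'  # off | recv | fsync | replay
```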
Re: [HACKERS] Synchronous replication
Joshua Tolley wrote:
> Perhaps I'm hijacking the wrong thread for this, but I wonder if the
> quorum idea is really the best thing for us.

For reference: it appeared in a long thread a while ago, http://archives.postgresql.org/pgsql-hackers/2010-05/msg01226.php

> In short, there are three different modes: availability, performance, and
> protection. Protection appears to mean that at least one standby has
> applied the log; availability means at least one standby has received the
> log info.

Maybe we could do both, by describing use cases for the availability, performance, and protection setups in the documentation, and how they would be reflected with the standby-related parameters.

regards,
Yeb Havinga
Re: [HACKERS] Synchronous replication
Fujii Masao wrote:
> The attached patch changes the backend so that it signals walsender to
> wake up from the sleep and send WAL immediately. It doesn't include any
> other synchronous replication stuff.

Hello Fujii,

I noted the changes in XLogSend where, instead of *caughtup = true/false, it now sets *caughtup = !MyWalSnd->sndrqst. That value is initialized to false in that procedure, and it cannot be changed to true during execution of that procedure, or can it?

regards,
Yeb Havinga
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 7:39 PM, Yeb Havinga yebhavi...@gmail.com wrote:
> Hello Fujii,

Thanks for the review!

> I noted the changes in XLogSend where, instead of *caughtup = true/false,
> it now sets *caughtup = !MyWalSnd->sndrqst. That value is initialized to
> false in that procedure, and it cannot be changed to true during execution
> of that procedure, or can it?

That value is set to true in WalSndWakeup(). If WalSndWakeup() is called after the initialization of that value in XLogSend(), *caughtup is set to false.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
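The interaction being described can be sketched roughly as follows. This is a hypothetical simplification for illustration, not the actual walsender code from the patch; in particular, the real WalSndWakeup() locates the walsender's shared-memory slot rather than using MyWalSnd:

```c
/*
 * Simplified sketch: sndrqst lives in shared memory, so a committing
 * backend can flip it while the walsender is inside XLogSend().
 */
static void
XLogSend(char *msgbuf, bool *caughtup)
{
    MyWalSnd->sndrqst = false;      /* reset the wakeup request */

    /* ... read and send the currently available WAL ... */

    /*
     * Another backend may have called WalSndWakeup() concurrently,
     * setting sndrqst = true again.  Reporting !sndrqst as "caught up"
     * makes the walsender main loop send the newly written WAL
     * immediately instead of going back to sleep.
     */
    *caughtup = !MyWalSnd->sndrqst;
}

/* Called by a backend at commit to request an immediate send. */
void
WalSndWakeup(void)
{
    MyWalSnd->sndrqst = true;       /* sets the walsender's slot */
}
```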
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 5:42 PM, Yeb Havinga yebhavi...@gmail.com wrote:
> I'd like to bring forward another suggestion (please tell me when it is
> becoming spam). My feeling about replication_mode as is, is that it says
> in the same parameter something about async or sync, as well as, if sync,
> which method of feedback to the master. OTOH having two parameters would
> need documentation that the feedback method may only be set if the
> replication_mode was sync, as well as checks. So it is actually good to
> have it all in one parameter. But somehow the shoe pinches, because async
> feels different from the other three parameters. There is a way to move
> async out of the enumeration:
>
>   synchronous_replication_mode = off | recv | fsync | replay

ISTM that we need to get more feedback from users to determine which is the best. So, how about leaving the parameter as it is and revisiting this topic later? Since it's not difficult to change the parameter later, we will not regret it even if we delay that decision.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Synchronous replication
Fujii Masao wrote:
>> I noted the changes in XLogSend where, instead of *caughtup = true/false,
>> it now sets *caughtup = !MyWalSnd->sndrqst. That value is initialized to
>> false in that procedure, and it cannot be changed to true during
>> execution of that procedure, or can it?
> That value is set to true in WalSndWakeup(). If WalSndWakeup() is called
> after the initialization of that value in XLogSend(), *caughtup is set to
> false.

Ah, so it can be changed by another backend process.

Another question: is there a reason not to send the signal in XLogFlush() itself, so that it would be sent at CreateCheckPoint(), EndPrepare(), FlushBuffer(), RecordTransactionAbortPrepared(), RecordTransactionCommit(), RecordTransactionCommitPrepared(), RelationTruncate(), SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(), and xact_redo_commit()?

regards,
Yeb Havinga
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 01:41:10PM +0900, Fujii Masao wrote:
> On Tue, Jul 27, 2010 at 12:36 PM, Joshua Tolley eggyk...@gmail.com wrote:
>> Perhaps I'm hijacking the wrong thread for this, but I wonder if the
>> quorum idea is really the best thing for us. I've been thinking about
>> Oracle's way of doing things[1]. In short, there are three different
>> modes: availability, performance, and protection. Protection appears to
>> mean that at least one standby has applied the log; availability means at
>> least one standby has received the log info (it doesn't specify whether
>> that info has been fsynced or applied, but presumably does not mean
>> applied, since it's distinct from protection mode); performance means
>> replication is asynchronous. I'm not sure this method is perfect, but it
>> might be simpler than the quorum behavior that has been considered, and
>> adequate for actual use cases.
>
> In my case, I'd like to set up one synchronous standby on the near rack
> for high availability, and one asynchronous standby on the remote site
> for disaster recovery. Can Oracle's way cover the case?

I don't think it can support the case you're interested in, though I'm not terribly expert on it. I'm definitely not arguing for the syntax Oracle uses, or something similar; I much prefer the flexibility we're proposing, and agree with Yeb Havinga, who in another email suggests we spell out in documentation some recipes for achieving various possible scenarios given whatever GUCs we settle on.

> availability mode with two standbys might create a sort of similar
> situation. That is, since the ACK from the near standby arrives first, the
> near standby acts synchronous and the remote one asynchronous. But the ACK
> from the remote standby can arrive first, so it's not guaranteed that the
> near standby has received the log info before transaction commit returns a
> success to the client. In this case, we have to fail over to the remote
> standby even if it's not under control of a clusterware. This is a problem
> for me.

My concern is that in a quorum system, if the quorum number is less than the total number of replicas, there's no way to know *which* replicas composed the quorum for any given transaction, so we can't know which servers to fail over to if the master dies. This isn't different from Oracle, where it looks like essentially the quorum value is always 1. Your scenario shows that all replicas are not created equal, and that sometimes we'll be interested in WAL getting committed on a specific subset of the available servers. If I had two nearby replicas called X and Y, and one at a remote site called Z, for instance, I'd set quorum to 2, but really I'd want to say "wait for servers X and Y before committing, but don't worry about Z". I have no idea how to set up our GUCs to encode a situation like that :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 8:48 PM, Yeb Havinga yebhavi...@gmail.com wrote:
> Is there a reason not to send the signal in XLogFlush itself, so it would
> be sent at CreateCheckPoint(), EndPrepare(), FlushBuffer(),
> RecordTransactionAbortPrepared(), RecordTransactionCommit(),
> RecordTransactionCommitPrepared(), RelationTruncate(),
> SlruPhysicalWritePage(), write_relmap_file(), WriteTruncateXlogRec(),
> and xact_redo_commit()?

Yes; it's because there is no need to send WAL immediately anywhere other than the following functions:

* EndPrepare()
* RecordTransactionAbortPrepared()
* RecordTransactionCommit()
* RecordTransactionCommitPrepared()

Some functions call XLogFlush() to follow the basic WAL rule: WAL records are always flushed to disk prior to any corresponding data-file change. So we don't need to replicate the result of XLogFlush() immediately for the sake of the WAL rule.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 10:12 PM, Joshua Tolley eggyk...@gmail.com wrote:
> I don't think it can support the case you're interested in, though I'm not
> terribly expert on it. I'm definitely not arguing for the syntax Oracle
> uses, or something similar; I much prefer the flexibility we're proposing,
> and agree with Yeb Havinga, who in another email suggests we spell out in
> documentation some recipes for achieving various possible scenarios given
> whatever GUCs we settle on.

Agreed. I'll add it to my TODO list.

> My concern is that in a quorum system, if the quorum number is less than
> the total number of replicas, there's no way to know *which* replicas
> composed the quorum for any given transaction, so we can't know which
> servers to fail over to if the master dies.

What about checking the current WAL receive location of each standby by using pg_last_xlog_receive_location()? The standby which has the newest location should be failed over to.

> This isn't different from Oracle, where it looks like essentially the
> quorum value is always 1. [...] If I had two nearby replicas called X and
> Y, and one at a remote site called Z, for instance, I'd set quorum to 2,
> but really I'd want to say "wait for servers X and Y before committing,
> but don't worry about Z". I have no idea how to set up our GUCs to encode
> a situation like that :)

Yeah, quorum commit alone cannot cover that situation. I think that the current approach (i.e., quorum commit plus a replication mode per standby) would cover it. In your example, you can choose recv, fsync or replay as replication_mode on X and Y, and choose async on Z.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
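The failover check suggested above can be sketched as follows. The standby names and LSN values here are invented for illustration; in practice each location would come from running pg_last_xlog_receive_location() on the corresponding standby:

```python
# Pick the failover target by comparing the WAL receive locations
# reported by each standby.  LSNs have the textual form 'XLOGID/XRECOFF'
# (two hexadecimal numbers), so they must be parsed before comparison.

def lsn_to_int(lsn):
    """Convert an LSN string like '0/3000158' to a comparable integer."""
    hi, lo = lsn.split('/')
    return (int(hi, 16) << 32) | int(lo, 16)

def newest_standby(locations):
    """locations: dict mapping standby name -> LSN string.
    Returns the name of the standby with the newest receive location."""
    return max(locations, key=lambda name: lsn_to_int(locations[name]))

# Sample values, as if collected from three standbys:
locations = {
    'standby_x': '0/3000158',
    'standby_y': '0/3000220',
    'standby_z': '0/2FFFFD0',
}
print(newest_standby(locations))  # -> standby_y
```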
Re: [HACKERS] Synchronous replication
On Tue, Jul 27, 2010 at 10:53:45PM +0900, Fujii Masao wrote:
> What about checking the current WAL receive location of each standby by
> using pg_last_xlog_receive_location()? The standby which has the newest
> location should be failed over to.

That makes sense. Thanks.

> Yeah, quorum commit alone cannot cover that situation. I think that
> current approach (i.e., quorum commit plus replication mode per standby)
> would cover that. In your example, you can choose recv, fsync or replay
> as replication_mode in X and Y, and choose async in Z.

Clearly I need to read through the GUCs and docs better. I'll try to keep quiet until that's finished :)

--
Joshua Tolley / eggyknap
End Point Corporation
http://www.endpoint.com
Re: [HACKERS] [COMMITTERS] pgsql: Add restart_after_crash GUC.
On Tue, Jul 27, 2010 at 2:33 AM, Fujii Masao masao.fu...@gmail.com wrote:
> On Tue, Jul 20, 2010 at 9:47 AM, Robert Haas rh...@postgresql.org wrote:
>> Log Message:
>> -----------
>> Add restart_after_crash GUC.
>
> In postgresql.conf.sample, on/off is used as a boolean value. But
> true/false is used for exit_on_error and restart_after_crash. Sorry, I
> had overlooked that inconsistency when reviewing the original patch. I
> attached the bug-fix patch.

Good catch, thanks. Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] Synchronous replication
On 27 Jul 2010, at 15:12, Joshua Tolley eggyk...@gmail.com wrote:
> My concern is that in a quorum system, if the quorum number is less than
> the total number of replicas, there's no way to know *which* replicas
> composed the quorum for any given transaction, so we can't know which
> servers to fail over to if the master dies. This isn't different from
> Oracle, where it looks like essentially the quorum value is always 1.
> Your scenario shows that all replicas are not created equal, and that
> sometimes we'll be interested in WAL getting committed on a specific
> subset of the available servers. If I had two nearby replicas called X
> and Y, and one at a remote site called Z, for instance, I'd set quorum to
> 2, but really I'd want to say "wait for servers X and Y before
> committing, but don't worry about Z". I have no idea how to set up our
> GUCs to encode a situation like that :)

You make it so that Z does not take a vote, by setting it async.

Regards,
--
dim
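A sketch of that configuration, using the parameter names from the patch under discussion. This is hypothetical, uncommitted syntax; the GUC names and placement may not match what was eventually released:

```
# On the master: wait for acknowledgements from two voting standbys.
quorum = 2

# On standbys X and Y: synchronous, each counts toward the quorum.
replication_mode = 'fsync'    # or 'recv' / 'replay'

# On standby Z: asynchronous, takes no vote in the quorum.
replication_mode = 'async'
```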
Re: [HACKERS] merge command - GSoC progress
On Tue, Jul 27, 2010 at 1:04 AM, Boxuan Zhai bxzhai2...@gmail.com wrote:
> I have got an edition of the merge command that can run. It accepts the
> standard merge command and can do UPDATE, INSERT and DELETE actions now.
> But we cannot put additional qualifications on actions yet. There are
> some bugs when we try to evaluate the quals which make the system quit.
> I will fix them soon.

This patch doesn't compile. You're using zbxprint() from a bunch of places where it's not defined. I get compile warnings for all of those files and then a link failure at the end. You might find it useful to create src/Makefile.custom in your local tree and put COPT=-Werror in there; it tends to prevent problems of this kind.

Undefined symbols:
  _zbxprint, referenced from:
      _transformStmt in analyze.o
      _ExecInitMergeAction in nodeModifyTable.o
      _ExecModifyTable in nodeModifyTable.o
      _ExecInitModifyTable in nodeModifyTable.o
      _merge_action_planner in planner.o

Not that it's as high-priority as getting this fully working, but you should revert the useless changes in this patch - e.g. the one-line change to heaptuple.c is obvious debugging leftovers, and all of the changes to execQual.c and execUtils.c are whitespace-only. You should also try to make your code and comments conform to project style guidelines. In general, you'll find it easier to keep track of your changes (and you'll have fewer spurious changes) if you use git diff master...yourbranch instead of marking comments, etc. with ZBX or similar.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
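The Makefile.custom suggestion above looks like this in practice (a sketch assuming a standard PostgreSQL source tree; the file is not created by default and is simply included by the build if present):

```
# src/Makefile.custom -- local build customizations, not tracked in the repo.
# Turning all compiler warnings into errors makes calls to undeclared
# functions like zbxprint() fail at compile time instead of at link time.
COPT = -Werror
```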
[HACKERS] PostGIS vs. PGXS in 9.0beta3
Hackers,

A 9.0b3 tester reported this issue with our single most popular PostgreSQL extension, PostGIS:

==
Postgis makes use of 'PGXS' in postgresql 8.2. Within postgresql-9, datadir and many other variables are defined as multiple values with an append operator, like this:

$ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global
[snip]
datadir := /usr/pgsql-9.0/share

Postgis-1.5.1's Makefile.pgxs makefile override tries to use datadir as a flat variable, and doesn't try to expand the array-like structure with a for loop or similar. So when 'make install' within postgis-1.5 configured against pgsql-9.x is run, 'make' treats the second value within DATADIR as a command it should just run, which fails:

http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html

I've tried the latest tarball SVN exports of both postgis-1.5.x and postgis-2.x without success.
==

I'm unsure of what the resolution of this issue should be ... it seems like the fix belongs in PostGIS ... but I also think we can't go final until this is resolved, given its popularity.

--
Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Re: [HACKERS] Query optimization problem
Robert Haas robertmh...@gmail.com writes:
> On Tue, Jul 20, 2010 at 11:23 AM, Dimitri Fontaine dfonta...@hi-media.com wrote:
>> The specific diff between the two queries is:
>>
>>   JOIN DocPrimary d2 ON d2.BasedOn=d1.ID
>> - WHERE (d1.ID=234409763) or (d2.ID=234409763)
>> + WHERE (d2.BasedOn=234409763) or (d2.ID=234409763)
>>
>> So the OP would appreciate that the planner is able to consider applying
>> the restriction on d2.BasedOn rather than d1.ID, given that d2.BasedOn is
>> the same thing as d1.ID, from the JOIN. I have no idea if Equivalence
>> Classes are where to look for this, whether they're meant to extend up to
>> there, or whether that's something possible or wise to implement, though.
>
> I was thinking of the equivalence class machinery as well. I think the OR
> clause may be the problem. If you just had d1.ID=constant, I think it
> would infer that d1.ID, d2.BasedOn, and the constant formed an
> equivalence class.

Right. Because of the OR, it is *not* possible to conclude that d2.basedon is always equal to 234409763, which is the implication of putting them into an equivalence class. In the example, we do have d1.id and d2.basedon grouped in an equivalence class. So in principle you could substitute d1.id into the WHERE clause in place of d2.basedon, once you'd checked that it was being used with an operator that's compatible with the specific equivalence class (i.e., it's in one of the eclass's opfamilies, I think). The problem is to recognize that such a rewrite would be a win --- it could just as easily be a big loss. Even if we understood how to direct the rewriting process, I'm really dubious that it would win often enough to justify the added planning time.

The particular problem here seems narrow enough that solving it on the client side is probably a whole lot easier and cheaper than trying to get the planner to do it.

regards, tom lane
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
On Tue, Jul 27, 2010 at 1:13 PM, Josh Berkus j...@agliodbs.com wrote:
> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html

It's not obvious that there's an unresolved issue here; downthread there's some indication this might be an environment problem?

http://postgis.refractions.net/pipermail/postgis-users/2010-May/026658.html

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
[HACKERS] do we need to postpone beta4?
I think we should consider postponing beta4. I count eleven non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there are currently five items on the open 9.0 issues list, at least one of which appears to be a new bug in 9.0, and we just got a bug report on pgsql-bugs from Valentine Gogichashvili complaining of what looks to be a crasher in the redo path for heap_xlog_update(). It seems unlikely at this point that we can have all of these issues fixed and still have time for a full buildfarm cycle before the wrap. Does it make sense to put out a beta with known bugs (presumably requiring another beta) at this point, or should we push this off a bit?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] do we need to postpone beta4?
Robert Haas robertmh...@gmail.com writes:
> I think we should consider postponing beta4. [...] Does it make sense to
> put out a beta with known bugs (presumably requiring another beta) at
> this point, or should we push this off a bit?

If we don't wrap a beta this week, the next possible window is several weeks away, because various people will be on vacation. So I think we should get the existing fixes out there, even if there are known bugs remaining. A beta is not an RC.

regards, tom lane
[HACKERS] page corruption on 8.3+ that makes it to standby
I reported a problem here:

http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php

Perhaps I used a poor subject line, but I believe it's a serious issue. That reproducible sequence seems like an obvious bug to me on 8.3+, and what's worse, the corruption propagates to the standby, as I found out today (through a test, fortunately). The only mitigating factor is that it doesn't actually lose data, and you can fix it (I believe) with zero_damaged_pages (or careful use of dd).

There are two fixes that I can see:

1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and PageSetTLI() if the page is not new. This seems slightly awkward because most WAL replay stuff doesn't have to worry about zero pages, but in this case I think it does.

2. Have copy_relation_data() initialize new pages. I don't like this because (a) it's not really the job of SET TABLESPACE to clean up zero pages; and (b) it could be an index with a different special size, etc., and it doesn't seem like a good place to figure that out.

Comments?

Regards,
Jeff Davis
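Fix 1 would amount to guarding the page-header updates with a new-page check, roughly like this. This is a sketch only, not the committed change; PageIsNew() tests for an all-zeros page header:

```c
/* Sketch of fix 1, e.g. in heap_xlog_newpage() after restoring the image: */
if (!PageIsNew(page))
{
    /*
     * Only stamp LSN/TLI on initialized pages.  Writing into the header
     * of an all-zeros page would produce a page that is neither new nor
     * valid, which later sanity checks reject as corruption.
     */
    PageSetLSN(page, lsn);
    PageSetTLI(page, ThisTimeLineID);
}
```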
Re: [HACKERS] do we need to postpone beta4?
On Tue, 2010-07-27 at 14:11 -0400, Tom Lane wrote:
> If we don't wrap a beta this week, the next possible window is several
> weeks away, because various people will be on vacation. So I think we
> should get the existing fixes out there, even if there are known bugs
> remaining. A beta is not an RC.

If we document the bugs, then +1; if we don't, -1. I.e., we let people know where we KNOW there are warts.

JD

--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering
http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Robert Haas wrote:
> On Tue, Jul 27, 2010 at 1:13 PM, Josh Berkus j...@agliodbs.com wrote:
>> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026654.html
> It's not obvious that there's an unresolved issue here; downthread there's
> some indication this might be an environment problem?
> http://postgis.refractions.net/pipermail/postgis-users/2010-May/026658.html

No, I have just reproduced this with a totally vanilla environment. I think PostGIS probably needs to patch their makefile(s).

cheers

andrew
Re: [HACKERS] git config user.email
Robert Haas robertmh...@gmail.com writes:
> On Thu, Jul 22, 2010 at 5:41 AM, Magnus Hagander mag...@hagander.net wrote:
>> *Personally*, I'd prefer to keep using my main email address for commits.
> As for me, I'd much prefer to be rh...@postgresql.org than
> robertmh...@gmail.com.

Prefer is exactly the key word here. I see no reason not to let each committer exercise his personal preference as to which address to use. We should suggest that reasonably stable ones be chosen, but it's not the project's business to make that decision for people. And in any case it's impossible to be sure of the longevity of email addresses more than a few years out, unless your crystal ball works a lot better than mine.

(My own take is that I absolutely refuse to use t...@postgresql.org as a primary mail address. Its spam filtering is nearly nonexistent. What comes through there is not *quite* filed directly to /dev/null here, but it's darn close to that.)

regards, tom lane
[HACKERS] ALTER TABLE ... DISABLE TRIGGER vs. AccessExclusiveLock
Hackers,

Experience and a read through backend/commands/tablecmds.c's AlterTable() indicate that ALTER TABLE ... DISABLE TRIGGER obtains an exclusive lock on the table (as does any ALTER TABLE). Blocking other readers of a table when, within the body of a transaction, we're performing a bulk update operation where we don't want or need triggers to fire seems at first glance to be overkill.

I can see how AlterTable()'s complex logic is made less complex through 'get and keep a big lock', since most of its operational modes really do need exclusive access, but is it strictly required for ... DISABLE / ENABLE TRIGGER? Could, say, RowExclusiveLock hypothetically provide adequate protection, allowing concurrent reads but blocking out any other writers (for ENABLE / DISABLE TRIGGER) -- such as if driven through a new statement other than ALTER TABLE, e.g. DISABLE TRIGGER foo ON tbar?

Thanks!

James Robinson
Socialserve.com
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:
> If we don't wrap a beta this week, the next possible window is several
> weeks away, because various people will be on vacation. So I think we
> should get the existing fixes out there, even if there are known bugs
> remaining. A beta is not an RC.

Well, that's pretty much saying we won't release before September. Which is kind of a bummer, but I guess that's what happens when we get into vacation season.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company
Re: [HACKERS] ALTER TABLE ... DISABLE TRIGGER vs. AccessExclusiveLock
On Tue, Jul 27, 2010 at 3:07 PM, James Robinson jlrob...@socialserve.com wrote: Experience and a read through backend/commands/tablecmds.c's AlterTable() indicate that ALTER TABLE ... DISABLE TRIGGER obtains an exclusive lock on the table (as does any ALTER TABLE). Blocking other readers from a table when we're, within the body of a transaction, performing a bulk update operation where we don't want or need triggers to fire seems at first glance to be overkill. I can see how AlterTable()'s complex logic is made less complex through 'get and keep a big lock', since most of its operational modes really do need exclusive access, but is it strictly required for ... DISABLE / REENABLE TRIGGER? Could, say, RowExclusiveLock hypothetically provide adequate protection, allowing concurrent reads but blocking out any other writers (for ENABLE / DISABLE TRIGGER) -- say, if driven through a new statement other than ALTER TABLE, such as DISABLE TRIGGER foo ON tbar? Funny you should mention this. There is a pending patch to do something very much along these lines. https://commitfest.postgresql.org/action/patch_view?id=347 -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Preliminary review of Synchronous Replication patches
Kevin Grittner wrote: Unless there are objections, I will mark the patch by Zoltán Böszörményi as Returned with Feedback in a couple days, and ask that everyone interested in this feature focus on advancing the patch by Fujii Masao. Given the scope and importance of this area, I think we could benefit from another person or two signing on officially as Reviewers. Hello Zoltán, Fujii, Kevin and list, Here follows a severely time-boxed review of both synchronous replication patches. Expect this review to be incomplete, though I believe the general remarks to be valid. Off the list I've received word from Zoltan that work on a new patch is planned. It would be great if ideas from both patches could be merged into one. regards, Yeb Havinga Patch A: Zoltán Böszörményi Patch B: Fujii Masao * Is the patch in context diff format? A: Yes B: Yes * Does it apply cleanly to the current CVS HEAD? A: No B: Yes * Does it include reasonable tests, necessary doc patches, etc? A: Doc patches, and a contrib module for debugging purposes. B: Doc patches. For neither patch is the documentation final, see e.g. * 25.2 - The section starting with It should be noted that the log shipping is asynchronous, i.e., should be updated. * 25.2.5 - Ditto for Streaming replication is asynchronous, so there... Testing any replication setup is not possible with the current regression tool suite, and adding that is beyond the scope of any synchronous replication patch. Perhaps the work of Markus Wanner with dtester (that adds a make dcheck) can be assimilated for this purpose. Both patches add synchronous replication to the current single-master streaming replication features in 9.0, which means that there now is a mechanism where a commit on the master does not finish until 'some or more of the replicas are in sync'. * Does the patch actually implement that? A: Yes B: No; if no standbys are connected, commits do not wait. (If I understand the code from WaitXLogSend correctly) * Do we want that? 
For database users who never want to lose a single committed record, synchronous replication is a relatively easy setup to achieve a high level of 'security' (where secure means: less chance of losing data). The easiness is relative to current solutions (the whole range of mirrored disks, controllers, backups, log shipping etc). * Do we already have it? No * Does it follow SQL spec, or the community-agreed behavior? A: Unknown, though the choices in guc parameters suggest the patch's been made to support actual use cases. B: Design of patch B has been discussed on the mailing list as well as the wiki (for links I refer to my previous mail) * Are there dangers? Besides the usual errors causing transactions to fail, in my opinion the single largest danger would be that the master thinks a standby is in sync, where in fact it isn't. * Have all the bases been covered? Patch A seems to cover two-phase commit where patch B does not. (I'm time constrained and currently do not have a replicated cluster with patch A anymore, so I cannot test). # Does the feature work as advertised? Patch A: yes Patch B: yes I ran a few tests with failures and had some recoverable problems, of which I'm unsure whether they are related to the sync rep patches or to streaming replication in general. Work in the direction of a replication regression test would be useful. # Are there any assertion failures or crashes? No A note: reading the source code of both patches makes it clear that these are patches from experienced PostgreSQL programmers. The source code reads just like 'normal' high quality postgresql source. Differences: Patch A synchronizes by sending back the Xids that have been received and written to disk. When the master receives the xids, the respective backends having those xids active are unlocked and signalled to continue. This means some locking of the procarray. The libpq protocol was adapted so a client (the walreceiver) can report back the xids. 
Patch B synchronizes by waiting to send WAL data until the sync replicas have sent back LSN pointers into the WAL files they have synced, in one of three ways. The libpq protocol was adapted so the walreceiver can report back WAL file positions. Perhaps the weakness of both patches is that they are not one. In my opinion, both patches have their strengths. It would be great if these strengths could be unified in a single patch. patch A strengths: * design of the guc parameters - meaning of min_sync_replication_clients. As I understand it, it is not possible with patch B to define a minimum number of synchronous standby servers a transaction must be replicated to before the commit succeeds. Perhaps the name could be changed (quorum_min_sync_standbys?), but in my opinion the definitive sync replication patch needs a parameter with this number and exact meaning. The 'synchronous_slave' boolean guc is also a good one in my opinion. patch B strengths: * having the walsender synced by waiting for
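The min_sync_replication_clients idea described above (a commit does not return until a quorum of standbys has acknowledged the transaction) can be sketched as a toy model. This is illustrative Python, not either patch's C code; all names (QuorumWaiter, ack, wait_for_commit) are made up for the sketch:

```python
import threading

class QuorumWaiter:
    """Toy model of quorum-based synchronous commit: the committing
    backend blocks until at least `quorum` standbys have acked its xid."""
    def __init__(self, quorum):
        self.quorum = quorum
        self.acks = {}               # xid -> number of standby acks so far
        self.cond = threading.Condition()

    def ack(self, xid):
        # Called on the master when walreceiver feedback for xid arrives.
        with self.cond:
            self.acks[xid] = self.acks.get(xid, 0) + 1
            self.cond.notify_all()

    def wait_for_commit(self, xid, timeout=5.0):
        # The backend committing xid blocks here; True once the quorum acked.
        with self.cond:
            return self.cond.wait_for(
                lambda: self.acks.get(xid, 0) >= self.quorum, timeout)

w = QuorumWaiter(quorum=2)
for standby in range(2):                       # two standbys ack xid 100
    threading.Thread(target=w.ack, args=(100,)).start()
print(w.wait_for_commit(100))                  # True
```

A timeout on the wait is one way to avoid the danger noted above of a master blocking forever on a standby it wrongly believes is in sync.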
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, Jul 27, 2010 at 2:06 PM, Jeff Davis pg...@j-davis.com wrote: I reported a problem here: http://archives.postgresql.org/pgsql-bugs/2010-07/msg00173.php Perhaps I used a poor subject line, but I believe it's a serious issue. That reproducible sequence seems like an obvious bug to me on 8.3+, and what's worse, the corruption propagates to the standby as I found out today (through a test, fortunately). I think that the problem is not so much your choice of subject line as your misfortune to discover this bug when Tom and Heikki were both on vacation. The only mitigating factor is that it doesn't actually lose data, and you can fix it (I believe) with zero_damaged_pages (or careful use of dd). There are two fixes that I can see: 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and PageSetTLI() if the page is not new. This seems slightly awkward because most WAL replay stuff doesn't have to worry about zero pages, but in this case I think it does. 2. Have copy_relation_data() initialize new pages. I don't like this because (a) it's not really the job of SET TABLESPACE to clean up zero pages; and (b) it could be an index with different special size, etc., and it doesn't seem like a good place to figure that out. It appears to me that all of the callers of log_newpage() other than copy_relation_data() do so with pages that they've just constructed, and which therefore can't be new. So maybe we could just modify copy_relation_data() to check PageIsNew(buf), or something like that, and only call log_newpage() if that returns false. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
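Fix option 1 above can be sketched with a toy model: during redo, skip stamping an LSN onto a still-zeroed ("new") page, so a zero page on the standby stays recognizably new. This is illustrative Python, not PostgreSQL's actual Page API; page_is_new and heap_xlog_newpage_redo are stand-ins for the real macros and redo routine:

```python
BLCKSZ = 8192  # PostgreSQL's default block size

def page_is_new(page: bytearray) -> bool:
    # Stand-in for PageIsNew(): a page is new while it is all zeroes
    # (the real macro checks pd_upper == 0 in the page header).
    return not any(page)

def heap_xlog_newpage_redo(page: bytearray, lsn: int) -> None:
    # Only stamp LSN/TLI onto a page that actually contains data;
    # writing into a zero page would make it look non-new but corrupt.
    if not page_is_new(page):
        page[0:8] = lsn.to_bytes(8, "little")  # stand-in for PageSetLSN/TLI

zero_page = bytearray(BLCKSZ)
used_page = bytearray(BLCKSZ)
used_page[100] = 1                             # pretend this page holds data

heap_xlog_newpage_redo(zero_page, 1234)
heap_xlog_newpage_redo(used_page, 1234)
print(page_is_new(zero_page))                  # True  -- left untouched
print(used_page[0:8] != bytes(8))              # True  -- LSN was stamped
```

The point of the guard is exactly the one made in the email: a zero page that gets only an LSN written into it is neither a valid page nor recognizably new, which is what zero_damaged_pages would otherwise have to clean up.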
Re: [HACKERS] do we need to postpone beta4?
Robert Haas wrote: On Tue, Jul 27, 2010 at 2:11 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: I think we should consider postponing beta4. I count eleven non-documentation, 9.0-specific bug fixes on REL9_0_STABLE, but there are currently five items on the open 9.0 issues list, at least one of which appears to be a new bug in 9.0, and we just got a bug report on pgsql-bugs from Valentine Gogichashvili complaining of what looks to be a crasher in the redo path for heap_xlog_update(). It seems unlikely at this point that we can have all of these issues fixed and still have time for a full buildfarm cycle before the wrap. Does it make sense to put out a beta with known bugs (presumably requiring another beta) at this point, or should we push this off a bit? If we don't wrap a beta this week, the next possible window is several weeks away, because various people will be on vacation. So I think we should get the existing fixes out there, even if there are known bugs remaining. A beta is not an RC. Well, that's pretty much saying we won't release before September. Which is kind of a bummer, but I guess that's what happens when we get into vacation season. Yeah, if we are lucky we can do RC1 in mid-August and still release final in August, but that assumes everything is done by then, and that we only need 1-2 RCs. Early September looks more likely. The good thing is that we are deciding now and are continually seeing if we can tighten the schedule. (If we stop trying to tighten the schedule, we get a loose schedule, which no one likes.) -- Bruce Momjian br...@momjian.us http://momjian.us EnterpriseDB http://enterprisedb.com + It's impossible for everything to be true. + -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] multibyte character set in levenshtein function
Here is a new version of my patch. There are the following changes:

1) I've merged the single-byte and multibyte versions of levenshtein_internal and levenshtein_less_equal_internal using macros and includes.

2) I found that levenshtein takes reasonable time even for long strings. Here is an example with strings which contain random combinations of words.

test=# select count(*) from words;
 count
-------
 98569
(1 row)

test=# select * from words limit 10;
  id   | word                                                                               | next_id
-------+------------------------------------------------------------------------------------+---------
   200 | Agnew diodes drilled composts Wassermann nonprofit adjusting Chautauqua           |   17370
  4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither                    |   70983
  5418 | Eroses gauchos bowls smellier weeklies tormentors cornmeal punctuality             |   96685
  6103 | G prompt boron overthrew cricking begonia snuffers blazed                          |   34518
  4707 | Djibouti Mari pencilling approves pointing's pashas magpie rams                    |   71200
 10125 | Lyle Edgardo livers gruelled capable cancels gnawing's inhospitable                |   28018
  9615 | Leos deputy evener yogis assault palatals Octobers surceased                       |   21878
 15640 | Starr's arrests malapropism Koufax's aesthetes Howell rustier Algeria              |   19316
 15776 | Sucre battle's misapprehension's predicating nutmegged electrification bowl planks |   65739
 16878 | Updike Forster ragout keenly exhalation grayed switch guava's                      |   43013
(10 rows)
Time: 26,125 ms
Time: 137,061 ms

test=# select avg(length(word)) from words;
         avg
---------------------
 74.6052207083362923
(1 row)
Time: 96,415 ms

test=# select * from words where levenshtein(word, 'Dhaka craziness savvies teeae luhs Barentss unwe zher')=10;
  id  | word                                                            | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)
Time: 2957,426 ms

test=# select * from words where levenshtein_less_equal(word, 'Dhaka craziness savvies teeae luhs Barentss unwe zher',10)=10;
  id  | word                                                            | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)
Time: 84,755 ms

It takes O(max(n, m) * max_d / min(ins_c, del_c)) in the worst case. 
That's why I changed the limit of this function to max_d = 255 * min(ins_c, del_c).

3) I found that there is no need to store character offsets in levenshtein_less_equal_internal_mb. Instead, the first position in the string where the value of the distance wasn't greater than max_d can be stored.

4) I found that there is a theoretical maximum distance based on the string lengths. It represents the case when no characters match. If the theoretical maximum distance is less than the given maximum distance, we can use the theoretical maximum distance. It improves the behavior of levenshtein_less_equal with high max_d, but it's still slower than levenshtein:

test=# select * from words where levenshtein_less_equal(word, 'Dhaka craziness savvies teeae luhs Barentss unwe zher',1000)=10;
  id  | word                                                            | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)
Time: 4102,731 ms

test=# select * from words where levenshtein(word, 'Dhaka craziness savvies teeae luhs Barentss unwe zher')=10;
  id  | word                                                            | next_id
------+-----------------------------------------------------------------+---------
 4589 | Dhaka craziness savvies teenager ploughs Barents's unwed zither |   70983
(1 row)
Time: 2983,244 ms

With best regards, Alexander Korotkov. fuzzystrmatch-0.5.tar.gz Description: GNU Zip compressed data -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
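The early-exit idea behind levenshtein_less_equal (and the theoretical-maximum clamp from point 4) can be sketched in Python. This is not the patch's C code, just a minimal model of the algorithm: stop as soon as every cell in the current row exceeds max_d, and clamp max_d by the all-deletes-plus-all-inserts cost, which is an upper bound on the true distance:

```python
def levenshtein_less_equal(s, t, max_d, ins_c=1, del_c=1, sub_c=1):
    """Levenshtein with cutoff: returns the exact distance when it is
    <= max_d, and some value > max_d otherwise (early exit)."""
    m, n = len(s), len(t)
    # Theoretical maximum: delete all of s, then insert all of t.
    # The true distance never exceeds it, so tightening max_d is safe.
    max_d = min(max_d, m * del_c + n * ins_c)
    prev = [j * ins_c for j in range(n + 1)]
    for i in range(1, m + 1):
        cur = [i * del_c] + [0] * n
        row_min = cur[0]
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else sub_c
            cur[j] = min(prev[j] + del_c,      # delete from s
                         cur[j - 1] + ins_c,   # insert into s
                         prev[j - 1] + cost)   # substitute / match
            row_min = min(row_min, cur[j])
        if row_min > max_d:
            # Every path passes through this row, so the final
            # distance is at least row_min: give up early.
            return max_d + 1
        prev = cur
    return prev[n]

print(levenshtein_less_equal("kitten", "sitting", 10))  # 3
```

The early exit is what makes the filtered query above fast: most rows of the table blow past max_d after a few matrix rows and are rejected without computing the full distance.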
[HACKERS] Improper usage of MemoryContext in nodeSubplan.c's buildSubPlanHash() function. This may cause memory allocation failures.
Hi all, There is a potential memory error in nodeSubplan.c or execGrouping.c. Use select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME); to reproduce this bug. You can see the memory content that slot1's tts_values[0] points to before and after the statement MemoryContextReset(evalContext) of execTuplesMatch(), in a PG build configured with the --enable-cassert switch on. You will find that the memory is set to 0x7f7f7f7f. Here is my code analysis: The executor node SubPlanState has an ExprContext named 'innerecontext' that is created by itself in the function ExecInitSubPlan(). The 'innerecontext' is also used to build a projection info, 'projRight', of the SubPlanState node. When the SubPlanState node is executed, a hash table for the subplan will be built, so buildSubPlanHash() will be called. In buildSubPlanHash(), the 'hashtable' of SubPlanState will be initialized by calling BuildTupleHashTable(), and the code passes the 'ecxt_per_tuple_memory' of 'innerecontext' to BuildTupleHashTable() as the 'hashtable''s tempcxt. At this point, we can conclude that 'projRight' and the 'hashtable''s 'tempcxt' of the SubPlanState node are all using the same memory context, namely the 'innerecontext' of the SubPlanState node. So: 1) The memory of all the tuples fetched from 'projRight' will be located in the 'innerecontext' of the SubPlanState node. 2) All other intermediate results located in the 'hashtable''s tempcxt will, in fact, also reside in the 'innerecontext' of the SubPlanState node. Now next: In buildSubPlanHash(), we fetch tuples from 'projRight' to fill the hash table of the SubPlanState node. As we know, all the fetched tuples are located in the 'innerecontext'. When filling the tuple hash table, if the tuple already exists in the hash table of the SubPlanState node, the match function execTuplesMatch() will be called for tuple matching, but in this function the statement MemoryContextReset(evalContext); will reset 'evalContext' to free some memory. 
In fact, the 'evalContext' is the same as the 'innerecontext' that the fetched tuples are located in, so this statement actually resets the 'innerecontext', and the fetched tuples' memory is all lost. That's the problem. In fact, this error shows up only in debug builds, not in release builds. In a debug build configured with --enable-cassert, the freed memory will be overwritten with 0x7f7f7f7f, but not in a release build. With this erroneous memory usage, PG does not always report an error, because with the 0x7f7f7f7f pattern the testing macros VARATT_IS_COMPRESSED() and VARATT_IS_EXTERNAL() are always false on !WORDS_BIGENDIAN platforms such as i386 GNU Linux, while on WORDS_BIGENDIAN platforms the testing macro VARATT_IS_COMPRESSED() will be true and an error may be reported, for example on AIX (Power 6, AIX V6.1). To fix this problem, we can pass another memory context in to BuildTupleHashTable() as the hashtable's tempcxt, or use some other memory context as the hash table's tempcxt, or other ways. Any questions, please contact me. Regards -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
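The aliasing bug described above can be illustrated with a toy model. This is hypothetical Python, not PostgreSQL's API: a context that clobbers its allocations on reset (mimicking --enable-cassert's 0x7f fill), a hash table holding references into that context, and a reset that destroys the stored tuples:

```python
class MemoryContext:
    """Toy memory context: alloc() hands out chunks it owns; reset()
    clobbers them in place with 0x7f, like CLOBBER_FREED_MEMORY."""
    def __init__(self):
        self.chunks = []
    def alloc(self, data):
        chunk = bytearray(data)
        self.chunks.append(chunk)
        return chunk
    def reset(self):
        for chunk in self.chunks:
            for i in range(len(chunk)):
                chunk[i] = 0x7F
        self.chunks.clear()

inner = MemoryContext()          # plays the role of 'innerecontext'
hashtable = {}

# A tuple fetched from projRight lives in innerecontext and is
# stored (by reference) in the subplan's hash table.
hashtable["k"] = inner.alloc(b"tuple-1")

# execTuplesMatch() resets its evalContext -- the very same context.
inner.reset()
print(bytes(hashtable["k"]))     # clobbered to 0x7f bytes: tuple is gone

# The proposed fix: give the hash table a *separate* tempcxt for
# transient match-time data, so resetting it leaves stored tuples intact.
temp = MemoryContext()
hashtable["k2"] = inner.alloc(b"tuple-2")
temp.reset()
print(bytes(hashtable["k2"]))    # still b'tuple-2'
```

The model shows why the bug is invisible in release builds: without the clobber on reset, the stale bytes usually still look like the old tuple.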
Re: [HACKERS] patch (for 9.1) string functions
2010/7/26 Robert Haas robertmh...@gmail.com: On Mon, Jul 26, 2010 at 10:39 AM, Merlin Moncure mmonc...@gmail.com wrote: On Mon, Jul 26, 2010 at 9:26 AM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jul 26, 2010 at 9:10 AM, Merlin Moncure mmonc...@gmail.com wrote: Is CONCAT('foo', NULL) = 'foo' really the behavior that everyone else implements here? And why does CONCAT() take a variadic ANY argument? Shouldn't that be variadic TEXT? What does that accomplish, besides forcing you to sprinkle every concat call with text casts (maybe that's not a bad thing?)? You could ask the same thing about the existing || operator. And in fact, we used to have that behavior. We changed it in 8.3. Perhaps that was a good decision and perhaps it wasn't, but I don't think using CONCAT() to make an end-run around that decision is the way to go. It was absolutely a good decision because it prevented type inference in ways that were ambiguous or surprising (for a canonical case see: http://www.mail-archive.com/pgsql-gene...@postgresql.org/msg93224.html). The || operator is still pretty tolerant in the 8.3+ world. select interval_col || bool_col; -- error select interval_col::text || bool_col; -- text concatenation select text_col || interval_col || bool_col; -- text concatenation Variadic text would require text casts on EVERY non-'unknown' argument, which drops it below the threshold of usefulness IMO -- it would be far stricter than vanilla || concatenation. Ok, pure bikeshed here (shutting my trap now!), but concat() is one of those wonder functions that you want to make as usable and terse as possible. I don't see the value in making it overly strict. I'm just very skeptical that we should give our functions argument types that are essentially fantasy. CONCAT() doesn't concatenate integers or intervals or boxes: it concatenates strings, and only strings. 
Surely the right fix, if our casting rules are too restrictive, is to fix the casting rules; not to start adding a bunch of kludgery in every function we define. The rules are probably correct. The problem is in the function-searching algorithm - it is too simple (and fast, just one cycle). And some exceptions - like COALESCE and similar - are solved specifically at the parser level. We cannot enforce such casting at the user level. PostgreSQL is more strict with timestamps and dates than other databases. Sometimes you have to do explicit casts, even when the desired datatype is clear from context - SELECT date_trunc('day', current_date) - the result is timestamp, but it is clear that the result should be date ... When I proposed a parser hook I thought about these functions. With this hook, you can enforce any necessary casting like some built-in functions do. The problem with your canonical example (and the other examples of this type I've seen) is that they are underspecified. Given two identically-named operators, one of which accepts types T1 and T2, and the other of which accepts types T3 and T4, what is one to do with T1 OP T4? You can cast T1 to T3 and call the first operator or you can cast T4 to T2 and call the second one, and it's really not clear which is right, so you had better throw an error. The same applies to functions: given foo(text) and foo(date) and the call foo('2010-04-15'), you had better complain, or you may end up with a very sad user. But sometimes our current casting rules require casts in situations where they don't seem necessary, such as LPAD(integer_column, '0', 5). That's not ambiguous because there's only one definition of LPAD, and the chances that the user will be unpleasantly surprised if you call it seem quite low. In other words, this problem is not unique to CONCAT(). 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
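The semantics being debated (CONCAT('foo', NULL) = 'foo', with each argument coerced to its text form inside the function rather than cast at the call site) can be sketched in Python. The function names mirror SQL's CONCAT/CONCAT_WS, but this is only an illustration of the NULL-skipping behavior described in the thread, not PostgreSQL code:

```python
def concat(*args):
    """Variadic CONCAT: coerce each non-NULL argument to its text
    form and skip NULLs entirely, as discussed in the thread."""
    return "".join(str(a) for a in args if a is not None)

def concat_ws(sep, *args):
    """CONCAT_WS: like concat, but joins non-NULLs with a separator,
    so skipped NULLs don't leave doubled separators behind."""
    return sep.join(str(a) for a in args if a is not None)

print(concat("foo", None))         # 'foo' -- NULL is skipped, not propagated
print(concat_ws(",", 1, None, 2))  # '1,2'
```

This is what "variadic any with internal coercion" buys over "variadic text": the integer arguments above would each need an explicit cast under the strict signature.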
Re: [HACKERS] patch (for 9.1) string functions
so any datatype as a last possibility - it is a workaround for custom functions. Probably the most correct implementation of CONCAT would have to contain parser changes - and then you could use some internal concat function with text-only parameters. VARIADIC with any is just a workaround that is enough. Regards Pavel The problem with your canonical example (and the other examples of this type I've seen) is that they are underspecified. Given two identically-named operators, one of which accepts types T1 and T2, and the other of which accepts types T3 and T4, what is one to do with T1 OP T4? You can cast T1 to T3 and call the first operator or you can cast T4 to T2 and call the second one, and it's really not clear which is right, so you had better throw an error. The same applies to functions: given foo(text) and foo(date) and the call foo('2010-04-15'), you had better complain, or you may end up with a very sad user. But sometimes our current casting rules require casts in situations where they don't seem necessary, such as LPAD(integer_column, '0', 5). That's not ambiguous because there's only one definition of LPAD, and the chances that the user will be unpleasantly surprised if you call it seem quite low. In other words, this problem is not unique to CONCAT(). -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote: I'm just very skeptical that we should give our functions argument types that are essentially fantasy. CONCAT() doesn't concatenate integers or intervals or boxes: it concatenates strings, and only strings. Surely the right fix, if our casting rules are too restrictive, is to fix the casting rules; not to start adding a bunch of kludgery in every function we define. The problem with your canonical example (and the other examples of this type I've seen) is that they are underspecified. Given two identically-named operators, one of which accepts types T1 and T2, and the other of which accepts types T3 and T4, what is one to do with T1 OP T4? You can cast T1 to T3 and call the first operator or you can cast T4 to T2 and call the second one, and it's really not clear which is right, so you had better throw an error. The same applies to functions: given foo(text) and foo(date) and the call foo('2010-04-15'), you had better complain, or you may end up with a very sad user. But sometimes our current casting rules require casts in situations where they don't seem necessary, such as LPAD(integer_column, '0', 5). That's not ambiguous because there's only one definition of LPAD, and the chances that the user will be unpleasantly surprised if you call it seem quite low. In other words, this problem is not unique to CONCAT(). shoot, can't resist :-). Are the casting rules broken? If you want to do lpad w/o casts for integers, you can define it explicitly -- feature, not bug. You can basically do this for any function with fixed arguments -- either in userland or core. lpad(int) actually introduces a problem case with the minus sign possibly requiring application-specific intervention, so things are probably correct exactly as they are. Huh? 
If you're arguing that LPAD should require a cast on an integer argument because the defined behavior of the function might not be what someone wants, then apparently explicit casts should be required in all cases. If you're arguing that I can eliminate the need for an explicit cast by overloading LPAD(), I agree with you, but that's not a feature. ISTM you are unhappy with the any variadic mechanism in general, not the casting rules. No, I'm unhappy with the use of any to make an end-run around the casting rules. If you're writing a function that operates on an argument of any type, like pg_sizeof() - or if you're writing a function that does something like append two arrays of unspecified but identical type or, perhaps, search an array of unspecified type for an element of that same type - or if you're writing a function where the types of the arguments can't be known in advance, like printf(), well, then any is what you need. But the only argument anyone has put forward for making CONCAT() accept ANY instead of TEXT is that it might require casting otherwise. My response to that is well then you have to cast it, or fix the casting rules. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
2010/7/26 Robert Haas robertmh...@gmail.com: On Mon, Jul 26, 2010 at 11:39 AM, Merlin Moncure mmonc...@gmail.com wrote: I'm just very skeptical that we should give our functions argument types that are essentially fantasy. CONCAT() doesn't concatenate integers or intervals or boxes: it concatenates strings, and only strings. Surely the right fix, if our casting rules are too restrictive, is to fix the casting rules; not to start adding a bunch of kludgery in every function we define. The problem with your canonical example (and the other examples of this type I've seen) is that they are underspecified. Given two identically-named operators, one of which accepts types T1 and T2, and the other of which accepts types T3 and T4, what is one to do with T1 OP T4? You can cast T1 to T3 and call the first operator or you can cast T4 to T2 and call the second one, and it's really not clear which is right, so you had better throw an error. The same applies to functions: given foo(text) and foo(date) and the call foo('2010-04-15'), you had better complain, or you may end up with a very sad user. But sometimes our current casting rules require casts in situations where they don't seem necessary, such as LPAD(integer_column, '0', 5). That's not ambiguous because there's only one definition of LPAD, and the chances that the user will be unpleasantly surprised if you call it seem quite low. In other words, this problem is not unique to CONCAT(). shoot, can't resist :-). Are the casting rules broken? If you want to do lpad w/o casts for integers, you can define it explicitly -- feature, not bug. You can basically do this for any function with fixed arguments -- either in userland or core. lpad(int) actually introduces a problem case with the minus sign possibly requiring application-specific intervention, so things are probably correct exactly as they are. Huh? 
If you're arguing that LPAD should require a cast on an integer argument because the defined behavior of the function might not be what someone wants, then apparently explicit casts should be required in all cases. If you're arguing that I can eliminate the need for an explicit cast by overloading LPAD(), I agree with you, but that's not a feature. ISTM you are unhappy with the any variadic mechanism in general, not the casting rules. No, I'm unhappy with the use of any to make an end-run around the casting rules. If you're writing a function that operates on an argument of any type, like pg_sizeof() - or if you're writing a function that does something like append two arrays of unspecified but identical type or, perhaps, search an array of unspecified type for an element of that same type - or if you're writing a function where the types of the arguments can't be known in advance, like printf(), well, then any is what you need. But the only argument anyone has put forward for making CONCAT() accept ANY instead of TEXT is that it might require casting otherwise. My response to that is well then you have to cast it, or fix the casting rules. I understand, but with only text accepted, CONCAT will have much less benefit - you can't do a numeric list, for example. Compare concat(c1::text, ',', c2::text, ',' ...); with pipes it is much simpler to use '' || c1 || ',' || c2 ..., and this operator does the necessary casts itself. This function is probably one use case for an exception from our rules. Regards Pavel Stehule -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
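Pavel's point can be seen in a short sketch (assuming the CONCAT() from the patch under discussion, and hypothetical numeric columns c1 and c2 in a table t):

```sql
-- With a text-only CONCAT(), every non-text argument needs an explicit cast:
SELECT concat(c1::text, ',', c2::text) FROM t;

-- The || operator already coerces the non-text side to text itself
-- (via the anynonarray || text operator variants):
SELECT '' || c1 || ',' || c2 FROM t;

-- With CONCAT() declared as variadic "any", the casts disappear:
SELECT concat(c1, ',', c2) FROM t;
```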
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 3:07 PM, Robert Haas robertmh...@gmail.com wrote: On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand, but with only text accepted, CONCAT will have much less benefit - you can't do a numeric list, for example. Compare concat(c1::text, ',', c2::text, ',' ...); with pipes it is much simpler to use '' || c1 || ',' || c2 ..., and this operator does the necessary casts itself. This function is probably one use case for an exception from our rules. At least two, right? correct: there would be at least two. Because for that use case you'd probably want concat_ws(). In fact, it's hard for me to think of a variadic text function where you wouldn't want the no casts behavior you're getting via ANY. concat() is not a variadic text function. it is variadic any that happens to do text coercion (not casting) inside the function. The assumption that concat is casting internally is probably wrong. Suppose I had hacked the int->text cast to call a custom function -- I would very much expect concat() not to produce output from that function, just the vanilla output text (I could always force the cast if I wanted to). concat is just a function that does something highly similar to casting. Suppose I had a function count_memory(variadic any) that summed memory usage of input args -- forcing casts would make no sense in that context (I'm not suggesting that you think so -- just bringing up a case that illustrates how forcing casts into the function can change behavior in subtle ways). merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
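The distinction Merlin is drawing can be sketched with a hypothetical user-level function (join_csv is invented for illustration; it is not part of the patch):

```sql
-- A variadic *text* function resolves its arguments through the normal
-- casting rules at the call site:
CREATE FUNCTION join_csv(VARIADIC parts text[]) RETURNS text
    LANGUAGE sql AS $$ SELECT array_to_string($1, ',') $$;

SELECT join_csv('a', 'b');           -- OK: unknown literals resolve to text
SELECT join_csv(1, 2);               -- fails: there is no implicit int -> text cast
SELECT join_csv(1::text, 2::text);   -- OK, but the casts are now explicit

-- A variadic "any" function (writable only in C) instead receives the
-- arguments with their original types and can coerce them internally,
-- bypassing the cast machinery entirely -- which is exactly the point
-- of contention for CONCAT().
```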
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand, but with only text accepted, CONCAT will have much less benefit - you can't do a numeric list, for example. Compare concat(c1::text, ',', c2::text, ',' ...); with pipes it is much simpler to use '' || c1 || ',' || c2 ..., and this operator does the necessary casts itself. This function is probably one use case for an exception from our rules. At least two, right? Because for that use case you'd probably want concat_ws(). In fact, it's hard for me to think of a variadic text function where you wouldn't want the no casts behavior you're getting via ANY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
On Mon, Jul 26, 2010 at 3:49 PM, Merlin Moncure mmonc...@gmail.com wrote: concat() is not a variadic text function. it is variadic any that happens to do text coercion (not casting) inside the function. The assumption that concat is casting internally is probably wrong. Suppose I had hacked the int->text cast to call a custom function -- I would very much expect concat() not to produce output from that function, just the vanilla output text (I could always force the cast if I wanted to). concat is just a function that does something highly similar to casting. Suppose I had a function count_memory(variadic any) that summed memory usage of input args -- forcing casts would make no sense in that context (I'm not suggesting that you think so -- just bringing up a case that illustrates how forcing casts into the function can change behavior in subtle ways). Right, but I already said I wasn't objecting to the use of variadic ANY in cases like that - only in cases where, as here, you were basically taking any old arguments and forcing them all to text. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch (for 9.1) string functions
2010/7/26 Robert Haas robertmh...@gmail.com: On Mon, Jul 26, 2010 at 2:09 PM, Pavel Stehule pavel.steh...@gmail.com wrote: I understand, but with only text accepted, CONCAT will have much less benefit - you can't do a numeric list, for example. Compare concat(c1::text, ',', c2::text, ',' ...); with pipes it is much simpler to use '' || c1 || ',' || c2 ..., and this operator does the necessary casts itself. This function is probably one use case for an exception from our rules. At least two, right? Because for that use case you'd probably want concat_ws(). sorry, yes Pavel In fact, it's hard for me to think of a variadic text function where you wouldn't want the no casts behavior you're getting via ANY. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 3:53 PM, Bruce Momjian br...@momjian.us wrote: Well, that's pretty much saying we won't release before September. Which is kind of a bummer, but I guess that's what happens when we get into vacation season. Yeah, if we are lucky we can do RC1 in mid-August and still release final in August, but that assumes everything is done by then, and that we only need 1-2 RCs. Early September looks more likely. The good thing is that we are deciding now and are continually seeing if we can tighten the schedule. (If we stop trying to tighten the schedule, we get a loose schedule, which no one likes.) I am assuming that if we release beta4 with known bugs, beta5 in mid-August is inevitable. And we're going to need at least a couple of weeks after beta5 before RC1, even assuming no new issues come up. ISTM that if everything goes well we can expect to release in *mid*-September. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
Robert Haas robertmh...@gmail.com writes: Well, that's pretty much saying we won't release before September. Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch. Which is kind of a bummer, but I guess that's what happens when we get into vacation season. Yes. If we were at full strength maybe August would be make-able, but there are too many people on vacation right now, and way too many distractions to boot. In any case, now that 9.0 is branched there is not any project-scheduling reason why the final release needs to happen any particular time. I think we need to fall back on our traditional mantra we'll release it when it's ready rather than fret about whether it's August or September or whatever. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Copy path in Dynamic programming
Robert Haas robertmh...@gmail.com writes: On Thu, Jul 22, 2010 at 12:38 PM, vamsi krishna vamsikrishna1...@gmail.com wrote: if lev=5, and let's say there are two combinations setA = {1,2,3,4,5} and setB = {6,7,8,9,10}. I want to reuse the plan of {1,2,3,4,5} for {6,7,8,9,10}. I don't think that makes any sense. Yeah. The theoretical problem is that substituting setB for setA could change the estimated rowcounts, and thus you should not use the same path. The practical problem is that the Path data structure doesn't store the specific quals to be used, in general --- it relies on the RelOptInfo structures to remember which quals need to be checked during a scan. So you can't just copy the path and substitute some other qual. You could maybe do it if you copied the entire planner workspace, but that's not too practical from a memory consumption standpoint. Not to mention that a lot of the node types in question don't have copyfuncs.c support, which is only partly laziness --- it's also the case that copyObject() is incapable of coping with circular linkages, and the planner data structures are full of those. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, that's pretty much saying we won't release before September. Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch. I call bullshit. The six items in the code section of the open items list were reported 14, 5, 5, 1, 27, and 0 days ago. The 27-day old item is purely cosmetic and there's absolutely zero evidence that Simon hasn't done it yet because he's been busy working on 9.1 development. It's much more likely that he hasn't gotten around to taking care of that (and his outstanding 9.1 patch) because he's been busy with everything else in his life other than pgsql-hackers. The remaining items have an average age of precisely 5 days, which hardly sounds like we've been sitting on our hands, especially when you consider that both you and Heikki have been on vacation for longer than that. And it's not as if I haven't been following those issues, either. Had you and Heikki and Peter fallen down a well more or less permanently, I would have patched about half of those bugs by now. The fact that I haven't done so is not because I'm busy working on 9.1 development, but because I respect your expertise and wish to have the benefit of it so as to reduce the chances that I will break things, or, for that matter, fix them in a way that's adequate but not the one you happen to prefer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, 2010-07-27 at 15:50 -0400, Robert Haas wrote: 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and PageSetTLI() if the page is not new. This seems slightly awkward because most WAL replay stuff doesn't have to worry about zero pages, but in this case I think it does. 2. Have copy_relation_data() initialize new pages. I don't like this because (a) it's not really the job of SET TABLESPACE to clean up zero pages; and (b) it could be an index with different special size, etc., and it doesn't seem like a good place to figure that out. It appears to me that all of the callers of log_newpage() other than copy_relation_data() do so with pages that they've just constructed, and which therefore can't be new. So maybe we could just modify copy_relation_data to check PageIsNew(buf), or something like that, and skip log_newpage() if that returns true. My first concern with that idea was that it may create an inconsistency between the primary and the standby. The primary could have a bunch of zero pages that never make it to the standby. However, it looks like all WAL recovery stuff passes true for init when calling XLogReadBuffer(), so I think it's safe. I'll test it and submit a patch. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, Jul 27, 2010 at 5:08 PM, Jeff Davis pg...@j-davis.com wrote: On Tue, 2010-07-27 at 15:50 -0400, Robert Haas wrote: 1. Have log_newpage() and heap_xlog_newpage() only call PageSetLSN() and PageSetTLI() if the page is not new. This seems slightly awkward because most WAL replay stuff doesn't have to worry about zero pages, but in this case I think it does. 2. Have copy_relation_data() initialize new pages. I don't like this because (a) it's not really the job of SET TABLESPACE to clean up zero pages; and (b) it could be an index with different special size, etc., and it doesn't seem like a good place to figure that out. It appears to me that all of the callers of log_newpage() other than copy_relation_data() do so with pages that they've just constructed, and which therefore can't be new. So maybe we could just modify copy_relation_data to check PageIsNew(buf), or something like that, and skip log_newpage() if that returns true. My first concern with that idea was that it may create an inconsistency between the primary and the standby. The primary could have a bunch of zero pages that never make it to the standby. Maybe I'm slow on the uptake here, but don't the pages start out all-zeroes on the standby just as they do on the primary? The only way it seems like this would be a problem is if a page that previously contained data on the primary was subsequently zeroed without writing a WAL record - or am I confused? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] Toward a column reorder solution
Quoting http://wiki.postgresql.org/wiki/Alter_column_position : The idea of allowing re-ordering of column position is not one the postgresql developers are against, it is more a case where no one has stepped forward to do the work. Well, a hard journey starts with a single step. Why not, in the next release that requires running initdb, add an *attraw* column (a better name is welcome) to the catalog that stores the physical position of the column forever, i.e., with the same semantics as *attnum*? Then, in a future release - 9.1 for example - the Postgres team can make *attnum* changeable using something like ALTER COLUMN POSITION? Pros: - Requires only a couple of changes in the main PostgreSQL code. It seems to be very simple. - Allows a smooth and decentralized rewrite of all the code that may need the *attraw* attribute - PostgreSQL, contribs, pgAdmin, drivers, tools etc. This will give the developers of that code time to detect the impact of the semantics change, make the necessary arrangements, and release production-level software using the new feature before *attnum* becomes changeable. So, when *attnum* becomes read/write, all that software will be ready. Cons - Four more bytes in each row of the catalog. Nilson
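For readers following along, the attribute numbers under discussion live in the pg_attribute catalog, where attnum today serves as both the physical position in the stored tuple and the logical position seen by SELECT * ('mytable' below is a placeholder):

```sql
-- attnum is currently both physical and logical column position;
-- the proposal is to split those two roles apart.
SELECT attname, attnum
FROM pg_attribute
WHERE attrelid = 'mytable'::regclass
  AND attnum > 0          -- negative attnums are system columns (ctid, xmin, ...)
ORDER BY attnum;
```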
Re: [HACKERS] Toward a column reorder solution
Nilson wrote: Quoting http://wiki.postgresql.org/wiki/Alter_column_position : The idea of allowing re-ordering of column position is not one the postgresql developers are against, it is more a case where no one has stepped forward to do the work. Well, a hard journey starts with a single step. Why not, in the next release that requires running initdb, add an *attraw* column (a better name is welcome) to the catalog that stores the physical position of the column forever, i.e., with the same semantics as *attnum*? Then, in a future release - 9.1 for example - the Postgres team can make *attnum* changeable using something like ALTER COLUMN POSITION? Pros: - Requires only a couple of changes in the main PostgreSQL code. It seems to be very simple. - Allows a smooth and decentralized rewrite of all the code that may need the *attraw* attribute - PostgreSQL, contribs, pgAdmin, drivers, tools etc. This will give the developers of that code time to detect the impact of the semantics change, make the necessary arrangements, and release production-level software using the new feature before *attnum* becomes changeable. So, when *attnum* becomes read/write, all that software will be ready. Cons - Four more bytes in each row of the catalog. Nilson Please review the previous discussions on this. In particular, see this proposal from Tom Lane that I believe represents the consensus way we want to go on this: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: Well, that's pretty much saying we won't release before September. Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch. [poorly worded protest] Sorry - I apologize for that email. As has been pointed out to me off-list, that was too strongly worded and not constructive. Still, I don't think there is much evidence for the proposition that the current delays are caused by having branched early. I think they're caused by people being out of town. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Toward a column reorder solution
On Tue, Jul 27, 2010 at 5:45 PM, Andrew Dunstan and...@dunslane.net wrote: Nilson wrote: Quoting http://wiki.postgresql.org/wiki/Alter_column_position : The idea of allowing re-ordering of column position is not one the postgresql developers are against, it is more a case where no one has stepped forward to do the work. Well, a hard journey starts with a single step. Why not, in the next release that requires running initdb, add an *attraw* column (a better name is welcome) to the catalog that stores the physical position of the column forever, i.e., with the same semantics as *attnum*? Then, in a future release - 9.1 for example - the Postgres team can make *attnum* changeable using something like ALTER COLUMN POSITION? Pros: - Requires only a couple of changes in the main PostgreSQL code. It seems to be very simple. - Allows a smooth and decentralized rewrite of all the code that may need the *attraw* attribute - PostgreSQL, contribs, pgAdmin, drivers, tools etc. This will give the developers of that code time to detect the impact of the semantics change, make the necessary arrangements, and release production-level software using the new feature before *attnum* becomes changeable. So, when *attnum* becomes read/write, all that software will be ready. Cons - Four more bytes in each row of the catalog. Nilson Please review the previous discussions on this. In particular, see this proposal from Tom Lane that I believe represents the consensus way we want to go on this: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php Alvaro is planning to work on this for 9.1, I believe. http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Toward a column reorder solution
Robert Haas wrote: Please review the previous discussions on this. In particular, see this proposal from Tom Lane that I believe represents the consensus way we want to go on this: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php Alvaro is planning to work on this for 9.1, I believe. http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php Yay! cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Toward a column reorder solution
On Tue, 2010-07-27 at 17:56 -0400, Andrew Dunstan wrote: Robert Haas wrote: Please review the previous discussions on this. In particular, see this proposal from Tom Lane that I believe represents the consensus way we want to go on this: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php Alvaro is planning to work on this for 9.1, I believe. http://archives.postgresql.org/pgsql-hackers/2010-07/msg00188.php Yay! Correct. We are also hoping to get some sponsorship for it. https://www.fossexperts.com/ Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering http://twitter.com/cmdpromptinc | http://identi.ca/commandprompt -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Toward a column reorder solution
Andrew, Tom's message was from Dec/2006. We are in 2010. Sincerely, I'm not afraid to seem naive, but I believe that a column that inherits the persistent semantics of attnum solves 99.9% of the column-reordering problem for legacy software. The exceptions seem to be: 1) software that addresses buffers based on attnum. Typically core/hacker software. 2) INSERTs without naming the columns. This could be solved, when attnum becomes changeable, with a server or database variable *allow_attnum_changes* with a default value of false. The problem Tom raised about the need for a primary key for attributes is almost the same as with the current solutions for reordering columns: a) recreate the table b) shift the columns. Nilson On Tue, Jul 27, 2010 at 6:45 PM, Andrew Dunstan and...@dunslane.net wrote: Nilson wrote: Quoting http://wiki.postgresql.org/wiki/Alter_column_position : The idea of allowing re-ordering of column position is not one the postgresql developers are against, it is more a case where no one has stepped forward to do the work. Well, a hard journey starts with a single step. Why not, in the next release that requires running initdb, add an *attraw* column (a better name is welcome) to the catalog that stores the physical position of the column forever, i.e., with the same semantics as *attnum*? Then, in a future release - 9.1 for example - the Postgres team can make *attnum* changeable using something like ALTER COLUMN POSITION? Pros: - Requires only a couple of changes in the main PostgreSQL code. It seems to be very simple. - Allows a smooth and decentralized rewrite of all the code that may need the *attraw* attribute - PostgreSQL, contribs, pgAdmin, drivers, tools etc. This will give the developers of that code time to detect the impact of the semantics change, make the necessary arrangements, and release production-level software using the new feature before *attnum* becomes changeable. 
So, when *attnum* becomes read/write, all that software will be ready. Cons - Four more bytes in each row of the catalog. Nilson Please review the previous discussions on this. In particular, see this proposal from Tom Lane that I believe represents the consensus way we want to go on this: http://archives.postgresql.org/pgsql-hackers/2006-12/msg00983.php cheers andrew
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Josh Berkus j...@agliodbs.com writes: A 9.0b3 tester reported this issue with our single most popular PostgreSQL extension, PostGIS: == Postgis makes use of 'PGXS' in postgresql 8.2. Within postgresql-9, datadir and many other variables are defined as multiple values with an append operator, like this: $ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global [snip] datadir := /usr/pgsql-9.0/share This analysis is nonsense on its face --- := is not an append operator and we do not have any multiple values for datadir. The referenced postgis-users thread seems to indicate that the postgis guys found and fixed a problem in their own makefiles. If not, we need a clearer description of what they think the problem is. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
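Tom's point about := rests on GNU make semantics; a minimal illustration (the second variable is hypothetical, shown only to contrast the operators):

```make
# := is simple (immediate) assignment -- each use overwrites the value,
# exactly as in the quoted Makefile.global line:
datadir := /usr/pgsql-9.0/share

# Appending would require the += operator instead, e.g.:
extra_paths :=
extra_paths += /some/other/path
```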
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote: My first concern with that idea was that it may create an inconsistency between the primary and the standby. The primary could have a bunch of zero pages that never make it to the standby. Maybe I'm slow on the uptake here, but don't the pages start out all-zeroes on the standby just as they do on the primary? The only way it seems like this would be a problem is if a page that previously contained data on the primary was subsequently zeroed without writing a WAL record - or am I confused? The case I was concerned about is when you have a table on the primary with a bunch of zero pages at the end. Then you SET TABLESPACE, and none of the copied pages (or even the fact that they exist) would be sent to the standby, but they would exist on the primary. And later pages may have data, so the standby may see page N but not N-1. Generally, most of the code is not expecting to read or write past the end of the file, unless it's doing an extension. However, I think everything is fine during recovery, because it looks like it's designed to create zero pages as needed. So your idea seems safe to me, although I do still have some doubts because of my lack of knowledge in this area; particularly hot standby conflict detection/resolution. My idea was different: still log the zero page, just don't set LSN or TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't as clean as your idea, but I'm a little more confident that it is correct. Regards, Jeff Davis -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
Robert Haas robertmh...@gmail.com writes: On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch. [poorly worded protest] Sorry - I apologize for that email. As has been pointed out to me off-list, that was too strongly worded and not constructive. Still, I don't think there is much evidence for the proposition that the current delays are caused by having branched early. I think they're caused by people being out of town. Well, they're surely both contributing factors. There's no way to run a controlled experiment to determine how much each one is hurting us, so opinions about which is worse can never be more than opinions. I'm sticking with mine though, and for weak evidence will point to the amount of -hackers traffic about 9.1 CF items versus the amount of traffic about how to fix the known bugs. Anyway, I'm back from vacation and will start looking at those bugs as soon as I've caught up on email. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] do we need to postpone beta4?
On Tue, Jul 27, 2010 at 6:42 PM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Tue, Jul 27, 2010 at 4:48 PM, Robert Haas robertmh...@gmail.com wrote: On Tue, Jul 27, 2010 at 4:09 PM, Tom Lane t...@sss.pgh.pa.us wrote: Yup, that's what I think. In fact I think September might be optimistic. This is what happens when you fork early and allow developers to start focusing on new development instead of testing the release branch. [poorly worded protest] Sorry - I apologize for that email. As has been pointed out to me off-list, that was too strongly worded and not constructive. Still, I don't think there is much evidence for the proposition that the current delays are caused by having branched early. I think they're caused by people being out of town. Well, they're surely both contributing factors. There's no way to run a controlled experiment to determine how much each one is hurting us, so opinions about which is worse can never be more than opinions. I'm sticking with mine though, and for weak evidence will point to the amount of -hackers traffic about 9.1 CF items versus the amount of traffic about how to fix the known bugs. I guess I'd counter by pointing out that there are half a dozen bugs and almost 70 patches in the CommitFest. And, again, it's not as if bugs are sitting there being ignored for months on end. To the contrary, we've been largely ignoring new patches for the past five months, but we rarely ignore bugs. When 2 or 3 days go by without a response to a serious bug report, people start posting messages like Hello? Hello? What's going on? (there are several examples of this in just the last week, from at least two different contributors). Anyway, I'm back from vacation and will start looking at those bugs as soon as I've caught up on email. Thanks. Let me know if I'm not picking up something you think I should be looking at. 
I've been attempting to stay on top of both bug reports and the CommitFest in your absence, which has been keeping me extremely busy, which may account for some portion of the testiness of my previous response. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Toward a column reorder solution
Nilson Damasceno wrote: Tom's message was in Dec/2006. We are in 2010. So what? The problem hasn't changed. Sincerely, I'm not afraid to seem naive, but I believe that a column that inherits the persistent semantics of attnum solves 99.9% of the problem with column reordering of legacy software. You're assuming that the only thing we want to be able to do related to column position is to reorder columns logically. That assumption is not correct. (Incidentally, please don't top-answer). cheers andrew
Re: [HACKERS] Parsing of aggregate ORDER BY clauses
Daniel Grace dgr...@wingsnw.com writes: One possible concern might be typecasts that aren't a 1:1 representation. While no two VARCHARs are going to produce the same TEXT, this is not true in other cases (1.1::float::integer and 1.2::float::integer both produce 1, for instance). Off the top of my head, I can't think of a good example where this would cause a problem -- it'd be easy enough to manufacture a possible test case, but it'd be so contrived and I don't know if it's something that would be seen in production code. But if we SELECT SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even if it means the function is called with '1' twice) or floatcol::integer (1.1 and 1.2 are not distinct)? Yes. The current implementation has the advantage that any unique-ifying step is guaranteed to produce outputs that are distinct from the point of view of the aggregate function, whereas if we try to keep the two operations at arms-length, then either we lose that property or we sort-and-unique twice :-(. If memory serves, this type of consideration is also why DISTINCT and GROUP BY are made to follow ORDER BY's choice of semantics in an ordinary SELECT query --- you might find that surprising, but if they weren't on the same page it could be even more surprising. So on reflection I think that the current fix is the best one and we don't want to reconsider it later. regards, tom lane
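To make the semantics Daniel is asking about concrete, here is a small sketch (the table and data are invented for illustration, and sum() stands in for his hypothetical SOME_INTEGER_AGGREGATE; the point follows Tom's description that DISTINCT and ORDER BY operate on the same value):

```sql
-- Hypothetical setup: a float column whose values collapse under ::integer.
CREATE TEMP TABLE t (floatcol float8);
INSERT INTO t VALUES (1.1), (1.2), (1.1);

-- DISTINCT and ORDER BY both operate on floatcol itself, so 1.1 and 1.2
-- are distinct; the aggregate sees each distinct float once.
SELECT sum(DISTINCT floatcol ORDER BY floatcol) FROM t;

-- With the cast written into the argument, DISTINCT operates on the
-- integer values instead, and 1.1 and 1.2 are no longer distinct.
SELECT sum(DISTINCT floatcol::integer ORDER BY floatcol::integer) FROM t;
```

The first query sums {1.1, 1.2}; the second sums {1}, which is exactly the distinction between unique-ifying before versus after the cast.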
Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
== Submission review == * Is the patch in context diff format? Yes. * Does it apply cleanly to the current CVS HEAD? Yes. patch -p1 < ../xpath_exists-3.patch patching file doc/src/sgml/func.sgml Hunk #1 succeeded at 8642 (offset 16 lines). patching file src/backend/utils/adt/xml.c patching file src/include/catalog/pg_proc.h Hunk #1 succeeded at 4391 (offset 6 lines). patching file src/include/utils/xml.h patching file src/test/regress/expected/xml.out patching file src/test/regress/sql/xml.sql * Does it include reasonable tests, necessary doc patches, etc? Tests: As this is new functionality, it doesn't really need to test much for interactions with other parts of the system. I'm not really an XML expert, so I'd like to punt as to whether it tests enough functionality. Minor quibble with the regression tests: should we be using dollar quotes in things like this? Doubled-up quote marks: SELECT xpath_exists('//town[text() = ''Cwmbran'']','<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Dollar quote: SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,'<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Doc patches: Good up to cross-Atlantic differences in spelling (speciali[sz]ed), e.g. == Usability review == Read what the patch is supposed to do, and consider: * Does the patch actually implement that? Yes. * Do we want that? Yes. * Do we already have it? Not really. There are kludges to accomplish these things, but they're available mostly in the sense that a general-purpose language allows you to write code to do anything a Turing machine can do. * Does it follow SQL spec, or the community-agreed behavior? Yes. * Does it include pg_dump support (if applicable)? Not applicable. * Are there dangers? Not that I can see. * Have all the bases been covered? To the extent of my XML knowledge, yes. == Feature test == Apply the patch, compile it and test: * Does the feature work as advertised? Yes. 
* Are there corner cases the author has failed to consider? Not that I've found. See above re: XML and my vast ignorance of same. * Are there any assertion failures or crashes? No. == Performance review == * Does the patch slow down simple tests? No. * If it claims to improve performance, does it? No such claim made. The kludges needed to reproduce the functionality would certainly consume an enormous number of developer hours, though. * Does it slow down other things? Not that I've found. There might be some minuscule slowing down of the code due to the existence of more code paths, but we're a long, long way from having that be something other than noise. == Coding review == Read the changes to the code in detail and consider: * Does it follow the project [http://developer.postgresql.org/pgdocs/postgres/source.html coding guidelines]? Yes. * Are there portability issues? Not that I can see. * Will it work on Windows/BSD etc? Should do. * Are the comments sufficient and accurate? Yes. * Does it do what it says, correctly? Yes, subject to, etc. * Does it produce compiler warnings? No. * Can you make it crash? No. == Architecture review == Consider the changes to the code in the context of the project as a whole: * Is everything done in a way that fits together coherently with other features/modules? Yes. * Are there interdependencies that can cause problems? Not that I've found. Cheers, David. On Tue, Jun 29, 2010 at 11:37:28AM +0100, Mike Fowler wrote: Mike Fowler wrote: Bruce Momjian wrote: I have added this to the next commit-fest: https://commitfest.postgresql.org/action/commitfest_view?id=6 Thanks Bruce. Attached is a revised patch which changes the code slightly such that it uses an older version of the libxml library. I've added comments to the code so that we remember why we didn't use the latest function. After seeing some other posts in the last couple of days, I realised I hadn't documented the function in the SGML. 
I have now done so, and added a couple of tests with XML literals. Please find the patch attached. Now time to go correct the xmlexists patch too... -- Mike Fowler Registered Linux user: 379787

*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***************
*** 8626,8631 ****
  SELECT xpath('/my:a/text()', '<my:a xmlns:my="http://example.com">test</my:a>',
--- 8626,8664 ----
  (1 row)
  ]]></screen>
      </para>
+
+    <sect3>
+     <title>xpath_exists</title>
+
+     <indexterm>
+      <primary>xpath_exists</primary>
+     </indexterm>
+
+     <synopsis>
+      <function>xpath_exists</function>(<replaceable>xpath</replaceable>, <replaceable>xml</replaceable><optional>,
Re: [HACKERS] Parsing of aggregate ORDER BY clauses
On Tue, Jul 27, 2010 at 7:16 PM, Tom Lane t...@sss.pgh.pa.us wrote: Daniel Grace dgr...@wingsnw.com writes: But if we SELECT SOME_INTEGER_AGGREGATE(DISTINCT floatcol ORDER BY floatcol), should the DISTINCT operate on floatcol (i.e. 1.1 and 1.2 are distinct, even if it means the function is called with '1' twice) or floatcol::integer (1.1 and 1.2 are not distinct)? Yes. The current implementation has the advantage that any unique-ifying step is guaranteed to produce outputs that are distinct from the point of view of the aggregate function, whereas if we try to keep the two operations at arms-length, then either we lose that property or we sort-and-unique twice :-(. Am I misreading this, or did you just answer an either-or question with yes? -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
On Tue, Jul 27, 2010 at 7:33 PM, David Fetter da...@fetter.org wrote: Minor quibble with the regression tests: should we be using dollar quotes in things like this? Doubled-up quote marks: SELECT xpath_exists('//town[text() = ''Cwmbran'']','<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Dollar quote: SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,'<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Personally, I don't really see that as an improvement. Dollar-quotes are really nice for longer strings, or where you would otherwise have quadrupled quotes (or more), but I don't see a big advantage to it here. Still, it's a question of opinion more than anything else. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] SSL cipher and version
On Tue, Jul 27, 2010 at 12:06 AM, Tom Lane t...@sss.pgh.pa.us wrote: Robert Haas robertmh...@gmail.com writes: On Mon, Jul 26, 2010 at 9:57 AM, Dave Page dp...@pgadmin.org wrote: On Mon, Jul 26, 2010 at 2:49 PM, Robert Haas robertmh...@gmail.com wrote: Any objections to me committing this? Might wanna fix this first: +PG_FUNCTION_INFO_V1(ssl_veresion); Wow. It works remarkably well without fixing that, but I'll admit that does seem lucky. Well, it's got no arguments, which is the main thing that works differently in call protocol V1. I think you'd find that the PG_RETURN_NULL case doesn't really work though ... It seems to work, but it might be that something's broken under the hood. Anyhow, committed with that correction. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] Parsing of aggregate ORDER BY clauses
Robert Haas robertmh...@gmail.com writes: Am I misreading this, or did you just answer an either-or question with yes? I meant Yes, that's an issue. regards, tom lane
Re: [RRR] Review: Re: [PATCH] Re: [HACKERS] Adding xpath_exists function
On Tue, 2010-07-27 at 19:41 -0400, Robert Haas wrote: On Tue, Jul 27, 2010 at 7:33 PM, David Fetter da...@fetter.org wrote: Minor quibble with the regression tests: should we be using dollar quotes in things like this? Doubled-up quote marks: SELECT xpath_exists('//town[text() = ''Cwmbran'']','<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Dollar quote: SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,'<towns><town>Bidford-on-Avon</town><town>Cwmbran</town><town>Bristol</town></towns>'::xml); Personally, I don't really see that as an improvement. Dollar-quotes are really nice for longer strings, or where you would otherwise have quadrupled quotes (or more), but I don't see a big advantage to it here. Still, it's a question of opinion more than anything else. I like the idea of using dollar quotes, but I think they should be used for both arguments (or neither). Using $$ for one and then shifting to ' for the second argument with no whitespace at all seems like the least readable option. Regards, Jeff Davis
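For what it's worth, the quoting styles under discussion all denote the same strings; a minimal side-by-side (the XML argument is shortened here, and xpath_exists is the new function this patch adds):

```sql
-- Doubled-up quotes inside a standard string literal:
SELECT '//town[text() = ''Cwmbran'']' AS doubled;

-- Dollar quoting, which avoids the doubling entirely:
SELECT $$//town[text() = 'Cwmbran']$$ AS dollar;

-- Jeff's point: if dollar quotes are used at all, using them for both
-- arguments reads better than mixing $$...$$ and '...' side by side:
SELECT xpath_exists($$//town[text() = 'Cwmbran']$$,
                    $$<towns><town>Cwmbran</town></towns>$$::xml);
```

Both SELECTs of the literal return the identical string `//town[text() = 'Cwmbran']`, so the choice really is only about readability.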
Re: [HACKERS] Toward a column reorder solution
On Jul 27, 2010, at 3:01 PM, Joshua D. Drake wrote: Correct. We are also hoping to get some sponsorship for it. https://www.fossexperts.com/ Frigging copycat. Any sponsorship for PGXN in there? ;-P Best, David
Re: [HACKERS] Improper usage of MemoryContext in nodeSubplan.c for buildSubPlanHash() function. This maybe causes allocate memory failed.
Tao Ma feng_e...@163.com writes: This is a potential memory error in nodeSubplan.c or execGrouping.c Using select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME); to reproduce this bug. ... To fix this problem, we can use another memory context to pass into BuildTupleHashTable() as the hashtable's tempcxt, or use some other memory context as the hash table's tempcxt, or other ways. Yeah, I think you're right --- we can't get away with reusing the innerecontext's per-tuple context for the hashtable temp contexts. The best solution is probably to make an additional context that does nothing but act as the hashtable temp context. This bug's been there a long time :-(. Surprising that no one found it before. It would be mostly masked in a non-assert build, but not entirely, I think. regards, tom lane
Re: [HACKERS] Performance Enhancement/Fix for Array Utility Functions
On Fri, Jul 16, 2010 at 4:43 PM, Tom Lane t...@sss.pgh.pa.us wrote: Daniel Farina drfar...@acm.org writes: Generally I think the delimited untoasting of metadata from arrays separately from the payload is Not A Bad Idea. I looked at this patch a bit. I agree that it could be a big win for large external arrays, but ... 1. As-is, it's a significant *pessimization* for small arrays, because the heap_tuple_untoast_attr_slice code does a palloc/copy even when one is not needed because the data is already not toasted. I think there needs to be a code path that avoids that. This seems like it shouldn't be too hard to fix, and I think it should be fixed. 2. Arrays that are large enough to be pushed out to toast storage are almost certainly going to get compressed first. The potential win in this case is very limited because heap_tuple_untoast_attr_slice will fetch and decompress the whole thing. Admittedly this is a limitation of the existing code and not a fault of the patch proper, but still, if you want to make something that's generically useful, you need to do something about that. Perhaps pglz_decompress() could be extended with an argument to say decompress no more than this much --- although that would mean adding another test to its inner loop, so we'd need to check for performance degradation. I'm also unsure how to predict how much compressed data needs to be read in to get at least N bytes of decompressed data. But this part seems to me to be something that can, and probably should, be left for a separate patch. I don't see that we're losing anything this way; we're just not winning as much as we otherwise might. To really fix this, I suspect we'd need a version of pglz_decompress that can operate in streaming mode; you tell it how many bytes you want, and it returns a code indicating that either (a) it decompressed that many bytes or (b) it decompressed less than that many bytes and you can call it again with another chunk of data if you want to keep going. 
That'd probably be a good thing to have, but I don't think it needs to be a prerequisite for this patch. I'm going to mark this patch Waiting on Author. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company
Re: [HACKERS] page corruption on 8.3+ that makes it to standby
On Tue, 2010-07-27 at 15:23 -0700, Jeff Davis wrote: On Tue, 2010-07-27 at 17:18 -0400, Robert Haas wrote: My first concern with that idea was that it may create an inconsistency between the primary and the standby. The primary could have a bunch of zero pages that never make it to the standby. Maybe I'm slow on the uptake here, but don't the pages start out all-zeroes on the standby just as they do on the primary? The only way it seems like this would be a problem is if a page that previously contained data on the primary was subsequently zeroed without writing a WAL record - or am I confused? The case I was concerned about is when you have a table on the primary with a bunch of zero pages at the end. Then you SET TABLESPACE, and none of the copied pages (or even the fact that they exist) would be sent to the standby, but they would exist on the primary. And later pages may have data, so the standby may see page N but not N-1. Generally, most of the code is not expecting to read or write past the end of the file, unless it's doing an extension. However, I think everything is fine during recovery, because it looks like it's designed to create zero pages as needed. So your idea seems safe to me, although I do still have some doubts because of my lack of knowledge in this area; particularly hot standby conflict detection/resolution. My idea was different: still log the zero page, just don't set LSN or TLI for a zero page in log_newpage() or heap_xlog_newpage(). This isn't as clean as your idea, but I'm a little more confident that it is correct. Both potential fixes attached and both appear to work. fix1 -- Only call PageSetLSN/TLI inside log_newpage() and heap_xlog_newpage() if the page is not zeroed. fix2 -- Don't call log_newpage() at all if the page is zeroed. Please review. I don't have a strong opinion about which one should be applied. 
Regards, Jeff Davis

*** a/src/backend/access/heap/heapam.c
--- b/src/backend/access/heap/heapam.c
***************
*** 4079,4086 ****
  log_newpage(RelFileNode *rnode, ForkNumber forkNum, BlockNumber blkno,

  	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata);

! 	PageSetLSN(page, recptr);
! 	PageSetTLI(page, ThisTimeLineID);

  	END_CRIT_SECTION();
--- 4079,4093 ----

  	recptr = XLogInsert(RM_HEAP_ID, XLOG_HEAP_NEWPAGE, rdata);

! 	/*
! 	 * The new page may be uninitialized. If so, we can't set the LSN
! 	 * and TLI because that would corrupt the page.
! 	 */
! 	if (!PageIsNew(page))
! 	{
! 		PageSetLSN(page, recptr);
! 		PageSetTLI(page, ThisTimeLineID);
! 	}

  	END_CRIT_SECTION();

***************
*** 4266,4273 ****
  heap_xlog_newpage(XLogRecPtr lsn, XLogRecord *record)

  	Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ);
  	memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ);

! 	PageSetLSN(page, lsn);
! 	PageSetTLI(page, ThisTimeLineID);
  	MarkBufferDirty(buffer);
  	UnlockReleaseBuffer(buffer);
  }
--- 4273,4288 ----

  	Assert(record->xl_len == SizeOfHeapNewpage + BLCKSZ);
  	memcpy(page, (char *) xlrec + SizeOfHeapNewpage, BLCKSZ);

! 	/*
! 	 * The new page may be uninitialized. If so, we can't set the LSN
! 	 * and TLI because that would corrupt the page.
! 	 */
! 	if (!PageIsNew(page))
! 	{
! 		PageSetLSN(page, lsn);
! 		PageSetTLI(page, ThisTimeLineID);
! 	}
!
  	MarkBufferDirty(buffer);
  	UnlockReleaseBuffer(buffer);
  }

*** a/src/backend/commands/tablecmds.c
--- b/src/backend/commands/tablecmds.c
***************
*** 7082,7089 ****
  copy_relation_data(SMgrRelation src, SMgrRelation dst,

  	smgrread(src, forkNum, blkno, buf);

! 	/* XLOG stuff */
! 	if (use_wal)
  		log_newpage(&dst->smgr_rnode, forkNum, blkno, page);

  	/*
--- 7082,7094 ----

  	smgrread(src, forkNum, blkno, buf);

! 	/*
! 	 * XLOG stuff
! 	 *
! 	 * Don't log page if it's new, because that will set the LSN
! 	 * and TLI, corrupting the page.
! 	 */
! 	if (use_wal && !PageIsNew(page))
  		log_newpage(&dst->smgr_rnode, forkNum, blkno, page);

  	/*
Re: [HACKERS] PostGIS vs. PGXS in 9.0beta3
Tom Lane wrote: Josh Berkus j...@agliodbs.com writes: A 9.0b3 tester reported this issue with our single most popular PostgreSQL extension, PostGIS: == Postgis makes use of 'PGXS' in postgresql 8.2. Within postgresql-9, datadir and many other variables are defined as multiple values with an append operator, like this: $ grep -i datadir /usr/pgsql-9.0/lib/pgxs/src/Makefile.global [snip] datadir := /usr/pgsql-9.0/share This analysis is nonsense on its face --- := is not an append operator and we do not have any multiple values for datadir. The referenced postgis-users thread seems to indicate that the postgis guys found and fixed a problem in their own makefiles. If not, we need a clearer description of what they think the problem is. The real problem has nothing to do with any of the analysis, as you say. It is this: they have an override file for PGXS and it uses $(mkinstalldirs) which we got rid of about a year ago. So apparently they haven't been testing much against any of our alphas or betas or they would have seen this long ago. The correct fix is to do the following in the PostGIS source root: sed -i -e 's/mkinstalldirs/MKDIR_P/' postgis/Makefile.pgxs cheers andrew
Re: [HACKERS] Improper usage of MemoryContext in nodeSubplan.c for buildSubPlanHash() function. This maybe causes allocate memory failed.
I wrote: Tao Ma feng_e...@163.com writes: This is a potential memory error in nodeSubplan.c or execGrouping.c Using select '1'::TEXT IN ((SELECT '1'::NAME) UNION ALL SELECT '1'::NAME); to reproduce this bug. ... To fix this problem, we can use another memory context to pass into BuildTupleHashTable() as the hashtable's tempcxt, or use some other memory context as the hash table's tempcxt, or other ways. Yeah, I think you're right --- we can't get away with reusing the innerecontext's per-tuple context for the hashtable temp contexts. The best solution is probably to make an additional context that does nothing but act as the hashtable temp context. I've committed a fix along those lines. Thanks for the report! regards, tom lane
Re: [HACKERS] Performance Enhancement/Fix for Array Utility Functions
1. As-is, it's a significant *pessimization* for small arrays, because the heap_tuple_untoast_attr_slice code does a palloc/copy even when one is not needed because the data is already not toasted. I think there needs to be a code path that avoids that. This seems like it shouldn't be too hard to fix, and I think it should be fixed. Do you have any suggestions where to start? I do agree that this should be fixed as well. I don't have too much time to dedicate to this project. I can try to put in some time this weekend though if it isn't looking too bad. 2. Arrays that are large enough to be pushed out to toast storage are almost certainly going to get compressed first. The potential win in this case is very limited because heap_tuple_untoast_attr_slice will fetch and decompress the whole thing. Admittedly this is a limitation of the existing code and not a fault of the patch proper, but still, if you want to make something that's generically useful, you need to do something about that. Perhaps pglz_decompress() could be extended with an argument to say decompress no more than this much --- although that would mean adding another test to its inner loop, so we'd need to check for performance degradation. I'm also unsure how to predict how much compressed data needs to be read in to get at least N bytes of decompressed data. But this part seems to me to be something that can, and probably should, be left for a separate patch. I don't see that we're losing anything this way; we're just not winning as much as we otherwise might. To really fix this, I suspect we'd need a version of pglz_decompress that can operate in streaming mode; you tell it how many bytes you want, and it returns a code indicating that either (a) it decompressed that many bytes or (b) it decompressed less than that many bytes and you can call it again with another chunk of data if you want to keep going. 
That'd probably be a good thing to have, but I don't think it needs to be a prerequisite for this patch. Hopefully this is the case. I can try tackling the first part, however. Thanks, Mike -- Michael Lewis lolrus.org mikelikes...@gmail.com