Re: [HACKERS] beta3 & the open items list
Florian Pflug writes: > On Jun 19, 2010, at 21:13 , Tom Lane wrote: >> This is nonsense --- the slave's kernel *will* eventually notice that >> the TCP connection is dead, and tell walreceiver so. I don't doubt >> that the standard TCP timeout is longer than people want to wait for >> that, but claiming that it will never happen is simply wrong. > No, Robert is correct AFAIK. If you're *waiting* for data, TCP > generates no traffic (expect with keepalive enabled). Mph. I was thinking that keepalive was on by default with a very long interval, but I see this isn't so. However, if we enable keepalive, then it's irrelevant to the point anyway. Nobody's produced any evidence that keepalive is an unsuitable solution. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
On Sun, 2010-06-20 at 03:42 +0100, Peter Geoghegan wrote: > >> Thoughts? > It's very cool to be able to write queries like SELECT * FROM payments > WHERE payment_type = 'cash', rather than having to recall time and > again what the PK of cash is within your lookup table. Ahem. That is what a natural key is for :) Joshua D. Drake > > -- > Regards, > Peter Geoghegan > -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
>> Thoughts? > > enum types exist as an optimisation-by-avoidance of referential > integrity. > > We're a relational database, so IMHO we should spend time performance > tuning RI. I take the view that they exist as a way of representing enumerations of application/domain values - if it's hard coded in the application, it's hard coded in the database by using an enum. This is why it isn't that big a problem that they cannot be amended - they ought to be very static, immutable values in the first place. I still think it would be handy to be able to append new values though, and not have to ALTER COLUMN USING to a new enum type. Besides, using enums in place of lookup tables doesn't really make much sense as an optimisation. It's very cool to be able to write queries like SELECT * FROM payments WHERE payment_type = 'cash', rather than having to recall time and again what the PK of cash is within your lookup table. -- Regards, Peter Geoghegan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
Simon Riggs wrote: On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote: Thoughts? enum types exist as an optimisation-by-avoidance of referential integrity. We're a relational database, so IMHO we should spend time performance tuning RI. I don't accept your initial assertion at all. But in any case, these are not mutually exclusive. Your work tuning RI will not obstruct mine in making enums more useful, nor vice versa. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote: > Thoughts? enum types exist as an optimisation-by-avoidance of referential integrity. We're a relational database, so IMHO we should spend time performance tuning RI. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Sat, 2010-06-19 at 14:53 -0400, Robert Haas wrote: > On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark wrote: > > On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas wrote: > >> 4. Streaming Replication needs to detect death of master. We need > >> some sort of keep-alive, here. Whether it's at the TCP level (as > >> advocated by Tom Lane and others) or at the protocol level (as > >> advocated by Greg Stark) is something that we have yet to decide; once > >> it's decided, someone will need to do it... > > > > This sounds like a useful feature but I don't see why it's not 9.1 > > material. The status quo is that the expected usage pattern is manual > > failover. As long as the slave responds to manual intervention when in > > this state I don't think this is a blocking issue. Monitoring and > > automatic failover are clearly things we plan to add features to > > handle better in the future. > > Right now, if the SR master reboots unexpectedly (say, power plug pull > and restart), the slave never notices. It just sits there forever > waiting for the next byte of data from the master to arrive (which it > never will). You have to manually restart the server or hit > walreceiver with a SIGTERM to get it to start streaming agian. I > guess we could decide we're just not going to deal with that, but it > seems like a fairly large misfeature to me. Are you saying it doesn't respond to a trigger file any any point? That would be a problem. Sounds like we should have a pg_restart_walreceiver() function. We shouldn't be encouraging people to send signals to backends, its too easy to get wrong. -- Simon Riggs www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Jun 19, 2010, at 21:13 , Tom Lane wrote: > Robert Haas writes: >> Right now, if the SR master reboots unexpectedly (say, power plug pull >> and restart), the slave never notices. It just sits there forever >> waiting for the next byte of data from the master to arrive (which it >> never will). > > This is nonsense --- the slave's kernel *will* eventually notice that > the TCP connection is dead, and tell walreceiver so. I don't doubt > that the standard TCP timeout is longer than people want to wait for > that, but claiming that it will never happen is simply wrong. No, Robert is correct AFAIK. If you're *waiting* for data, TCP generates no traffic (expect with keepalive enabled). From the slave's kernel POV, a dead master is therefore indistinguishable from a inactive master. Things are different from a sender's POV, though. Since sent data is ACK'ed by the receiving end, the TCP stack can (and does) detect a broken connection. best regards, Florian Pflug -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
On Sat, Jun 19, 2010 at 4:55 PM, Andrew Dunstan wrote: > Gurjeet Singh wrote: >> >> >> This is very similar to Andrew's original suggestion of splitting 32 bits >> into 16+16, but managed by the machine hence no complicated comparison algos >> needed on our part. Also, since this is all transparent to the SQL >> interface, our dump-reload cycle or Slony replication, etc. should not be >> affected either. >> >> > > It would break the on-disk representation, though. That's not something we > want to do any more if it can possibly be avoided. We want to keep > pg_upgrade working. I was partial to your original idea -- i thought it was quite clever actually. enums are a performance side of a tradeoff already so I think any improvement for them should be looked at through that lens. 16 bits is IMO enough to pick a reasonable bucket size that gives you enough play to handle the vast majority of cases that are appropriate for enums. your workaround in the rare case you actually hit the limitations (most of these would fall under the 'oops, i used the wrong tool' category) seems perfectly ok imo. merlin -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
Gurjeet Singh wrote: This is very similar to Andrew's original suggestion of splitting 32 bits into 16+16, but managed by the machine hence no complicated comparison algos needed on our part. Also, since this is all transparent to the SQL interface, our dump-reload cycle or Slony replication, etc. should not be affected either. It would break the on-disk representation, though. That's not something we want to do any more if it can possibly be avoided. We want to keep pg_upgrade working. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan wrote: > > > David E. Wheeler wrote: > >> On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote: >> >> >> >>> I'd be perfectly happy to hear a reasonable alternative. Assuming we use >>> some integer representation, given two labels represented by n and n+1, we >>> can't add a label between them without rewriting the tables that use the >>> type, whether it's my representation scheme or some other. Maybe we could >>> have a FORCE option which would rewrite if necessary. >>> >>> >> >> People would likely always use it. >> >> Why not use a decimal number? >> >> >> >> > > You are just bumping up the storage cost. Part of the attraction of enums > is their efficiency. > > Probably it'd be the same as the decimal suggestion above, but we can use floating-point data type. It will allow injection of a new label at any stage. CREATE leads to Label1 -> 1.0 Label2 -> 2.0 Label3 -> 3.0 ALTER ... ADD Label4 AFTER Label2; leads to Label1 -> 1.0 Label2 -> 2.0 Label4 -> 2.5 Label3 -> 3.0 ALTER ... ADD Label5 AFTER Label2; leads to Label1 -> 1.0 Label2 -> 2.0 Label5 -> 2.25 Label4 -> 2.5 Label3 -> 3.0 Since floating-point implementation is architecture dependent, the ALTER command should check that the injected value does not equate to any value around it (eg. comparisons of (2.5 == 2) and (2.25 == 2.5) should not yield 0); and if it does, then throw an error and ask the user to use the rewrite-the-table version of the command. And since it is still 32 bit, and comparisons done by machine, performance should be acceptably close to current integer comparisons, and much faster that the cache lookups etc. being proposed. This is very similar to Andrew's original suggestion of splitting 32 bits into 16+16, but managed by the machine hence no complicated comparison algos needed on our part. Also, since this is all transparent to the SQL interface, our dump-reload cycle or Slony replication, etc. should not be affected either. Regards, -- gurjeet.singh @ EnterpriseDB - The Enterprise Postgres Company http://www.EnterpriseDB.com singh.gurj...@{ gmail | yahoo }.com Twitter/Skype: singh_gurjeet Mail sent from my BlackLaptop device
Re: [HACKERS] beta3 & the open items list
On 06/19/2010 09:13 PM, Tom Lane wrote: Robert Haas writes: Right now, if the SR master reboots unexpectedly (say, power plug pull and restart), the slave never notices. It just sits there forever waiting for the next byte of data from the master to arrive (which it never will). This is nonsense --- the slave's kernel *will* eventually notice that the TCP connection is dead, and tell walreceiver so. I don't doubt that the standard TCP timeout is longer than people want to wait for that, but claiming that it will never happen is simply wrong. I think that enabling slave-side TCP keepalives and control of the keepalive timeout parameters is probably sufficient for 9.0 here. yeah I would agree - we do have tcp keepalive code in the backend for a while now and adding that to libpq as well just seems like an easy enough fix at this time in the release cycle. Stefan -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote: > On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote: > > 4. Streaming Replication needs to detect death of master. We need > > some sort of keep-alive, here. Whether it's at the TCP level (as > > advocated by Tom Lane and others) or at the protocol level (as > > advocated by Greg Stark) is something that we have yet to decide; once > > it's decided, someone will need to do it... > > TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I > humbly suggest we *not* be pedantic and implement something practical > and less prone to variables outside the control of Pg. And has the huge advantage of being implementable in about 5 lines of C (setsockopt + error checking). Considering what time in the release cycle this is... Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
Robert Haas writes: > Right now, if the SR master reboots unexpectedly (say, power plug pull > and restart), the slave never notices. It just sits there forever > waiting for the next byte of data from the master to arrive (which it > never will). This is nonsense --- the slave's kernel *will* eventually notice that the TCP connection is dead, and tell walreceiver so. I don't doubt that the standard TCP timeout is longer than people want to wait for that, but claiming that it will never happen is simply wrong. I think that enabling slave-side TCP keepalives and control of the keepalive timeout parameters is probably sufficient for 9.0 here. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
Andrew Dunstan writes: > Another thought: could we add a column to pg_type with a flag that's > true if the oids are in sort order? Then the comparison routines could > just look that up in the type cache and if it's true (as it often will > be) just return the oid comparison. Well, having to do a cache lookup already makes it a couple orders of magnitude more expensive than an OID comparison. However, it's hard to say how much that matters in terms of total application performance. We really could do with a bit of performance testing here ... regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] extensible enum types
Tom Lane wrote: Andrew Dunstan writes: Tom Lane wrote: Insert a sort order column into pg_enum, and rearrange the values in that whenever the user wants to add a new value in a particular place. You give up cheap comparisons in exchange for flexibility. I think lots of people would accept that tradeoff, especially if they could make it per-datatype. But I'm not happy about giving up cheap comparison. I don't think it would be all that bad. We could teach typcache.c to cache the ordering data for any type that's in active use. It'd certainly be a lot more expensive than OID comparison, but perhaps not worse than NUMERIC comparisons. Another thought: could we add a column to pg_type with a flag that's true if the oids are in sort order? Then the comparison routines could just look that up in the type cache and if it's true (as it often will be) just return the oid comparison. cheers andrew -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark wrote: > On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas wrote: >> 4. Streaming Replication needs to detect death of master. We need >> some sort of keep-alive, here. Whether it's at the TCP level (as >> advocated by Tom Lane and others) or at the protocol level (as >> advocated by Greg Stark) is something that we have yet to decide; once >> it's decided, someone will need to do it... > > This sounds like a useful feature but I don't see why it's not 9.1 > material. The status quo is that the expected usage pattern is manual > failover. As long as the slave responds to manual intervention when in > this state I don't think this is a blocking issue. Monitoring and > automatic failover are clearly things we plan to add features to > handle better in the future. Right now, if the SR master reboots unexpectedly (say, power plug pull and restart), the slave never notices. It just sits there forever waiting for the next byte of data from the master to arrive (which it never will). You have to manually restart the server or hit walreceiver with a SIGTERM to get it to start streaming agian. I guess we could decide we're just not going to deal with that, but it seems like a fairly large misfeature to me. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas wrote: > 4. Streaming Replication needs to detect death of master. We need > some sort of keep-alive, here. Whether it's at the TCP level (as > advocated by Tom Lane and others) or at the protocol level (as > advocated by Greg Stark) is something that we have yet to decide; once > it's decided, someone will need to do it... This sounds like a useful feature but I don't see why it's not 9.1 material. The status quo is that the expected usage pattern is manual failover. As long as the slave responds to manual intervention when in this state I don't think this is a blocking issue. Monitoring and automatic failover are clearly things we plan to add features to handle better in the future. -- greg -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] beta3 & the open items list
On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote: > 4. Streaming Replication needs to detect death of master. We need > some sort of keep-alive, here. Whether it's at the TCP level (as > advocated by Tom Lane and others) or at the protocol level (as > advocated by Greg Stark) is something that we have yet to decide; once > it's decided, someone will need to do it... TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I humbly suggest we *not* be pedantic and implement something practical and less prone to variables outside the control of Pg. Sincerely, Joshua D. Drake -- PostgreSQL.org Major Contributor Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579 Consulting, Training, Support, Custom Development, Engineering -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] beta3 & the open items list
It would be nice to get beta3 out the door sooner rather than later, but I sort of feel like we're not ready yet. In fact, we seem to be a bit stalled. The open items list currently lists four items. 1. max_standby_delay. Tom has committed to getting this done, but has been tied up with non-PostgreSQL related work for the last few weeks. 2. infinite repeat of warning message in standby. Heikki changed the code so this isn't a tight loop any more, which is an improvement, but we've discussed the fact that retrying forever may not be the best behavior. http://archives.postgresql.org/pgsql-hackers/2010-06/msg00806.php http://archives.postgresql.org/pgsql-hackers/2010-06/msg00838.php I am not clear, however, on how difficult it is to implement the proposed behavior, and I'm not sure Heikki's on board with the proposed change. 3. supply alternate hstore operator for equals-greater in preparation for later user in function parameter assignment. There's some work left to be done here but it's pretty minor. Mostly we're arguing about whether to call the hstore slice operator +> or & or % or %> -- I've written three patches to rename it so far (to three different alternative names), one of which I committed, and there's still ongoing discussion as to whether to rename it again and/or remove it. Aside from that, we need to deal with the singleton-hstore constructor (text => text); I believe the consensus there is to remove the operator in favor of the underlying hstore(text, text) function and backpatch that function name into the back-branches to facilitate writing hstore code that is portable across major PostgreSQL releases. 4. Streaming Replication needs to detect death of master. We need some sort of keep-alive, here. Whether it's at the TCP level (as advocated by Tom Lane and others) or at the protocol level (as advocated by Greg Stark) is something that we have yet to decide; once it's decided, someone will need to do it... It would be nice if we could make a final push to get these issues resolved and another beta out the door before the end of the month... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise Postgres Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] About tapes
Tom, Robert, thank you. Now it is clearer how space on tapes is recycled. I tried to follow Robert's example but storing one tape per separate file. Read in the first block of each run (hosted by separate tapes and so by separate files) and output them into extra storage, wherever this extra storage is. Again, those first input blocks are now garbage: is it correct? In this case, what happens when trying to recycle those garbage blocks by hosting the result of merging the second block of each run? Il 18/06/2010 23:29, Robert Haas ha scritto: On Fri, Jun 18, 2010 at 3:46 PM, mac_man2...@hotmail.it wrote: Which is the difference between having more than one tape into a file and having one tape per file? It makes it easier to recycle space a little at a time. Suppose you're merging two runs of 100 blocks each. You read in a block from each run and write out two output blocks. Now that you've done that, the first block of each of the input runs is garbage and can be recycled - but if the input runs and the output run are in three separate files, there's no easy way to do that. You can truncate a file (and throw away the end) but there's no easy way to throw away the BEGINNING of a file. So you'll probably have to hold on to the entirety of both inputs until you've written the entirety of the output. On the other hand, suppose you have all the blocks in one big file. The first input run is in blocks 1-100; the second is in blocks 101-200. You can read blocks 1 and 101, say, and write the results to blocks 201 and 202, using extra storage, but only a little bit. When you then read blocks 2 and 102, you write the results to blocks 1 and 100, which are no longer needed, because you've already merged them. When you get done with that, blocks 2 and 102 are now no longer needed and can be used to write the next part of the output. Of course, you have to keep track of which order to reread the blocks in when the sort is done: 201, 202, 1, 101, ... but that's a manageable problem. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers