Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Tom Lane
Florian Pflug  writes:
> On Jun 19, 2010, at 21:13 , Tom Lane wrote:
>> This is nonsense --- the slave's kernel *will* eventually notice that
>> the TCP connection is dead, and tell walreceiver so.  I don't doubt
>> that the standard TCP timeout is longer than people want to wait for
>> that, but claiming that it will never happen is simply wrong.

> No, Robert is correct AFAIK. If you're *waiting* for data, TCP
> generates no traffic (expect with keepalive enabled).

Mph.  I was thinking that keepalive was on by default with a very long
interval, but I see this isn't so.  However, if we enable keepalive,
then it's irrelevant to the point anyway.  Nobody's produced any
evidence that keepalive is an unsuitable solution.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Joshua D. Drake
On Sun, 2010-06-20 at 03:42 +0100, Peter Geoghegan wrote:
> >> Thoughts?

> It's very cool to be able to write queries like SELECT * FROM payments
> WHERE payment_type = 'cash', rather than having to recall time and
> again what the PK of cash is within your lookup table.

Ahem. That is what a natural key is for :) 

Joshua D. Drake


> 
> -- 
> Regards,
> Peter Geoghegan
> 

-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Peter Geoghegan
>> Thoughts?
>
> enum types exist as an optimisation-by-avoidance of referential
> integrity.
>
> We're a relational database, so IMHO we should spend time performance
> tuning RI.

I take the view that they exist as a way of representing enumerations
of application/domain values - if it's hard coded in the application,
it's hard coded in the database by using an enum. This is why it isn't
that big a problem that they cannot be amended - they ought to be very
static, immutable values in the first place. I still think it would be
handy to be able to append new values though, and not have to ALTER
COLUMN USING to a new enum type. Besides, using enums in place of
lookup tables doesn't really make much sense as an optimisation.

It's very cool to be able to write queries like SELECT * FROM payments
WHERE payment_type = 'cash', rather than having to recall time and
again what the PK of cash is within your lookup table.

-- 
Regards,
Peter Geoghegan

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Andrew Dunstan



Simon Riggs wrote:

On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote:

  

Thoughts?



enum types exist as an optimisation-by-avoidance of referential
integrity.

We're a relational database, so IMHO we should spend time performance
tuning RI. 

  


I don't accept your initial assertion at all. But in any case, these are 
not mutually exclusive. Your work tuning RI will not obstruct mine in 
making enums more useful, nor vice versa.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Simon Riggs
On Fri, 2010-06-18 at 11:50 -0400, Andrew Dunstan wrote:

> Thoughts?

enum types exist as an optimisation-by-avoidance of referential
integrity.

We're a relational database, so IMHO we should spend time performance
tuning RI. 

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Simon Riggs
On Sat, 2010-06-19 at 14:53 -0400, Robert Haas wrote:
> On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark  wrote:
> > On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas  wrote:
> >> 4. Streaming Replication needs to detect death of master.  We need
> >> some sort of keep-alive, here.  Whether it's at the TCP level (as
> >> advocated by Tom Lane and others) or at the protocol level (as
> >> advocated by Greg Stark) is something that we have yet to decide; once
> >> it's decided, someone will need to do it...
> >
> > This sounds like a useful feature but I don't see why it's not 9.1
> > material. The status quo is that the expected usage pattern is manual
> > failover. As long as the slave responds to manual intervention when in
> > this state I don't think this is a blocking issue. Monitoring and
> > automatic failover are clearly things we plan to add features to
> > handle better in the future.
> 
> Right now, if the SR master reboots unexpectedly (say, power plug pull
> and restart), the slave never notices.  It just sits there forever
> waiting for the next byte of data from the master to arrive (which it
> never will).  You have to manually restart the server or hit
> walreceiver with a SIGTERM to get it to start streaming agian.  I
> guess we could decide we're just not going to deal with that, but it
> seems like a fairly large misfeature to me.

Are you saying it doesn't respond to a trigger file any any point? That
would be a problem.

Sounds like we should have a pg_restart_walreceiver() function. We
shouldn't be encouraging people to send signals to backends, its too
easy to get wrong.

-- 
 Simon Riggs   www.2ndQuadrant.com
 PostgreSQL Development, 24x7 Support, Training and Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Florian Pflug
On Jun 19, 2010, at 21:13 , Tom Lane wrote:
> Robert Haas  writes:
>> Right now, if the SR master reboots unexpectedly (say, power plug pull
>> and restart), the slave never notices.  It just sits there forever
>> waiting for the next byte of data from the master to arrive (which it
>> never will).
> 
> This is nonsense --- the slave's kernel *will* eventually notice that
> the TCP connection is dead, and tell walreceiver so.  I don't doubt
> that the standard TCP timeout is longer than people want to wait for
> that, but claiming that it will never happen is simply wrong.

No, Robert is correct AFAIK. If you're *waiting* for data, TCP generates no 
traffic (expect with keepalive enabled). From the slave's kernel POV, a dead 
master is therefore indistinguishable from a inactive master.

Things are different from a sender's POV, though. Since sent data is ACK'ed by 
the receiving end, the TCP stack can (and does) detect a broken connection.

best regards,
Florian Pflug


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Merlin Moncure
On Sat, Jun 19, 2010 at 4:55 PM, Andrew Dunstan  wrote:
> Gurjeet Singh wrote:
>>
>>
>> This is very similar to Andrew's original suggestion of splitting 32 bits
>> into 16+16, but managed by the machine hence no complicated comparison algos
>> needed on our part. Also, since this is all transparent to the SQL
>> interface, our dump-reload cycle or Slony replication, etc. should not be
>> affected either.
>>
>>
>
> It would break the on-disk representation, though. That's not something we
> want to do any more if it can possibly be avoided. We want to keep
> pg_upgrade working.

I was partial to your original idea -- i thought it was quite clever
actually.  enums are a performance side of a tradeoff already so I
think any improvement for them should be looked at through that lens.

16 bits is IMO enough to pick a reasonable bucket size that gives you
enough play to handle the vast majority of cases that are appropriate
for enums.  your workaround in the rare case you actually hit the
limitations (most of these would fall under the 'oops, i used the
wrong tool' category) seems perfectly ok imo.

merlin

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Andrew Dunstan



Gurjeet Singh wrote:



This is very similar to Andrew's original suggestion of splitting 32 
bits into 16+16, but managed by the machine hence no complicated 
comparison algos needed on our part. Also, since this is all 
transparent to the SQL interface, our dump-reload cycle or Slony 
replication, etc. should not be affected either.





It would break the on-disk representation, though. That's not something 
we want to do any more if it can possibly be avoided. We want to keep 
pg_upgrade working.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Gurjeet Singh
On Fri, Jun 18, 2010 at 12:59 PM, Andrew Dunstan wrote:

>
>
> David E. Wheeler wrote:
>
>> On Jun 18, 2010, at 9:34 AM, Andrew Dunstan wrote:
>>
>>
>>
>>> I'd be perfectly happy to hear a reasonable alternative. Assuming we use
>>> some integer representation, given two labels represented by n and n+1, we
>>> can't add a label between them without rewriting the tables that use the
>>> type, whether it's my representation scheme or some other. Maybe we could
>>> have a FORCE option which would rewrite if necessary.
>>>
>>>
>>
>> People would likely always use it.
>>
>> Why not use a decimal number?
>>
>>
>>
>>
>
> You are just bumping up the storage cost. Part of the attraction of enums
> is their efficiency.
>
>
Probably it'd be the same as the decimal suggestion above, but we can use
floating-point data type.

It will allow injection of a new label at any stage.

CREATE leads to

Label1 -> 1.0
Label2 -> 2.0
Label3 -> 3.0

ALTER ... ADD Label4 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label4 -> 2.5
Label3 -> 3.0

ALTER ... ADD Label5 AFTER Label2; leads to
Label1 -> 1.0
Label2 -> 2.0
Label5 -> 2.25
Label4 -> 2.5
Label3 -> 3.0

Since floating-point implementation is architecture dependent, the ALTER
command should check that the injected value does not equate to any value
around it (eg. comparisons of (2.5  == 2) and (2.25 == 2.5) should not yield
0); and if it does, then throw an error and ask the user to use the
rewrite-the-table version of the command.

And since it is still 32 bit, and comparisons done by machine, performance
should be acceptably close to current integer comparisons, and much faster
that the cache lookups etc. being proposed.

This is very similar to Andrew's original suggestion of splitting 32 bits
into 16+16, but managed by the machine hence no complicated comparison algos
needed on our part. Also, since this is all transparent to the SQL
interface, our dump-reload cycle or Slony replication, etc. should not be
affected either.

Regards,
-- 
gurjeet.singh
@ EnterpriseDB - The Enterprise Postgres Company
http://www.EnterpriseDB.com

singh.gurj...@{ gmail | yahoo }.com
Twitter/Skype: singh_gurjeet

Mail sent from my BlackLaptop device


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Stefan Kaltenbrunner

On 06/19/2010 09:13 PM, Tom Lane wrote:

Robert Haas  writes:

Right now, if the SR master reboots unexpectedly (say, power plug pull
and restart), the slave never notices.  It just sits there forever
waiting for the next byte of data from the master to arrive (which it
never will).


This is nonsense --- the slave's kernel *will* eventually notice that
the TCP connection is dead, and tell walreceiver so.  I don't doubt
that the standard TCP timeout is longer than people want to wait for
that, but claiming that it will never happen is simply wrong.

I think that enabling slave-side TCP keepalives and control of the
keepalive timeout parameters is probably sufficient for 9.0 here.


yeah I would agree - we do have tcp keepalive code in the backend for a 
while now and adding that to libpq as well just seems like an easy 
enough fix at this time in the release cycle.



Stefan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Andres Freund
On Saturday 19 June 2010 18:05:34 Joshua D. Drake wrote:
> On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote:
> > 4. Streaming Replication needs to detect death of master.  We need
> > some sort of keep-alive, here.  Whether it's at the TCP level (as
> > advocated by Tom Lane and others) or at the protocol level (as
> > advocated by Greg Stark) is something that we have yet to decide; once
> > it's decided, someone will need to do it...
> 
> TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I
> humbly suggest we *not* be pedantic and implement something practical
> and less prone to variables outside the control of Pg.
And has the huge advantage of being implementable in about 5 lines of C 
(setsockopt + error checking). Considering what time in the release cycle this 
is...

Andres

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Tom Lane
Robert Haas  writes:
> Right now, if the SR master reboots unexpectedly (say, power plug pull
> and restart), the slave never notices.  It just sits there forever
> waiting for the next byte of data from the master to arrive (which it
> never will).

This is nonsense --- the slave's kernel *will* eventually notice that
the TCP connection is dead, and tell walreceiver so.  I don't doubt
that the standard TCP timeout is longer than people want to wait for
that, but claiming that it will never happen is simply wrong.

I think that enabling slave-side TCP keepalives and control of the
keepalive timeout parameters is probably sufficient for 9.0 here.

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Tom Lane
Andrew Dunstan  writes:
> Another thought: could we add a column to pg_type with a flag that's 
> true if the oids are in sort order? Then the comparison routines could 
> just look that up in the type cache and if it's true (as it often will 
> be) just return the oid comparison.

Well, having to do a cache lookup already makes it a couple orders of
magnitude more expensive than an OID comparison.  However, it's hard to
say how much that matters in terms of total application performance.
We really could do with a bit of performance testing here ...

regards, tom lane

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extensible enum types

2010-06-19 Thread Andrew Dunstan



Tom Lane wrote:

Andrew Dunstan  writes:
  

Tom Lane wrote:


Insert a sort order column into pg_enum, and rearrange the values in
that whenever the user wants to add a new value in a particular place.
You give up cheap comparisons in exchange for flexibility.  I think lots
of people would accept that tradeoff, especially if they could make it
per-datatype.
  


  

But I'm not happy about giving up cheap comparison.



I don't think it would be all that bad.  We could teach typcache.c to
cache the ordering data for any type that's in active use.  It'd
certainly be a lot more expensive than OID comparison, but perhaps not
worse than NUMERIC comparisons.

  
  


Another thought: could we add a column to pg_type with a flag that's 
true if the oids are in sort order? Then the comparison routines could 
just look that up in the type cache and if it's true (as it often will 
be) just return the oid comparison.


cheers

andrew

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Robert Haas
On Sat, Jun 19, 2010 at 2:46 PM, Greg Stark  wrote:
> On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas  wrote:
>> 4. Streaming Replication needs to detect death of master.  We need
>> some sort of keep-alive, here.  Whether it's at the TCP level (as
>> advocated by Tom Lane and others) or at the protocol level (as
>> advocated by Greg Stark) is something that we have yet to decide; once
>> it's decided, someone will need to do it...
>
> This sounds like a useful feature but I don't see why it's not 9.1
> material. The status quo is that the expected usage pattern is manual
> failover. As long as the slave responds to manual intervention when in
> this state I don't think this is a blocking issue. Monitoring and
> automatic failover are clearly things we plan to add features to
> handle better in the future.

Right now, if the SR master reboots unexpectedly (say, power plug pull
and restart), the slave never notices.  It just sits there forever
waiting for the next byte of data from the master to arrive (which it
never will).  You have to manually restart the server or hit
walreceiver with a SIGTERM to get it to start streaming agian.  I
guess we could decide we're just not going to deal with that, but it
seems like a fairly large misfeature to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Greg Stark
On Sat, Jun 19, 2010 at 2:43 PM, Robert Haas  wrote:
> 4. Streaming Replication needs to detect death of master.  We need
> some sort of keep-alive, here.  Whether it's at the TCP level (as
> advocated by Tom Lane and others) or at the protocol level (as
> advocated by Greg Stark) is something that we have yet to decide; once
> it's decided, someone will need to do it...

This sounds like a useful feature but I don't see why it's not 9.1
material. The status quo is that the expected usage pattern is manual
failover. As long as the slave responds to manual intervention when in
this state I don't think this is a blocking issue. Monitoring and
automatic failover are clearly things we plan to add features to
handle better in the future.


-- 
greg

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] beta3 & the open items list

2010-06-19 Thread Joshua D. Drake
On Sat, 2010-06-19 at 09:43 -0400, Robert Haas wrote:

> 4. Streaming Replication needs to detect death of master.  We need
> some sort of keep-alive, here.  Whether it's at the TCP level (as
> advocated by Tom Lane and others) or at the protocol level (as
> advocated by Greg Stark) is something that we have yet to decide; once
> it's decided, someone will need to do it...

TCP involves unknowns, such as firewalls, vpn routers and ssh tunnels. I
humbly suggest we *not* be pedantic and implement something practical
and less prone to variables outside the control of Pg.

Sincerely,

Joshua D. Drake


-- 
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 509.416.6579
Consulting, Training, Support, Custom Development, Engineering


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] beta3 & the open items list

2010-06-19 Thread Robert Haas
It would be nice to get beta3 out the door sooner rather than later,
but I sort of feel like we're not ready yet.  In fact, we seem to be a
bit stalled.  The open items list currently lists four items.

1. max_standby_delay.  Tom has committed to getting this done, but has
been tied up with non-PostgreSQL related work for the last few weeks.

2. infinite repeat of warning message in standby.  Heikki changed the
code so this isn't a tight loop any more, which is an improvement, but
we've discussed the fact that retrying forever may not be the best
behavior.

http://archives.postgresql.org/pgsql-hackers/2010-06/msg00806.php
http://archives.postgresql.org/pgsql-hackers/2010-06/msg00838.php

I am not clear, however, on how difficult it is to implement the
proposed behavior, and I'm not sure Heikki's on board with the
proposed change.

3. supply alternate hstore operator for equals-greater in preparation
for later user in function parameter assignment.  There's some work
left to be done here but it's pretty minor.  Mostly we're arguing
about whether to call the hstore slice operator +> or & or % or %> --
I've written three patches to rename it so far (to three different
alternative names), one of which I committed, and there's still
ongoing discussion as to whether to rename it again and/or remove it.
Aside from that, we need to deal with the singleton-hstore constructor
(text => text); I believe the consensus there is to remove the
operator in favor of the underlying hstore(text, text) function and
backpatch that function name into the back-branches to facilitate
writing hstore code that is portable across major PostgreSQL releases.

4. Streaming Replication needs to detect death of master.  We need
some sort of keep-alive, here.  Whether it's at the TCP level (as
advocated by Tom Lane and others) or at the protocol level (as
advocated by Greg Stark) is something that we have yet to decide; once
it's decided, someone will need to do it...

It would be nice if we could make a final push to get these issues
resolved and another beta out the door before the end of the month...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise Postgres Company

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] About tapes

2010-06-19 Thread mac_man2...@hotmail.it

Tom, Robert,
thank you.

Now it is clearer how space on tapes is recycled.

I tried to follow Robert's example but storing one tape per separate file.
Read in the first block of each run (hosted by separate tapes and so by 
separate files) and output them into extra storage, wherever this extra 
storage is.

Again, those first input blocks are now garbage: is it correct?
In this case, what happens when trying to recycle those garbage blocks 
by hosting the result of merging the second block of each run?




Il 18/06/2010 23:29, Robert Haas ha scritto:

On Fri, Jun 18, 2010 at 3:46 PM, mac_man2...@hotmail.it
  wrote:
   

Which is the difference between having more than one tape into a file and
having one tape per file?
 

It makes it easier to recycle space a little at a time.  Suppose
you're merging two runs of 100 blocks each.  You read in a block from
each run and write out two output blocks.  Now that you've done that,
the first block of each of the input runs is garbage and can be
recycled - but if the input runs and the output run are in three
separate files, there's no easy way to do that.  You can truncate a
file (and throw away the end) but there's no easy way to throw away
the BEGINNING of a file.  So you'll probably have to hold on to the
entirety of both inputs until you've written the entirety of the
output.

On the other hand, suppose you have all the blocks in one big file.
The first input run is in blocks 1-100; the second is in blocks
101-200.  You can read blocks 1 and 101, say, and write the results to
blocks 201 and 202, using extra storage, but only a little bit.  When
you then read blocks 2 and 102, you write the results to blocks 1 and
100, which are no longer needed, because you've already merged them.
When you get done with that, blocks 2 and 102 are now no longer needed
and can be used to write the next part of the output.  Of course, you
have to keep track of which order to reread the blocks in when the
sort is done: 201, 202, 1, 101, ... but that's a manageable problem.

   



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers