Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Andres Freund
On 2014-02-18 18:10:02 -0800, Peter Geoghegan wrote:
 On Tue, Feb 18, 2014 at 5:50 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
  Peter Geoghegan wrote:
  I've had multiple complaints of apparent data loss on 9.3.2 customer
  databases. There are 2 total, both complaints from the past week, one
  of which I was able to confirm. The customer's complaint is that
  certain rows are either visible or invisible, depending on whether an
  index scan is used or a sequential scan (I confirmed this with an
  EXPLAIN ANALYZE).
 
  The multixact bugs would cause tuples to be hidden at the heap level.
  If the tuples are visible in a seqscan, then these are more likely to be
  related to an index problem, not a multixact problem.
 
 I guess I wasn't clear enough here: The row in question was visible
 with a sequential scan but *not* with an index scan. So you have it
 backwards here (understandably).

Was there an index-only scan or just an index scan? Any chance of a
corrupted index?

Do you still have the page from before you did the VACUUM?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund and...@2ndquadrant.com wrote:
 Was there an index-only scan or just an index scan? Any chance of a
 corrupted index?

Just an index scan. I think it's unlikely to be a corrupt index,
because the customer said that he dropped and restored the index, and
it's difficult to imagine how VACUUM FREEZE could then have impacted a
plan with only a sequential scan.
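
For anyone reproducing this, forcing each plan type with the enable_*
settings is enough to show the discrepancy (table and column names below
are purely illustrative):

    SET enable_indexscan = off; SET enable_bitmapscan = off;
    SELECT ctid, * FROM some_table WHERE id = 42;  -- returned by the seq scan

    RESET enable_indexscan; RESET enable_bitmapscan; SET enable_seqscan = off;
    SELECT ctid, * FROM some_table WHERE id = 42;  -- missing from the index scan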

 Do you still have the page from before you did the VACUUM?

It seems likely that I could get it, given that the problem persisted
over several days (apparently there was an earlier, manual attempt to
repair the damage by hand). I'm reasonably confident that I could get
back an affected database by performing a point-in-time recovery. That
would also give me the benefit of an environment that I could break as
needed.


-- 
Peter Geoghegan




Re: [HACKERS] Is anyone aware of data loss causing MultiXact bugs in 9.3.2?

2014-02-19 Thread Andres Freund
On 2014-02-19 00:55:03 -0800, Peter Geoghegan wrote:
 On Wed, Feb 19, 2014 at 12:40 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  Was there an index-only scan or just an index scan? Any chance of a
  corrupted index?
 
 Just an index scan. I think it's unlikely to be a corrupt index,
 because the customer said that he dropped and restored the index, and
 it's difficult to imagine how VACUUM FREEZE could then have impacted a
 plan with only a sequential scan.

I am wondering whether it's possibly either the visibility map (vm) or the
all-visible flag on the page...
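
The page-level flag at least is easy to check from SQL (table name and
block number below are just placeholders):

    -- requires the pageinspect extension; 0x0004 is PD_ALL_VISIBLE
    SELECT (flags & 4) <> 0 AS pd_all_visible
      FROM page_header(get_raw_page('some_table', 0));

The visibility map bit itself isn't directly visible from SQL, so that side
needs a debugger or a look at the relation's _vm fork.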

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Etsuro Fujita

(2014/02/19 12:12), Kyotaro HORIGUCHI wrote:

At Tue, 18 Feb 2014 19:24:50 +0900, Etsuro Fujita wrote

From: Shigeru Hanada [mailto:shigeru.han...@gmail.com]



I'm not sure about allowing ALTER TABLE against a parent table to affect
descendants even when some of them are foreign tables.  I think the rule
should be simple enough for users to understand; of course, it should also
be consistent and backward compatible.


Yeah, understandability is important.  But I think flexibility is
also important.  In other words, I think it is a bit too inflexible
to disallow e.g. SET STORAGE on an inheritance tree that contains
foreign table(s) just because we disallow SET STORAGE on foreign
tables directly.


What use case needs such flexibility badly enough to outweigh the loss
in predictability of ALTER TABLE's behavior and/or in code
maintainability?  I don't agree with the idea that ALTER TABLE
implicitly affects foreign children, for the reasons given upthread.
A feature that is turned on and off through special syntax also seems
to have little hope.


It is just my personal opinion, but I think it would be convenient for
users to alter inheritance trees that contain foreign tables the same
way as inheritance trees that don't, without having to be conscious of
whether an inheritance tree contains foreign tables or not.  Imagine we
have an inheritance tree that contains only plain tables and then add a
foreign table as a child of it.  Without this flexibility, we would have
to switch to a totally different way of altering the structure of the
inheritance tree (e.g., ADD CONSTRAINT) the moment the foreign table is
added.  I don't think that would be easy to use.
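
For illustration, this is the kind of usage I have in mind (names are
hypothetical, and the foreign-table syntax is the one being proposed):

    CREATE TABLE measurement (logdate date, val int);
    CREATE FOREIGN TABLE measurement_remote ()
        INHERITS (measurement) SERVER remote_srv;
    -- with the proposed behavior, this would recurse to the foreign child too:
    ALTER TABLE measurement ADD CONSTRAINT val_nonneg CHECK (val >= 0);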



What I think should be newly allowed to be set on foreign tables is

* ADD table_constraint
* DROP CONSTRAINT


As of 9.3

http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html


Consistency with the foreign server is not checked when a
column is added or removed with ADD COLUMN or DROP COLUMN, a
NOT NULL constraint is added, or a column type is changed with
SET DATA TYPE. It is the user's responsibility to ensure that
the table definition matches the remote side.


So I believe implicit and automatic application of any constraint
to foreign children is considerably dangerous.


We spent a lot of time discussing this issue, and the consensus is that 
it's users' fault if there are some tuples on the remote side violating 
a given constraint, as mentioned in the documentation.



* [NO] INHERIT parent_table


Is this usable for foreign children in an inheritance tree?  NO INHERIT
removes all foreign children, but INHERIT is a no-op.


I didn't express it clearly.  Sorry for that.  Let me explain.

* ALTER FOREIGN TABLE target_table *INHERIT* parent_table: Add the 
target table as a new child of the parent table.
* ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: Remove the 
target table from the list of children of the parent table.
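
In SQL, the two proposed forms would look like this (table names are
hypothetical):

    ALTER FOREIGN TABLE measurement_remote INHERIT measurement;     -- add as a child
    ALTER FOREIGN TABLE measurement_remote NO INHERIT measurement;  -- remove it again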


Thanks,

Best regards,
Etsuro Fujita




Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Emre Hasegeli e...@hasegeli.com writes:
 How about only removing the inet and the cidr operator classes
 from btree_gist. btree-gist-drop-inet-v2.patch does that.

 I'm not sure which part of no you didn't understand, but to be
 clear: you don't get to break existing installations.

 Assuming that this opclass is sufficiently better than the existing one,
 it would sure be nice if it could become the default; but I've not seen
 any proposal in this thread that would allow that without serious upgrade
 problems.  I think the realistic alternatives so far are (1) new opclass
 is not the default, or (2) this patch gets rejected.

 We should probably expend some thought on a general approach to
 replacing the default opclass for a datatype, because I'm sure this
 will come up again.  Right now I don't see a feasible way.

It seems odd for default to be a property of the opclass rather than
a separate knob.  What comes to mind is something like:

ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;

Not sure how much that helps.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Greg Stark
On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Agreed; that was the original plan, but implementation delays
 prevented the whole vision/discussion/implementation. Requirements
 from various areas include WAL rate limiting for replication, I/O rate
 limiting, hard CPU and I/O limits for security and mixed workload
 coexistence.

 I'd still like to get something on this in 9.4 that alleviates the
 replication issues, leaving wider changes for later releases.

My first reaction was that we should just have a generic I/O resource
throttling. I was only convinced this was a reasonable idea by the
replication use case. It would help me to understand the specific
situations where replication breaks down due to WAL bandwidth
starvation. Heroku has had some problems with slaves falling behind,
though the immediate problem that causes is the slave filling up its
disk, which we could solve more directly by switching to archive mode
rather than slowing down the master.

But I would suggest you focus on a specific use case that's
problematic so we can judge better if the implementation is really
fixing it.

 The vacuum_* parameters don't allow any control over WAL production,
 which is often the limiting factor. I could, for example, introduce a
 new parameter for vacuum_cost_delay that provides a weighting for each
 new BLCKSZ chunk of WAL, then rename all parameters to a more general
 form. Or I could forget that and just press ahead with the patch as
 is, providing a cleaner interface in next release.

 It's also interesting to wonder about the relationship to
 CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
 that that's *cheap* (1 test and branch) as long as nothing is pending.
 I don't want to see a bunch of arithmetic added to it.

 Good point.

I think it should be possible to actually merge it into
CHECK_FOR_INTERRUPTS. Have a single global flag
io_done_since_check_for_interrupts which is set to 0 after each
CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
in the normal case.

In fact you could do all the arithmetic when you do the wal write.
Only set the flag if the bandwidth consumed is above the budget. Then
the flag should only ever be set when you're about to sleep.

I would dearly love to see generic I/O bandwidth limits, so it would
be nice to see a nicely general pattern here that could be extended
even if we only target WAL this release.

I'm going to read the existing patch now, do you think it's ready to
go or did you want to do more work based on the feedback?




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 Here are updated patches to use pg_lsn instead of pglsn...
 Should I register this patch somewhere to avoid having it lost in the void?
 Regards,

Well, I committed this, but the buildfarm's deeply unhappy with it.
Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
some platforms...  and I'm not sure what to do about that, right
off-hand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Andres Freund
On 2014-02-19 09:24:03 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  Here are updated patches to use pg_lsn instead of pglsn...
  Should I register this patch somewhere to avoid having it lost in the void?
  Regards,
 
 Well, I committed this, but the buildfarm's deeply unhappy with it.
 Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
 some platforms...  and I'm not sure what to do about that, right
 off-hand.

The relevant bit probably is:

pg_lsn.c: In function 'pg_lsn_out':
pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' 
[-Wimplicit-function-declaration]

GET_8_BYTES only exists for 64bit systems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-19 09:24:03 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 9:32 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  On Thu, Feb 6, 2014 at 11:26 AM, Michael Paquier
  michael.paqu...@gmail.com wrote:
  Here are updated patches to use pg_lsn instead of pglsn...
  Should I register this patch somewhere to avoid having it lost in the void?
  Regards,

 Well, I committed this, but the buildfarm's deeply unhappy with it.
 Apparently the use of GET_8_BYTES() and SET_8_BYTES() is no good on
 some platforms...  and I'm not sure what to do about that, right
 off-hand.

 The relevant bit probably is:

 pg_lsn.c: In function 'pg_lsn_out':
 pg_lsn.c:59:2: warning: implicit declaration of function 'GET_8_BYTES' 
 [-Wimplicit-function-declaration]

 GET_8_BYTES only exists for 64bit systems.

Right, I got that far.  So it looks like float8, int8, timestamp,
timestamptz, and money all have behavior contingent on
USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
since we've already marched pretty far down that path I suppose we
should keep marching.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Do you know the reason for increased max latency due to xlog scaling?

2014-02-19 Thread MauMau

From: Jeff Janes jeff.ja...@gmail.com

I thought that this was the point I was making, not the point I was
missing.  You have the same hard drives you had before, but now due to a
software improvement you are cramming 5 times more stuff through them.
Yeah, you will get bigger latency spikes.  Why wouldn't you?  You are now
beating the snot out of your hard drives, whereas before you were not.

If you need 10,000 TPS, then you need to upgrade to 9.4.  If you need it
with low maximum latency as well, then you probably need to get better IO
hardware as well (maybe not--maybe more tuning could help).  With 9.3 you
didn't need better IO hardware, because you weren't capable of maxing out
what you already had.  With 9.4 you can max it out, and this is a good
thing.

If you need 10,000 TPS but only 2000 TPS are completing under 9.3, then
what is happening to the other 8000 TPS? Whatever is happening to them, it
must be worse than a latency spike.

On the other hand, if you don't need 10,000 TPS, then measuring max
latency at 10,000 TPS is the wrong thing to measure.


Thank you, I've probably got the point --- you mean the hard disk for WAL is
the bottleneck.  But I still wonder a bit why the latency spike became so
much bigger even with the # of clients fewer than the # of CPU cores.  I
suppose the requests get processed more smoothly when the number of
simultaneous requests is small.  Anyway, I want to believe the latency spike
would become significantly smaller on an SSD.


Regards
MauMau






Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Feb 17, 2014 at 3:56 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 We should probably expend some thought on a general approach to
 replacing the default opclass for a datatype, because I'm sure this
 will come up again.  Right now I don't see a feasible way.

 It seems odd for default to be a property of the opclass rather than
 a separate knob.  What comes to mind is something like:
 ALTER TYPE sniffle SET DEFAULT OPERATOR CLASS FOR btree TO achoo_ops;
 Not sure how much that helps.

Not at all, AFAICS.  If it were okay to decide that some formerly-default
opclass is no longer default, then having such a command would be better
than manually manipulating pg_opclass.opcdefault --- but extension upgrade
scripts could certainly do the latter if they had to.  The problem here
is whether it's upgrade-safe to make such a change at all; having
syntactic sugar doesn't make it safer.
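
To be concrete, the manual manipulation would boil down to something like
the following in an upgrade script (opclass names are placeholders, and as
said above, being possible does not make it upgrade-safe):

    UPDATE pg_catalog.pg_opclass SET opcdefault = false
     WHERE opcname = 'old_inet_ops'
       AND opcmethod = (SELECT oid FROM pg_am WHERE amname = 'gist');
    UPDATE pg_catalog.pg_opclass SET opcdefault = true
     WHERE opcname = 'new_inet_ops'
       AND opcmethod = (SELECT oid FROM pg_am WHERE amname = 'gist');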

regards, tom lane




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund and...@2ndquadrant.com wrote:
 GET_8_BYTES only exists for 64bit systems.

 Right, I got that far.  So it looks like float8, int8, timestamp,
 timestamptz, and money all have behavior contingent on
 USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
 since we've already marched pretty far down that path I suppose we
 should keep marching.

You need somebody to help you with getting that working on 32-bit
platforms?  Because it needs to get fixed, or reverted, PDQ.

regards, tom lane




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 10:03 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Wed, Feb 19, 2014 at 9:30 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
 GET_8_BYTES only exists for 64bit systems.

 Right, I got that far.  So it looks like float8, int8, timestamp,
 timestamptz, and money all have behavior contingent on
 USE_FLOAT8_BYVAL, making that symbol a misnomer in every way.  But
 since we've already marched pretty far down that path I suppose we
 should keep marching.

 You need somebody to help you with getting that working on 32-bit
 platforms?  Because it needs to get fixed, or reverted, PDQ.

Hopefully the commit I just pushed will fix it.  It now works on my
machine with and without --disable-float8-byval.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Bernd Helmle



--On 18. Februar 2014 22:23:59 +0200 Heikki Linnakangas hlinn...@iki.fi 
wrote:



I considered it a new feature, so not back-patching was the default. If
you want to back-patch it, I won't object.


That was my original feeling, too, but +1 for backpatching.
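
For reference, the GUC in question is read-only and can simply be checked
with:

    SHOW data_checksums;
    -- or: SELECT current_setting('data_checksums');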

--
Thanks

Bernd




Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread David Fetter
On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
  Add a GUC to report whether data page checksums are enabled.
 
  Is there a reason this wasn't back-patched to 9.3?  I think it should
 be.

+1 for back-patching.

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] WAL Rate Limiting

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 8:28 AM, Greg Stark st...@mit.edu wrote:
 On Mon, Jan 20, 2014 at 5:37 PM, Simon Riggs si...@2ndquadrant.com wrote:

 Agreed; that was the original plan, but implementation delays
 prevented the whole vision/discussion/implementation. Requirements
 from various areas include WAL rate limiting for replication, I/O rate
 limiting, hard CPU and I/O limits for security and mixed workload
 coexistence.

 I'd still like to get something on this in 9.4 that alleviates the
 replication issues, leaving wider changes for later releases.

 My first reaction was that we should just have a generic I/O resource
 throttling. I was only convinced this was a reasonable idea by the
 replication use case. It would help me to understand the specific
 situations where replication breaks down due to WAL bandwidth
 starvation. Heroku has had some problems with slaves falling behind,
 though the immediate problem that causes is the slave filling up its
 disk, which we could solve more directly by switching to archive mode
 rather than slowing down the master.

 But I would suggest you focus on a specific use case that's
 problematic so we can judge better if the implementation is really
 fixing it.

 The vacuum_* parameters don't allow any control over WAL production,
 which is often the limiting factor. I could, for example, introduce a
 new parameter for vacuum_cost_delay that provides a weighting for each
 new BLCKSZ chunk of WAL, then rename all parameters to a more general
 form. Or I could forget that and just press ahead with the patch as
 is, providing a cleaner interface in next release.

 It's also interesting to wonder about the relationship to
 CHECK_FOR_INTERRUPTS --- although I think that currently, we assume
 that that's *cheap* (1 test and branch) as long as nothing is pending.
 I don't want to see a bunch of arithmetic added to it.

 Good point.

 I think it should be possible to actually merge it into
 CHECK_FOR_INTERRUPTS. Have a single global flag
 io_done_since_check_for_interrupts which is set to 0 after each
 CHECK_FOR_INTERRUPTS and set to 1 whenever any wal is written. Then
 CHECK_FOR_INTERRUPTS turns into two tests and branches instead of one
 in the normal case.

 In fact you could do all the arithmetic when you do the wal write.
 Only set the flag if the bandwidth consumed is above the budget. Then
 the flag should only ever be set when you're about to sleep.

 I would dearly love to see generic I/O bandwidth limits, so it would
 be nice to see a nicely general pattern here that could be extended
 even if we only target WAL this release.

 I'm going to read the existing patch now, do you think it's ready to
 go or did you want to do more work based on the feedback?

Well, *I* don't think this is ready to go.  A WAL rate limit that only
limits WAL sometimes still doesn't impress me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Greg Stark
On Wed, Feb 19, 2014 at 3:11 PM, Robert Haas robertmh...@gmail.com wrote:
 Hopefully the commit I just pushed will fix it.  It now works on my
 machine with and without --disable-float8-byval.

It builds and passes here on 32bits


-- 
greg




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 19, 2014 at 11:03 AM, Greg Stark st...@mit.edu wrote:
 On Wed, Feb 19, 2014 at 3:11 PM, Robert Haas robertmh...@gmail.com wrote:
 Hopefully the commit I just pushed will fix it.  It now works on my
 machine with and without --disable-float8-byval.

 It builds and passes here on 32bits

The buildfarm seems happy so far, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Description for pg_replslot in docs

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 10:50 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Tue, Feb 18, 2014 at 12:43 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 Description for contents of PGDATA is mentioned at
 following page in docs:
 http://www.postgresql.org/docs/devel/static/storage-file-layout.html

 Isn't it better to have description of pg_replslot in the same
 place?
 Definitely. +1.

Yep, good catch.  Fixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [bug fix] pg_ctl stop times out when it should respond quickly

2014-02-19 Thread Robert Haas
On Mon, Feb 17, 2014 at 11:29 AM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 The pg_regress part is ugly.  However, pg_regress is doing something
 unusual when starting postmaster itself, so the ugly coding to stop it
 seems to match.  If we wanted to avoid the ugliness here, the right fix
 would be to use pg_ctl to start postmaster as well as to stop it.

I wonder if this would change the behavior in cases where we hit ^C
during the regression tests.  Right now I think that kills the
postmaster as well as pg_regress, but if we used pg_ctl, it might not,
because pg_regress uses fork()+exec(), but pg_ctl uses system() to
launch a shell which is in turn instructed to background the
postmaster.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Robert Haas
On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/5/14, 1:31 PM, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.

 If we're going to do that, I suggest pg_lsn rather than pglsn.  We
 already have pg_node_tree, so using underscores for separation would
 be more consistent.

 Yes, that's a good precedent in multiple ways.
 Here are updated patches to use pg_lsn instead of pglsn...

OK, so I think this stuff is all committed now, with assorted changes.
 Thanks for your work on this.
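
For anyone who wants a quick look at the new type, a trivial illustration
(the LSN values are arbitrary):

    SELECT '0/16B3758'::pg_lsn;
    SELECT '0/16B3758'::pg_lsn < '0/16C0000'::pg_lsn;  -- ordinary comparisons work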

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Sat, Feb 15, 2014 at 6:59 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  [ new patches ]

 0001 already needs minor

 Hm?

 If there are conflicts, I'll push/send a rebased tomorrow or monday.

As you guessed, the missing word was rebasing.  It's a trivial
conflict though, so please don't feel the need to repost just for
that.

 +* the transaction's invalidations. This currently won't be needed if
 +* we're just skipping over the transaction, since that currently only
 +* happens when starting decoding, after we invalidated all caches, but
 +* transactions in other databases might have touched shared relations.

 I'm having trouble understanding what this means, especially the part
 after the but.

 Let me rephrase, maybe that will help:

 This currently won't be needed if we're just skipping over the
 transaction because currently we only do so during startup, to get to
 the first transaction the client needs. As we have reset the catalog
 caches before starting to read WAL and we haven't yet touched any
 catalogs, there can't be anything to invalidate.

 But if we're forgetting this commit because it happened in
 another database, the invalidations might be important, because they
 could be for shared catalogs and we might have loaded data into the
 relevant syscaches.

 Better?

Much!  Please include that text, or something like it.

 +   if (IsTransactionState() &&
 +   GetTopTransactionIdIfAny() != InvalidTransactionId)
 +   ereport(ERROR,
 +   (errcode(ERRCODE_ACTIVE_SQL_TRANSACTION),
 +    errmsg("cannot perform changeset extraction in transaction that has performed writes")));

 This is sort of an awkward restriction, as it makes it hard to compose
 this feature with others.  What underlies the restriction, can we get
 rid of it, and if not can we include a comment here explaining why it
 exists?

 Well, you can write the result into a table if you're halfway
 careful... :)

 I think it should be fairly easy to relax the restriction to creating a
 slot, but not getting data from it. Do you think that would be
 sufficient?

That would be a big improvement, for sure, and might be entirely sufficient.

 +   /* register output plugin name with slot */
 +   strncpy(NameStr(slot->data.plugin), plugin,
 +   NAMEDATALEN);
 +   NameStr(slot->data.plugin)[NAMEDATALEN - 1] = '\0';

 If it's safe to do this without a lock, I don't know why.

 Well, the worst that could happen is that somebody else doing a SELECT *
 FROM pg_replication_slot gets an incomplete plugin name... But we
 certainly can hold the spinlock during it if you think that's better.

Isn't the worst thing that can happen that they copy garbage out of
the buffer, because the new name is longer than the old and only
partially written?

 More broadly, I wonder why this is_init stuff is in here at all.
 Maybe the stuff that happens in the is_init case should be done in the
 caller, or another helper function.

 The reason I moved it in there is that after the walsender patch there
 are two callers and the stuff is sufficiently complex that having it
 twice led to bugs.
 The reason it's currently the same function is that so much of the code
 would have to be shared that I found it ugly to split. I'll have a look
 whether I can figure something out.

I was thinking that the is_init portion could perhaps be done first,
in a *previous* function call, and adjusted in such a way that the
non-is-init path can be used for both cases here.

 I don't think this is a very good idea.  The problem with doing things
 during error recovery that can themselves fail is that you'll lose the
 original error, which is not cool, and maybe even blow out the error
 stack.  Many people have confused PG_TRY()/PG_CATCH() with an
 exception-handling system, but it's not.  One way to fix this is to
 put some of the initialization logic in ReplicationSlotCreate() just
 prior to calling CreateSlotOnDisk().  If the work that needs to be
 done is too complex or protracted to be done there, then I think that
 it should be pulled out of the act of creating the replication slot
 and made to happen as part of first use, or as a separate operation
 like PrepareForLogicalDecoding.

 I think what should be done here is adding a drop_on_release flag. As
 soon as everything important is done, it gets unset.

That might be more elegant, but I don't think it really fixes
anything, because backing stuff out from on disk can fail.  AIUI, your
whole concern here is that you don't want the slot creation to fail
halfway through and leave behind the slot, but what you've got here
doesn't prevent that; it just makes it less likely.  The more I think
about it, 

Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Sun, Feb 16, 2014 at 12:12 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-15 17:29:04 -0500, Robert Haas wrote:
 On Fri, Feb 14, 2014 at 4:55 AM, Andres Freund and...@2ndquadrant.com 
 wrote:
  [ new patches ]

 0001 already needs minor

 + * copied stuff from tuptoaster.c. Perhaps there should be toast_internal.h?

 Yes, please.  If you can submit a separate patch creating this file
 and relocating this stuff there, I will commit it.

 I started to work on that, but I am not sure we actually need it
 anymore. tuptoaster.h isn't included in that many places, so perhaps we
 should just add it there?

That seems fine to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Alvaro Herrera
Pavel Stehule wrote:

  7) Why do the functions accept only the timezone abbreviation, not the
 full name? I find it rather confusing, because the 'timezone' option
 uses the full name, and we're using this as the default. But doing
 'show timestamp' and using the returned value fails. Is it possible
 to fix this somehow?
 
  Only an abbreviation is allowed for the timetz type. Timestamp can work
  with full time zone names. The rules (behavior) should be the same as for
  the input functions of the types timestamptz and timetz.
 
 postgres=# select '10:10:10 CET'::timetz;
timetz
 ─
  10:10:10+01
 (1 row)
 
 postgres=# select '10:10:10 Europe/Prague'::timetz;
 ERROR:  invalid input syntax for type time with time zone: 10:10:10
 Europe/Prague
 LINE 1: select '10:10:10 Europe/Prague'::timetz;
^
 
  This limit is due to the limits of the routines used.

I think this is a strange limitation, and perhaps it should be fixed
rather than inflicting the limitation on the new function.

I tweaked your patch a bit, attached; other than defining what to do
about full TZ names in timetz, this seems ready to commit.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index be548d7..69eb45f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6725,6 +6725,32 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
    <row>
     <entry>
      <indexterm>
+      <primary>make_interval</primary>
+     </indexterm>
+     <literal>
+      <function>
+       make_interval(<parameter>years</parameter> <type>int</type> DEFAULT 0,
+       <parameter>months</parameter> <type>int</type> DEFAULT 0,
+       <parameter>weeks</parameter> <type>int</type> DEFAULT 0,
+       <parameter>days</parameter> <type>int</type> DEFAULT 0,
+       <parameter>hours</parameter> <type>int</type> DEFAULT 0,
+       <parameter>mins</parameter> <type>int</type> DEFAULT 0,
+       <parameter>secs</parameter> <type>double precision</type> DEFAULT 0.0)
+      </function>
+     </literal>
+    </entry>
+    <entry><type>interval</type></entry>
+    <entry>
+     Create interval from years, months, weeks, days, hours, minutes and
+     seconds fields
+    </entry>
+    <entry><literal>make_interval(days := 10)</literal></entry>
+    <entry><literal>10 days</literal></entry>
+   </row>
+
+   <row>
+    <entry>
+     <indexterm>
       <primary>make_time</primary>
      </indexterm>
      <literal>
@@ -6746,6 +6772,81 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
    <row>
     <entry>
      <indexterm>
+      <primary>make_timetz</primary>
+     </indexterm>
+     <literal>
+      <function>
+       make_timetz(<parameter>hour</parameter> <type>int</type>,
+       <parameter>min</parameter> <type>int</type>,
+       <parameter>sec</parameter> <type>double precision</type>,
+       <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+      </function>
+     </literal>
+    </entry>
+    <entry><type>time with time zone</type></entry>
+    <entry>
+     Create time with time zone from hour, minute and seconds fields. When
+     <parameter>timezone</parameter> is not specified, then current time zone
+     is used.
+    </entry>
+    <entry><literal>make_timetz(8, 15, 23.5)</literal></entry>
+    <entry><literal>08:15:23.5+01</literal></entry>
+   </row>
+
+   <row>
+    <entry>
+     <indexterm>
+      <primary>make_timestamp</primary>
+     </indexterm>
+     <literal>
+      <function>
+       make_timestamp(<parameter>year</parameter> <type>int</type>,
+       <parameter>month</parameter> <type>int</type>,
+       <parameter>day</parameter> <type>int</type>,
+       <parameter>hour</parameter> <type>int</type>,
+       <parameter>min</parameter> <type>int</type>,
+       <parameter>sec</parameter> <type>double precision</type>)
+      </function>
+     </literal>
+    </entry>
+    <entry><type>timestamp</type></entry>
+    <entry>
+     Create timestamp from year, month, day, hour, minute and seconds fields
+    </entry>
+    <entry><literal>make_timestamp(2013, 7, 15, 8, 15, 23.5)</literal></entry>
+    <entry><literal>2013-07-15 08:15:23.5</literal></entry>
+   </row>
+
+   <row>
+    <entry>
+     <indexterm>
+      <primary>make_timestamptz</primary>
+     </indexterm>
+     <literal>
+      <function>
+       make_timestamptz(<parameter>year</parameter> <type>int</type>,
+       <parameter>month</parameter> <type>int</type>,
+       <parameter>day</parameter> <type>int</type>,
+       <parameter>hour</parameter> <type>int</type>,
+       <parameter>min</parameter> <type>int</type>,
+       <parameter>sec</parameter> <type>double precision</type>,
+       <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+      </function>
+     </literal>
+    </entry>
+    <entry><type>timestamp with time zone</type></entry>
+    <entry>
+     Create timestamp with time zone from year, 

Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Tue, Feb 18, 2014 at 4:07 AM, Andres Freund and...@2ndquadrant.com wrote:
 2. I think the snapshot-export code is fundamentally misdesigned.  As
 I said before, the idea that we're going to export one single snapshot
 at one particular point in time strikes me as extremely short-sighted.

 I don't think so. It's precisely what you need to implement a simple
 replication solution. Yes, there are usecases that could benefit from
 more possibilities, but that's always the case.

  For example, consider one-to-many replication where clients may join
 or depart the replication group at any time.  Whenever somebody joins,
 we just want a snapshot, LSN pair such that they can apply all
 changes after the LSN except for XIDs that would have been visible to
 the snapshot.

 And? They need to create individual replication slots, which each will
 get a snapshot.

So we have to wait for startup N times, and transmit the change stream
N times, instead of once?  Blech.

 And in fact, we don't even need any special machinery
 for that; the client can just make a connection and *take a snapshot*
 once decoding is initialized enough.

 No, they can't. Two reasons: For one the commit order between snapshots
 and WAL isn't necessarily the same.

So what?

 For another, clients now need logic
 to detect whether a transaction's contents have already been applied or
 not yet; that's nontrivial.

My point is, I think we should be trying to *make* that trivial,
rather than doing this.

 3. As this feature is proposed, the only plugin we'll ship with 9.4 is
 a test_decoding plugin which, as its own documentation says, doesn't
 do anything especially useful.  What exactly do we gain by forcing
 users who want to make use of these new capabilities to write C code?

 It gains us an output plugin in which we can easily demonstrate
 features so they can be tested in the regression tests, which I find
 rather important.
 Just like e.g. the test_shm_mq stuff doesn't do anything really useful.

It definitely doesn't, but this patch is a lot closer to being done
than parallel query is, so I'm not sure it's a fair comparison.

 The test_decoding plugin doesn't seem tremendously
 much simpler than something that someone could actually use, so why
 not make that the goal?

 For one, it being a designated toy plugin allows us to easily change it
 to showcase/test new features. For another, I still don't agree that
 it's easy to agree on an output format. I think we should include some
 that have matured in 9.5.

I regret that more effort has not been made in that area.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Draft release notes up for review

2014-02-19 Thread Marti Raudsepp
On Sun, Feb 16, 2014 at 10:41 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Any comments before I start transposing them into the back branches?

Sorry I'm late.

 Shore up GRANT ... WITH ADMIN OPTION restrictions (Noah Misch)

I'm not familiar with the phrase "shore up"; I think it should use
more precise language: are the privilege checks getting more strict or
less strict?



Wow, there are quite a lot of items this time. Have you considered
grouping the items by their impact, for example security/data
corruption/crash/correctness/other? I think that would make it easier
for readers to find items they're interested in. Most changes seem
pretty straightforward to categorize; there are always outliers, but
even if a few items are miscategorized, that's an improvement over
what we have now. Of course someone has to  be willing to do that work.

If this warrants more discussion, I can draft out a proposal in a new topic.

Regards,
Marti




Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Robert Haas
On Tue, Feb 18, 2014 at 4:33 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-17 21:35:23 -0500, Robert Haas wrote:
 What
 I don't understand is why we're not taking the test_decoding module,
 polishing it up a little to produce some nice, easily
 machine-parseable output, calling it basic_decoding, and shipping
 that.  Then people who want something else can build it, but people
 who are happy with something basic will already have it.

 Because every project is going to need their own plugin
 *anyway*. Londiste and Slony are surely going to ignore changes to
 relations they don't need, querying their own metadata. They will want
 compatibility with their earlier formats as far as possible. Sometime not
 too far away they will want to optionally support binary output because
 it's so much faster.
 There's just not much chance that either of these will be able to agree
 on a format short term.

Ah, so part of what you're expecting the output plugin to do is
filtering.  I can certainly see where there might be considerable
variation between solutions in that area - but I think that's separate
from the question of formatting per se.  Although I think we should
have an in-core output plugin with filtering capabilities eventually,
I'm happy to define that as out of scope for 9.4.  But isn't there a
way that we can ship something that will do for people who want to
just see the database's entire change stream float by?

TBH, as compared to what you've got now, I think this mostly boils
down to a question of quoting and escaping.  I'm not really concerned
with whether we ship something that's perfectly efficient, or that has
filtering capabilities, or that has a lot of fancy bells and whistles.
 What I *am* concerned about is that if the user updates a text field
that contains characters like " or ' or : or [ or ] or , that somebody
might be using as delimiters in the output format, that a program can
still parse that output format and reliably determine what the actual
change was.  I don't care all that much whether we use JSON or CSV or
something custom, but the data that gets spit out should not have
SQL-injection-like vulnerabilities.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PoC: Partial sort

2014-02-19 Thread Marti Raudsepp
On Wed, Feb 12, 2014 at 11:54 PM, Marti Raudsepp ma...@juffo.org wrote:
 With partial-sort-basic-1 and this fix on the same test suite, the
 planner overhead is now a more manageable 0.5% to 1.3%; one test is
 faster by 0.5%.

Ping, Robert or anyone, does this overhead seem bearable or is that
still too much?

Do these numbers look conclusive enough or should I run more tests?

 I think the 1st patch now has a bug in initial_cost_mergejoin; you
 still pass the presorted_keys argument to cost_sort, making it
 calculate a partial sort cost

Ping, Alexander?

Regards,
Marti




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
2014-02-19 19:01 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:

   7) Why do the functions accept only the timezone abbreviation, not the
  full name? I find it rather confusing, because the 'timezone' option
  uses the full name, and we're using this as the default. But doing
  'show timestamp' and using the returned value fails. Is it possible
  to fix this somehow?
 
   Only an abbreviation is allowed for the timetz type. Timestamp can work
   with full time zone names. The rules (behavior) should be the same as for
   the input functions of the types timestamptz and timetz.
 
  postgres=# select '10:10:10 CET'::timetz;
 timetz
  ─
   10:10:10+01
  (1 row)
 
  postgres=# select '10:10:10 Europe/Prague'::timetz;
  ERROR:  invalid input syntax for type time with time zone: 10:10:10
  Europe/Prague
  LINE 1: select '10:10:10 Europe/Prague'::timetz;
 ^
 
   This limit is due to the limits of the routines used.

 I think this is a strange limitation, and perhaps it should be fixed
 rather than inflicting the limitation on the new function.

 I tweaked your patch a bit, attached; other than defining what to do
 about full TZ names in timetz, this seems ready to commit.


I have not a objection - thank you

Pavel



 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Changeset Extraction v7.6.1

2014-02-19 Thread Euler Taveira
On 18-02-2014 06:33, Andres Freund wrote:
 I really hope there will be nicer ones by the time 9.4 is
 released. Euler did send in a json plugin
 http://archives.postgresql.org/message-id/52A5BFAE.1040209%2540timbira.com.br
 , but there hasn't been much feedback yet. It's hard to start discussing
 something that needs a couple of patches to pg before you can develop
 your own patch...
 
BTW, I've updated that code to reflect the recent changes in the API and
published it at [1]. This version is based on Andres' branch
xlog-decoding-rebasing-remapping. I'll continue to polish this code.


Regards,


[1] https://github.com/eulerto/wal2json


-- 
   Euler Taveira   Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
2014-02-19 19:01 GMT+01:00 Alvaro Herrera alvhe...@2ndquadrant.com:

 Pavel Stehule wrote:

   7) Why do the functions accept only the timezone abbreviation, not the
  full name? I find it rather confusing, because the 'timezone' option
  uses the full name, and we're using this as the default. But doing
  'show timestamp' and using the returned value fails. Is it possible
  to fix this somehow?
 
   Only an abbreviation is allowed for the timetz type. Timestamp can work
   with full time zone names. The rules (behavior) should be the same as for
   the input functions of the types timestamptz and timetz.
 
  postgres=# select '10:10:10 CET'::timetz;
 timetz
  ─
   10:10:10+01
  (1 row)
 
  postgres=# select '10:10:10 Europe/Prague'::timetz;
  ERROR:  invalid input syntax for type time with time zone: 10:10:10
  Europe/Prague
  LINE 1: select '10:10:10 Europe/Prague'::timetz;
 ^
 
   This limit is due to the limits of the routines used.

 I think this is a strange limitation, and perhaps it should be fixed
 rather than inflicting the limitation on the new function.


I thought about it, and now I am thinking that a timezone in the format
'Europe/Prague' is ambiguous when combined with a time.

We can do it, but we have to expect that the calculation will be related to
the current date - and I am not sure that is correct, because someone can
write something like

make_date(x,x,x) + make_timetz(..) - and the result will be damaged.




 I tweaked your patch a bit, attached; other than defining what to do
 about full TZ names in timetz, this seems ready to commit.

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Alvaro Herrera
Pavel Stehule wrote:

  I thought about it, and now I am thinking that a timezone in the format
  'Europe/Prague' is ambiguous when combined with a time.

  We can do it, but we have to expect that the calculation will be related to
  the current date - and I am not sure that is correct, because someone can
  write something like

  make_date(x,x,x) + make_timetz(..) - and the result will be damaged.

Hmm, I see your point --- the make_timetz() call would use today's
timezone displacement, which might be different from the one used in the
make_date() result.  That would result in a botched timestamptz
sometimes, but it might escape testing because it's subtle and depends
on the input data.
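
To make that concrete with a worked example (zone and values illustrative;
make_date() is the function from this patch, and the timetz literal stands
in for a make_timetz() result computed with a winter +01 displacement):

    SET timezone = 'Europe/Prague';
    SELECT make_date(2014, 7, 15) + '10:10:10+01'::timetz;
    -- gives 2014-07-15 11:10:10+02: the +01 displacement is kept even though
    -- the date falls in DST, an hour away from the 10:10 local time intended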

However, your proposal is to use an abbreviation timezone, thereby
forcing the user to select the correct timezone i.e. the one that
matches the make_date() arguments.  I'm not sure this is much of an
improvement, because then the user is faced with the difficult problem
of figuring out the correct abbreviation in the first place.

I think there is little we can do to solve the problem at this level; it
seems to me that the right solution here is to instruct users to use
make_date() only in conjunction with make_time(), that is, produce a
timezone-less timestamp; and then apply an AT TIME ZONE operator to the
result.  That could take a full timezone name, and that would always
work correctly.
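
For example (illustrative values):

    SELECT (make_date(2014, 7, 15) + make_time(10, 10, 10.5))
           AT TIME ZONE 'Europe/Prague';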

My conclusion here is that the time with time zone datatype is broken
in itself, because of this kind of ambiguity.  Maybe we should just
avoid offering more functionality on top of it, that is get rid of
make_timetz() in this patch?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Pavel Stehule
On 19. 2. 2014 21:20 Alvaro Herrera alvhe...@2ndquadrant.com wrote:

 Pavel Stehule wrote:

   I thought about it, and now I am thinking that a timezone in the format
   'Europe/Prague' is ambiguous when combined with a time.

   We can do it, but we have to expect that the calculation will be related to
   the current date - and I am not sure that is correct, because someone can
   write something like

   make_date(x,x,x) + make_timetz(..) - and the result will be damaged.

 Hmm, I see your point --- the make_timetz() call would use today's
 timezone displacement, which might be different from the one used in the
 make_date() result.  That would result in a botched timestamptz
 sometimes, but it might escape testing because it's subtle and depends
 on the input data.

 However, your proposal is to use an abbreviation timezone, thereby
 forcing the user to select the correct timezone i.e. the one that
 matches the make_date() arguments.  I'm not sure this is much of an
 improvement, because then the user is faced with the difficult problem
 of figuring out the correct abbreviation in the first place.

 I think there is little we can do to solve the problem at this level; it
 seems to me that the right solution here is to instruct users to use
 make_date() only in conjunction with make_time(), that is, produce a
  timezone-less timestamp; and then apply an AT TIME ZONE operator to the
 result.  That could take a full timezone name, and that would always
 work correctly.

 My conclusion here is that the time with time zone datatype is broken
 in itself, because of this kind of ambiguity.  Maybe we should just
 avoid offering more functionality on top of it, that is get rid of
 make_timetz() in this patch?


+1

Pavel
 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Fwd: [HACKERS] patch: make_timestamp function

2014-02-19 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 My conclusion here is that the time with time zone datatype is broken
 in itself, because of this kind of ambiguity.

That's the conclusion that's been arrived at by pretty much everybody
who's looked at it with any care.

 Maybe we should just
 avoid offering more functionality on top of it, that is get rid of
 make_timetz() in this patch?

+1.  We don't need to encourage people to use that type.

regards, tom lane




[HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Neil Thombre
I was trying to understand  (and then perhaps mimic) how pg_standby does a
fast failover.

My current understanding is that when a secondary db is in standby mode, it
will exhaust all the archive log to be replayed from the primary and then
start streaming. It is at this point that xlog.c checks for the existence
of a trigger file to promote the secondary. This has been a cause of some
irritation for some of our customers who do not really care  about catching
up all the way. I want to achieve the exact semantics of pg_standby's fast
failover option.

I manipulated the restore command to return 'failure' when the word fast
is present in the trigger file (see below), hoping that when I want a
secondary database to come out fast, I can just echo the word fast into
the trigger file thereby simulating pg_standby's fast failover behavior.
However, that did not work. Technically, I did not truncate the trigger file
the way pg_standby does.

New restore_command = ! fgrep -qsi fast trigger_file && Old
restore_command


And that is where I have a question. I noticed that in pg_standby.c when we
detect the word fast in the trigger file we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not upsetting the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command.  So, what is the reason for truncating the fast
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we upset the server if we don't?


Assuming this question is answered and I get a better understanding, I have
a follow-up question. If truncation is indeed necessary, can I simulate
the truncation by manipulating restore_command and achieve the same effect
as a fast failover in pg_standby?



Thanks in advance for the help.

Neil


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Emre Hasegeli
2014-02-19 16:52, Tom Lane t...@sss.pgh.pa.us:
 Not at all, AFAICS.  If it were okay to decide that some formerly-default
 opclass is no longer default, then having such a command would be better
 than manually manipulating pg_opclass.opcdefault --- but extension upgrade
 scripts could certainly do the latter if they had to.  The problem here
 is whether it's upgrade-safe to make such a change at all; having
 syntactic sugar doesn't make it safer.

It is not hard to support the != operator in the new operator class. That
would make it functionally compatible with the ones in btree_gist. That way,
changing the default would not break pg_dump dumps; it would only upgrade
the index to the new one.

pg_upgrade dumps the current objects in the extension. It fails to restore
them if the new operator class exists as the default. I do not really
understand how pg_upgrade handles extension upgrades. I do not have a
solution to this.

It would be sad if we cannot make the new operator class the default because
of the btree_gist implementation, which is not useful for the inet data
types. You wrote about it on 2010-10-11 [1]:

 Well, actually the btree_gist implementation for inet is a completely
 broken piece of junk: it thinks that convert_network_to_scalar is 100%
 trustworthy and can be used as a substitute for the real comparison
 functions, which isn't even approximately true.  I'm not sure why
 Teodor implemented it like that instead of using the type's real
 comparison functions, but it's pretty much useless if you want the
 same sort order that the type itself defines.

[1] http://www.postgresql.org/message-id/8973.1286841...@sss.pgh.pa.us


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_standby: Question about truncation of trigger file in fast failover

2014-02-19 Thread Heikki Linnakangas

On 02/19/2014 11:15 PM, Neil Thombre wrote:

And that is where I have a question. I noticed that in pg_standby.c when we
detect the word fast in the trigger file we truncate the file.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L456

There is also a comment above it about not upsetting the server.

https://github.com/postgres/postgres/blob/REL9_1_11/contrib/pg_standby/pg_standby.c#L454

What is the purpose of truncating the file? To do a smart failover once you
come out of standby? But, when I look at xlog.c, when we come out of
standby due to a failure returned by restore_command, we call
CheckForStandbyTrigger() here:

https://github.com/postgres/postgres/blob/REL9_1_11/src/backend/access/transam/xlog.c#L10441

Now, CheckForStandbyTrigger() unlinks the trigger file. I noticed through
the debugger that the unlinking happens before xlog.c makes a call to the
next restore_command.  So, what is the reason for truncating the fast
word from the trigger file if the file is going to be deleted soon after it
is discovered? How will we upset the server if we don't?


At end-of-recovery, the server will fetch again the last WAL file that
was replayed. If it can no longer find it, because restore_command now
returns an error even though it succeeded for the same file a few seconds
earlier, it will throw an error and refuse to start up.


That's the way it used to be until 9.2, anyway. In 9.2, the behavior was
changed so that the server keeps all the files restored from the archive in
pg_xlog, where it can access them again. I haven't tried, but it's possible
that the truncation is no longer necessary. Try it with 9.1 and 9.3, and see
what happens.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GiST support for inet datatypes

2014-02-19 Thread Tom Lane
Emre Hasegeli e...@hasegeli.com writes:
 [ cites bug #5705 ]

Hm.  I had forgotten how thoroughly broken btree_gist's inet and cidr
opclasses are.  There was discussion at the time of just ripping them
out despite the compatibility hit.  We didn't do it, but if we had
then life would be simpler now.

Perhaps it would be acceptable to drop the btree_gist implementation
and teach pg_upgrade to refuse to upgrade if the old database contains
any such indexes.  I'm not sure that solves the problem, though, because
I think pg_upgrade will still fail if the opclass is in the old database
even though unused (because you'll get the complaint about multiple
default opclasses anyway).  The only simple way to avoid that is to not
have btree_gist loaded at all in the old DB, and I doubt that's an
acceptable upgrade restriction.  It would require dropping indexes of
the other types supported by btree_gist, which would be a pretty hard
sell since those aren't broken.

Really what we'd need here is for btree_gist to be upgraded to a version
that doesn't define gist_inet_ops (or at least doesn't mark it as default)
*before* pg_upgrading to a server version containing the proposed new
implementation.  Not sure how workable that is.  Could we get away with
having pg_upgrade unset the default flag in the old database?
(Ick, but there are no very good alternatives here ...)

BTW, I'd not been paying any attention to this thread, but now that
I scan through it, I think the proposal to change the longstanding
names of the inet operators has zero chance of being accepted.
Consistency with the names chosen for range operators is a mighty
weak argument compared to backwards compatibility.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Andres Freund
On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Yes, that's a good precedent in multiple ways.
  Here are updated patches to use pg_lsn instead of pglsn...
 
 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.

cool, thanks you two.

I wonder if pg_stat_replication shouldn't be updated to use it as well?
SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
that as names that are possible candidates for conversion.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
Hi,

I want to propose a new feature called priority table or cache table.
This is the same as a regular table, except that the pages of these tables
have higher priority than those of normal tables. These tables are very
useful where faster query processing on some particular tables is expected.

The same faster query processing can be achieved by placing the tables in a
tablespace on a RAM disk. In that case there is a problem of data loss on
system shutdown; to avoid this, continuous backup of that tablespace and of
the WAL files is required. The priority table feature will solve these
problems while providing similar functionality.

The user needs to decide carefully which tables require faster access and
so can be declared as priority tables; these tables should also be small,
in both number of columns and size.


New syntax:

create [priority] Table ...;

or

Create Table .. [ buffer_pool = priority | default ];

This adds a new storage parameter, buffer_pool, to specify the type of
buffer pool the table uses.

The same can be extended to indexes as well.
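
For illustration, usage under the storage-parameter form might look like
this (hypothetical syntax that exists only in this proposal; the object
names are made up):

    CREATE TABLE hot_lookup (id int PRIMARY KEY, payload text)
        WITH (buffer_pool = priority);

    CREATE INDEX hot_lookup_payload_idx ON hot_lookup (payload)
        WITH (buffer_pool = priority);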


Solution -1:

This solution may not be a proper one, but it is simple. While placing
these tables' pages into the buffer pool, the usage count is set to double
the maximum buffer usage count, instead of 1 as for normal tables. Because
of this there is less chance of these pages being moved out of the buffer
pool. Queries that operate on these tables will be faster because of less
I/O. If the tables are not used for a long time, then only the first query
on the table will be slower and the rest of the queries will be faster.

Just for a test, a new bool member can be added to the RELFILENODE structure
to indicate whether the table is a priority table or not. Using this, the
usage count can be modified while loading the page.

The pg_buffercache output of a priority table:

postgres=# select * from pg_buffercache where relfilenode=16385;
 bufferid | relfilenode | reltablespace | reldatabase | relforknumber | relblocknumber | isdirty | usagecount
----------+-------------+---------------+-------------+---------------+----------------+---------+------------
      270 |       16385 |          1663 |       12831 |             0 |              0 | t       |         10


Solution - 2:

Keep an extra flag in the buffer to indicate whether the buffer is used
for a priority table or not. Using this flag, when replacing a buffer used
for a priority table, some extra steps need to be taken, such as:
1. Only another page of a priority table can replace this priority page.
2. Only after at least two complete cycles of the clock sweep can a normal
table page replace it.

In this case the priority buffers stay in memory for a long time, similar
to solution-1, but that is not always guaranteed.


Solution - 3:

Create another buffer pool, called the priority buffer pool, similar to the
shared buffer pool, to hold the priority table pages. A new GUC parameter
called priority_buffers can be added to get the priority buffer pool size
from the user. The maximum limit of these buffers can be kept to a small
value so that it is used properly.

As an extra precaution, whenever any page needs to move out of the priority
buffer pool a warning is issued, so that the user can check whether the
configured priority_buffers size is too small or the priority tables have
grown more than expected.

In this case all the pages are always loaded into memory, and thus the
queries get faster processing.

IBM DB2 has the facility of creating additional buffer pools and assigning
specific tables and indexes to them. Oracle also has a facility to specify
the buffer pool option as keep or recycle.

I prefer syntax-2 and solution-3. Please provide your
suggestions/improvements.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
 (2014/02/12 3:03), Tom Lane wrote:
 Also, the only remaining usage of dllwrap is in src/bin/pgevent/Makefile.
 Do we need that either?

 Sorry for the late reply.
 Attached is a patch to remove dllwrap from pgevent/Makefile.

Pushed, thanks.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Tom Lane
Haribabu Kommi kommi.harib...@gmail.com writes:
 I want to propose a new feature called priority table or cache table.
 This is same as regular table except the pages of these tables are having
 high priority than normal tables. These tables are very useful, where a
 faster query processing on some particular tables is expected.

Why exactly does the existing LRU behavior of shared buffers not do
what you need?

I am really dubious that letting DBAs manage buffers is going to be
an improvement over automatic management.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/5/14, 1:31 PM, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.

 If we're going to do that, I suggest pg_lsn rather than pglsn.  We
 already have pg_node_tree, so using underscores for separation would
 be more consistent.

 Yes, that's a good precedent in multiple ways.
 Here are updated patches to use pg_lsn instead of pglsn...

 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.
Thanks!
Oops, it looks like I am coming in after the battle (the time difference does
not help). I'll be more careful to test such patches on 32-bit platforms
as well in the future.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 1:01 AM, David Fetter da...@fetter.org wrote:
 On Tue, Feb 18, 2014 at 04:39:27PM -0300, Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
  Add a GUC to report whether data page checksums are enabled.

 Is there a reason this wasn't back-patched to 9.3?  I think it should
 be.

 +1 for back-patching.
Back-patching would be interesting for existing applications, but -1
as it is a new feature :)
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 Haribabu Kommi kommi.harib...@gmail.com writes:
  I want to propose a new feature called priority table or cache table.
  This is same as regular table except the pages of these tables are having
  high priority than normal tables. These tables are very useful, where a
  faster query processing on some particular tables is expected.

 Why exactly does the existing LRU behavior of shared buffers not do
 what you need?


Let's assume a database has 3 tables, which are accessed regularly, and the
user expects faster query results on one of them. Because of the LRU
behavior, that is sometimes not happening. So if we just separate that
table's pages into another buffer pool, then all the pages of that table
reside in memory and its queries get faster processing.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 9:43 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Thu, Feb 20, 2014 at 2:47 AM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
 On Thu, Feb 6, 2014 at 3:48 AM, Peter Eisentraut pete...@gmx.net wrote:
 On 2/5/14, 1:31 PM, Robert Haas wrote:
 On Tue, Feb 4, 2014 at 3:26 PM, Peter Eisentraut pete...@gmx.net wrote:
 Perhaps this type should be called pglsn, since it's an
 implementation-specific detail and not a universal concept like int,
 point, or uuid.

 If we're going to do that, I suggest pg_lsn rather than pglsn.  We
 already have pg_node_tree, so using underscores for separation would
 be more consistent.

 Yes, that's a good precedent in multiple ways.
 Here are updated patches to use pg_lsn instead of pglsn...

 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.
 Thanks!
 Oops, it looks like I am coming after the battle (time difference does
 not help). I'll be more careful to test such patches on 32b platforms
 as well in the future.
After re-reading the code, I found two incorrect comments in the new
regression tests. Patch fixing them is attached.
Thanks,
-- 
Michael
diff --git a/src/backend/utils/adt/pg_lsn.c b/src/backend/utils/adt/pg_lsn.c
index e2b528a..fae12e1 100644
--- a/src/backend/utils/adt/pg_lsn.c
+++ b/src/backend/utils/adt/pg_lsn.c
@@ -19,7 +19,7 @@
 #include utils/pg_lsn.h
 
 #define MAXPG_LSNLEN			17
-#define MAXPG_LSNCOMPONENT	8
+#define MAXPG_LSNCOMPONENT		8
 
 /*--
  * Formatting and conversion routines.
diff --git a/src/test/regress/expected/pg_lsn.out b/src/test/regress/expected/pg_lsn.out
index 01d2983..504768c 100644
--- a/src/test/regress/expected/pg_lsn.out
+++ b/src/test/regress/expected/pg_lsn.out
@@ -52,13 +52,13 @@ SELECT '0/16AE7F8'  pg_lsn '0/16AE7F7';
  t
 (1 row)
 
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
  ?column? 
 --
-1
 (1 row)
 
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;
  ?column? 
 --
 1
diff --git a/src/test/regress/sql/pg_lsn.sql b/src/test/regress/sql/pg_lsn.sql
index dddafb3..1634d37 100644
--- a/src/test/regress/sql/pg_lsn.sql
+++ b/src/test/regress/sql/pg_lsn.sql
@@ -21,5 +21,5 @@ SELECT '0/16AE7F8' = '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8'::pg_lsn != '0/16AE7F7';
 SELECT '0/16AE7F7'  '0/16AE7F8'::pg_lsn;
 SELECT '0/16AE7F8'  pg_lsn '0/16AE7F7';
-SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn; -- No negative results
-SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn; -- correct
+SELECT '0/16AE7F7'::pg_lsn - '0/16AE7F8'::pg_lsn;
+SELECT '0/16AE7F8'::pg_lsn - '0/16AE7F7'::pg_lsn;

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-19 Thread Tom Lane
I wrote:
 The minimum-refactoring solution to this would be to tweak
 pg_do_encoding_conversion() so that if the src_encoding is SQL_ASCII but
 the dest_encoding isn't, it does pg_verify_mbstr() rather than nothing.

 I'm not sure if this would break anything we need to have work,
 though.  Thoughts?  Do we want to back-patch such a change?

 I looked through all the callers of pg_do_encoding_conversion(), and
 AFAICS this change is a good idea.  There are a whole bunch of places
 that use pg_do_encoding_conversion() to convert from the database encoding
 to encoding X (most usually UTF8), and right now if you do that in a
 SQL_ASCII database you have no assurance whatever that what is produced
 is actually valid in encoding X.  I think we need to close that loophole.
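
To make the loophole concrete at the SQL level, here is a sketch (convert()
is just one caller that ends up in pg_do_encoding_conversion(), and the byte
value is only illustrative):

    SELECT convert('\xe9'::bytea, 'SQL_ASCII', 'UTF8');

With the unpatched code this hands back 0xe9 unverified even though it is
not valid UTF8; with the verification added it should instead fail with an
invalid-byte-sequence error.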

The more I looked into mbutils.c, the less happy I got.  The attached
proposed patch takes care of the missing-verification hole in
pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid
of what I believe to be obsolete provisions in pg_do_encoding_conversion
to work if called outside a transaction --- if you consider it working
to completely fail to honor its API contract.  That should no longer be
necessary now that we perform client-server encoding conversions via
perform_default_encoding_conversion rather than here.

For testing I inserted an Assert(IsTransactionState()); at the top of
pg_do_encoding_conversion(), and couldn't trigger it, so I'm fairly sure
this change is OK.  Still, back-patching it might not be a good thing.
On the other hand, if there is still any code path that can get here
outside a transaction, we probably ought to find out rather than allow
completely bogus data to be returned.

I also made some more-cosmetic improvements, notably removing a bunch
of Asserts that are certainly dead code because the relevant variables
are never NULL.

I've not done anything yet about simplifying unnecessary calls of
pg_do_encoding_conversion into pg_server_to_any/pg_any_to_server.

How much of this is back-patch material, do you think?

regards, tom lane

diff --git a/src/backend/utils/mb/mbutils.c b/src/backend/utils/mb/mbutils.c
index 08440e9..7f43cae 100644
*** a/src/backend/utils/mb/mbutils.c
--- b/src/backend/utils/mb/mbutils.c
***
*** 1,10 
! /*
!  * This file contains public functions for conversion between
!  * client encoding and server (database) encoding.
   *
!  * Tatsuo Ishii
   *
!  * src/backend/utils/mb/mbutils.c
   */
  #include postgres.h
  
--- 1,36 
! /*-
   *
!  * mbutils.c
!  *	  This file contains functions for encoding conversion.
   *
!  * The string-conversion functions in this file share some API quirks.
!  * Note the following:
!  *
!  * The functions return a palloc'd, null-terminated string if conversion
!  * is required.  However, if no conversion is performed, the given source
!  * string pointer is returned as-is.
!  *
!  * Although the presence of a length argument means that callers can pass
!  * non-null-terminated strings, care is required because the same string
!  * will be passed back if no conversion occurs.  Such callers *must* check
!  * whether result == src and handle that case differently.
!  *
!  * If the source and destination encodings are the same, the source string
!  * is returned without any verification; it's assumed to be valid data.
!  * If that might not be the case, the caller is responsible for validating
!  * the string using a separate call to pg_verify_mbstr().  Whenever the
!  * source and destination encodings are different, the functions ensure that
!  * the result is validly encoded according to the destination encoding.
!  *
!  *
!  * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
!  * Portions Copyright (c) 1994, Regents of the University of California
!  *
!  *
!  * IDENTIFICATION
!  *	  src/backend/utils/mb/mbutils.c
!  *
!  *-
   */
  #include postgres.h
  
*** InitializeClientEncoding(void)
*** 290,296 
  int
  pg_get_client_encoding(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding-encoding;
  }
  
--- 316,321 
*** pg_get_client_encoding(void)
*** 300,328 
  const char *
  pg_get_client_encoding_name(void)
  {
- 	Assert(ClientEncoding);
  	return ClientEncoding-name;
  }
  
  /*
!  * Apply encoding conversion on src and return it. The encoding
!  * conversion function is chosen from the pg_conversion system catalog
!  * marked as default. If it is not found in the schema search path,
!  * it's taken from pg_catalog schema. If it even is not in the schema,
!  * warn and return src.
!  *
!  * If conversion occurs, a palloc'd null-terminated string is returned.
!  * In the case of no conversion, src is returned.
!  *
!  * CAUTION: although the presence of a length argument means that callers
! 

Re: [HACKERS] should we add a XLogRecPtr/LSN SQL type?

2014-02-19 Thread Michael Paquier
On Thu, Feb 20, 2014 at 8:55 AM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-02-19 12:47:40 -0500, Robert Haas wrote:
 On Wed, Feb 5, 2014 at 9:26 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:
  Yes, that's a good precedent in multiple ways.
  Here are updated patches to use pg_lsn instead of pglsn...

 OK, so I think this stuff is all committed now, with assorted changes.
  Thanks for your work on this.

 cool, thanks you two.

 I wonder if pg_stat_replication shouldn't be updated to use it as well?
 SELECT * FROM pg_attribute WHERE attname ~ '(location|lsn)'; only shows
 that as names that are possible candidates for conversion.
I was sure to have forgotten some views or functions in the previous
patch... Please find attached a patch making pg_stat_replication use
pg_lsn instead of the existing text fields.
Regards,
-- 
Michael
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index a37e6b6..370857a 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -1490,24 +1490,24 @@ postgres: replaceableuser/ replaceabledatabase/ replaceablehost/ re
 /row
 row
  entrystructfieldsent_location//entry
- entrytypetext//entry
+ entrytypepg_lsn//entry
  entryLast transaction log position sent on this connection/entry
 /row
 row
  entrystructfieldwrite_location//entry
- entrytypetext//entry
+ entrytypepg_lsn//entry
  entryLast transaction log position written to disk by this standby
   server/entry
 /row
 row
  entrystructfieldflush_location//entry
- entrytypetext//entry
+ entrytypepg_lsn//entry
  entryLast transaction log position flushed to disk by this standby
   server/entry
 /row
 row
  entrystructfieldreplay_location//entry
- entrytypetext//entry
+ entrytypepg_lsn//entry
  entryLast transaction log position replayed into the database on this
   standby server/entry
 /row
diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c
index 06b22e2..048367a 100644
--- a/src/backend/replication/walsender.c
+++ b/src/backend/replication/walsender.c
@@ -67,6 +67,7 @@
 #include utils/builtins.h
 #include utils/guc.h
 #include utils/memutils.h
+#include utils/pg_lsn.h
 #include utils/ps_status.h
 #include utils/resowner.h
 #include utils/timeout.h
@@ -2137,7 +2138,6 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 	{
 		/* use volatile pointer to prevent code rearrangement */
 		volatile WalSnd *walsnd = WalSndCtl-walsnds[i];
-		char		location[MAXFNAMELEN];
 		XLogRecPtr	sentPtr;
 		XLogRecPtr	write;
 		XLogRecPtr	flush;
@@ -2171,28 +2171,19 @@ pg_stat_get_wal_senders(PG_FUNCTION_ARGS)
 		else
 		{
 			values[1] = CStringGetTextDatum(WalSndGetStateString(state));
-
-			snprintf(location, sizeof(location), %X/%X,
-	 (uint32) (sentPtr  32), (uint32) sentPtr);
-			values[2] = CStringGetTextDatum(location);
+			values[2] = LSNGetDatum(sentPtr);
 
 			if (write == 0)
 nulls[3] = true;
-			snprintf(location, sizeof(location), %X/%X,
-	 (uint32) (write  32), (uint32) write);
-			values[3] = CStringGetTextDatum(location);
+			values[3] = LSNGetDatum(write);
 
 			if (flush == 0)
 nulls[4] = true;
-			snprintf(location, sizeof(location), %X/%X,
-	 (uint32) (flush  32), (uint32) flush);
-			values[4] = CStringGetTextDatum(location);
+			values[4] = LSNGetDatum(flush);
 
 			if (apply == 0)
 nulls[5] = true;
-			snprintf(location, sizeof(location), %X/%X,
-	 (uint32) (apply  32), (uint32) apply);
-			values[5] = CStringGetTextDatum(location);
+			values[5] = LSNGetDatum(apply);
 
 			values[6] = Int32GetDatum(sync_priority[i]);
 
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 11c1e1a..a2cc19f 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -2634,7 +2634,7 @@ DATA(insert OID = 1936 (  pg_stat_get_backend_idset		PGNSP PGUID 12 1 100 0 0 f
 DESCR(statistics: currently active backend IDs);
 DATA(insert OID = 2022 (  pg_stat_get_activity			PGNSP PGUID 12 1 100 0 0 f f f f f t s 1 0 2249 23 {23,26,23,26,25,25,25,16,1184,1184,1184,1184,869,25,23} {i,o,o,o,o,o,o,o,o,o,o,o,o,o,o} {pid,datid,pid,usesysid,application_name,state,query,waiting,xact_start,query_start,backend_start,state_change,client_addr,client_hostname,client_port} _null_ pg_stat_get_activity _null_ _null_ _null_ ));
 DESCR(statistics: information about currently active backends);
-DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249  {23,25,25,25,25,25,23,25} {o,o,o,o,o,o,o,o} {pid,state,sent_location,write_location,flush_location,replay_location,sync_priority,sync_state} _null_ pg_stat_get_wal_senders _null_ _null_ _null_ ));
+DATA(insert OID = 3099 (  pg_stat_get_wal_senders	PGNSP PGUID 12 1 10 0 0 f f f f f t s 0 0 2249  {23,25,3220,3220,3220,3220,23,25} {o,o,o,o,o,o,o,o} 

Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
 Attached is a patch to remove dllwrap from pgevent/Makefile.

Actually, it looks like this patch doesn't work at all:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53

Did I fat-finger the commit somehow?  I made a couple of cosmetic
changes, or at least I thought they were cosmetic.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Alvaro Herrera
Tom Lane escribió:
 Hiroshi Inoue in...@tpf.co.jp writes:
  Attached is a patch to remove dllwrap from pgevent/Makefile.
 
 Actually, it looks like this patch doesn't work at all:
 
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53
 
 Did I fat-finger the commit somehow?  I made a couple of cosmetic
 changes, or at least I thought they were cosmetic.

The format of the file is completely unlike that of the other *dll.def
files we use elsewhere.  AFAICT each line is

<tab>symbolname@<tab>some_number

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Hiroshi Inoue
(2014/02/20 10:32), Tom Lane wrote:
 Hiroshi Inoue in...@tpf.co.jp writes:
 Attached is a patch to remove dllwrap from pgevent/Makefile.
 
 Actually, it looks like this patch doesn't work at all:

Strangely enough it works here though I see double EXPORTS lines
in libpgeventdll.def.

 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=jacanadt=2014-02-20%2001%3A00%3A53
 
 Did I fat-finger the commit somehow?  I made a couple of cosmetic
 changes, or at least I thought they were cosmetic.

Seems EXPORTS line in exports.txt should be removed.

regards,
Hiroshi Inoue




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] narwhal and PGDLLIMPORT

2014-02-19 Thread Tom Lane
Hiroshi Inoue in...@tpf.co.jp writes:
 Seems EXPORTS line in exports.txt should be removed.

Done.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: [COMMITTERS] pgsql: Add a GUC to report whether data page checksums are enabled.

2014-02-19 Thread Peter Geoghegan
On Wed, Feb 19, 2014 at 4:49 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 +1 for back-patching.
 Back-patching would be interesting for existing applications, but -1
 as it is a new feature :)

I think that it rises to the level of an omission in 9.3 that now
requires correction. Many of our users couldn't run pg_controldata
even if they'd heard of it...
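
For reference, with the GUC in place the check is a single SQL statement
rather than a trip to pg_controldata (a sketch; it reports on or off
depending on how the cluster was initdb'd):

    SHOW data_checksums;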


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Amit Kapila
On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
kommi.harib...@gmail.com wrote:
 On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
  I want to propose a new feature called priority table or cache
  table.
  This is same as regular table except the pages of these tables are
  having
  high priority than normal tables. These tables are very useful, where a
  faster query processing on some particular tables is expected.

 Why exactly does the existing LRU behavior of shared buffers not do
 what you need?


 Lets assume a database having 3 tables, which are accessed regularly. The
 user is expecting a faster query results on one table.
 Because of LRU behavior which is not happening some times.

I think this will not be a problem for regularly accessed tables (pages);
as per the current algorithm they will get more priority before getting
flushed out of the shared buffer cache.
Have you come across any case where regularly accessed pages
get lower priority than non-regularly accessed pages?

However, it might be required for cases where the user wants to control
such behaviour and pass such hints through a table-level option or some
other way, to indicate that he wants more priority for certain tables
irrespective of their usage w.r.t. other tables.

Now I think the important thing to find out here is how helpful this is for
users, or why they want to control such behaviour even when the database
already takes care of it based on the access pattern.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: BUG #9210: PostgreSQL string store bug? not enforce check with correct characterSET/encoding

2014-02-19 Thread Noah Misch
On Wed, Feb 19, 2014 at 08:22:13PM -0500, Tom Lane wrote:
 The more I looked into mbutils.c, the less happy I got.  The attached
 proposed patch takes care of the missing-verification hole in
 pg_do_encoding_conversion() and pg_server_to_any(), and also gets rid
 of what I believe to be obsolete provisions in pg_do_encoding_conversion
 to work if called outside a transaction --- if you consider it working
 to completely fail to honor its API contract.  That should no longer be
 necessary now that we perform client-server encoding conversions via
 perform_default_encoding_conversion rather than here.

I like these changes.  In particular, coping with the absence of a conversion
function by calling ereport(LOG) and returning the source string was wrong for
nearly every caller, but you'd need to try an encoding like MULE_INTERNAL to
notice the problem.  Good riddance.

 How much of this is back-patch material, do you think?

None of it.  While many of the failures to validate against a character
encoding are clear bugs, applications hum along in spite of such bugs and
break when we tighten the checks.  I don't see a concern to override that
here.  Folks who want the tighter checking have some workarounds available.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Priority table or Cache table

2014-02-19 Thread Haribabu Kommi
On Thu, Feb 20, 2014 at 2:26 PM, Amit Kapila amit.kapil...@gmail.comwrote:

 On Thu, Feb 20, 2014 at 6:24 AM, Haribabu Kommi
 kommi.harib...@gmail.com wrote:
  On Thu, Feb 20, 2014 at 11:38 AM, Tom Lane t...@sss.pgh.pa.us wrote:
   I want to propose a new feature called priority table or cache
   table.
   This is same as regular table except the pages of these tables are
   having
   high priority than normal tables. These tables are very useful, where
 a
   faster query processing on some particular tables is expected.
 
  Why exactly does the existing LRU behavior of shared buffers not do
  what you need?
 
 
  Lets assume a database having 3 tables, which are accessed regularly. The
  user is expecting a faster query results on one table.
  Because of LRU behavior which is not happening some times.

 I think this will not be a problem for regularly accessed tables(pages),
 as per current algorithm they will get more priority before getting
 flushed out of shared buffer cache.
 Have you come across any such case where regularly accessed pages
 get lower priority than non-regularly accessed pages?


Because of other regularly accessed tables, the table that is expected to
give faster results sometimes gets delayed.


 However it might be required for cases where user wants to control
 such behaviour and pass such hints through table level option or some
 other way to indicate that he wants more priority for certain tables
 irrespective
 of their usage w.r.t other tables.

 Now I think here important thing to find out is how much helpful it is for
 users or why do they want to control such behaviour even when Database
 already takes care of such thing based on access pattern.


Yes, it is useful in cases where the application always expects fast
results, whether the table is used regularly or not.

Regards,
Hari Babu
Fujitsu Australia


Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Kyotaro HORIGUCHI
Hello, 

 It is just my personal opinion, but I think it would be convenient for
 users to alter inheritance trees that contain foreign tables the same
 way as inheritance trees that don't contain any foreign tables,
 without making user conscious of the inheritance trees contains
 foreign tables or not.  Imagine we have an inheritance tree that
 contains only plain tables and then add a foreign table as a child of
 the inheritance tree.  Without the flexiblity, we would need to change
 the way of altering the structure of the inheritance tree (e.g., ADD
 CONSTRAINT) to a totally different one, immediately when adding the
 foreign table.  I don't think that would be easy to use.

I personally don't see significant advantages in such flexibility. My
concerns here are only two points: unanticipated application and
maintainability. I gave these issues further consideration.

For the first issue, I think it could be enough to give feedback to
operators.

=# ALTER TABLE parent ADD CHECK (tempmin  tempmax),
  ALTER tempmin SET NOT NULL, 
  ALTER tempmin SET DEFAULT 0;
NOTICE: Child foreign table child01 is affected.
NOTICE: Child foreign table child02 is affected
NOTICE: Child foreign table child03 rejected 'alter tempmin set default'

What do you think about this? It looks a bit too loud for me
though...
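
For concreteness, the kind of tree the example above assumes could be set up
as follows under the proposed feature (object and server names are
illustrative; the ALTER FOREIGN TABLE ... INHERIT form is the one mentioned
later in this message):

    CREATE TABLE parent (tempmin int, tempmax int);
    CREATE FOREIGN TABLE child01 (tempmin int, tempmax int) SERVER remote_srv;
    ALTER FOREIGN TABLE child01 INHERIT parent;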

Then, for the second issue: although I don't have a full understanding of
how ALTER TABLE works, the complexity would be reduced if the acceptance
check for ALTER actions were done on the foreign server or data wrapper
side, not in the core of ALTER TABLE. It would also help with producing
error messages like the above. However, (NO) INHERIT looks a little
different..


  http://www.postgresql.org/docs/9.3/static/sql-alterforeigntable.html
 
  Consistency with the foreign server is not checked when a
  column is added or removed with ADD COLUMN or DROP COLUMN, a
  NOT NULL constraint is added, or a column type is changed with
  SET DATA TYPE. It is the user's responsibility to ensure that
  the table definition matches the remote side.
 
  So I belive implicit and automatic application of any constraint
  on foreign childs are considerably danger.
 
 We spent a lot of time discussing this issue, and the consensus is
 that it's users' fault if there are some tuples on the remote side
 violating a given constraint, as mentioned in the documentation.

I'm worried not about that, but that the changes and possible
inconsistency would take place behind operators' backs. And this
looks likely to cause such inconsistencies to me.

  * [NO] INHERIT parent_table
 
  Is this usable for inheritance foreign children? NO INHERIT
  removes all foreign children but INHERIT is nop.
 
 I didn't express clearly.  Sorry for that.  Let me explain about it.
 
 * ALTER FOREIGN TABLE target_table *INHERIT* parent_table: Add the
 * target table as a new child of the parent table.
 * ALTER FOREIGN TABLE target_table *NO INHERIT* parent_table: Remove the
 * target table from the list of children of the parent table.

I got it, thank you. That alone seems to be no problem, but it also doesn't
seem to be a matter of 'ALTER TABLE'. Could you tell me how this
is related to 'ALTER TABLE'?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] inherit support for foreign tables

2014-02-19 Thread Kyotaro HORIGUCHI
Hi,

At Wed, 19 Feb 2014 16:17:05 +0900, Shigeru Hanada wrote
 2014-02-18 19:29 GMT+09:00 Kyotaro HORIGUCHI 
 horiguchi.kyot...@lab.ntt.co.jp:
  Could you guess any use cases in which we are happy with ALTER
  TABLE's inheritance tree walking?  IMHO, ALTER FOREIGN TABLE
  always comes with some changes of the data source so implicitly
  invoking of such commands should be defaultly turned off.
 
 Imagine a case that foreign data source have attributes (A, B, C, D)
 but foreign tables and their parent ware defined as (A, B, C).  If
 user wants to use D as well, ALTER TABLE parent ADD COLUMN D type
 would be useful (rather necessary?) to keep consistency.

Hmm. It seems to me an issue of misconfiguration at the first
step. However, my anxiety is - as in my message just before -
about ALTERing foreign table definitions without any notice to
operators, and about irregular or random logic for checking the
applicability(?) of ALTER actions.

 Changing data type from compatible one (i.e., int to numeric,
 varchar(n) to text), adding CHECK/NOT NULL constraint would be also
 possible.

I see, thank you. Changing data types is surely valuable, but it
also seems difficult to check validity :(

Anyway, I gave this issue a second thought. Please have a look
at that.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Selecting large tables gets killed

2014-02-19 Thread Ashutosh Bapat
Hi All,
Here is a strange behaviour with the master branch, with head at

commit d3c4c471553265e7517be24bae64b81967f6df40
Author: Peter Eisentraut pete...@gmx.net
Date:   Mon Feb 10 21:47:19 2014 -0500

The OS is
[ashutosh@ubuntu repro]uname -a
Linux ubuntu 3.2.0-59-generic #90-Ubuntu SMP Tue Jan 7 22:43:51 UTC 2014
x86_64 x86_64 x86_64 GNU/Linux

This is a VM hosted on Mac-OS X 10.7.5

Here's the SQL script

[ashutosh@ubuntu repro]cat big_select_killed.sql
drop table big_tab;
create table big_tab (val int, val2 int, str varchar);

insert into big_tab select x, x, lpad('string', 100, x::text)
from generate_series(1, 1000) x;

select * from big_tab;

The last select causes the Killed message.

[ashutosh@ubuntu repro]psql -d postgres -f big_select_killed.sql
DROP TABLE
CREATE TABLE
INSERT 0 1000
Killed

There is a message in server log
FATAL:  connection to client lost
STATEMENT:  select * from big_tab;

Any SELECT selecting all the rows gets psql killed, but SELECT
count(*) does not.

[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type help for help.

postgres=# select count(*) from big_tab;
  count
--
 1000
(1 row)

postgres=# select * from big_tab;
Killed

[ashutosh@ubuntu repro]psql -d postgres
psql (9.4devel)
Type help for help.

Below is the buffer cache size and the relation size (if anyone cares)
postgres=# show shared_buffers;
 shared_buffers

 128MB
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass);
 pg_relation_size
--
   1412415488
(1 row)

postgres=# select pg_relation_size('big_tab'::regclass)/1024/1024; -- IN
MBs to be simple
 ?column?
--
 1346
(1 row)

There are no changes to the default configuration. Using Unix sockets:
[ashutosh@ubuntu repro]ls /tmp/.s.PGSQL.5432*
/tmp/.s.PGSQL.5432  /tmp/.s.PGSQL.5432.lock

Looks like a bug in psql to me. Does anybody see that behaviour?
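
A hedged aside: if the cause is psql buffering the whole ~1.3 GB result set
in client memory before printing it, psql's FETCH_COUNT variable, which
fetches the result in batches through a cursor, might be one way to confirm
that (a diagnostic sketch, not a verified explanation):

    \set FETCH_COUNT 1000
    select * from big_tab;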

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] PoC: Partial sort

2014-02-19 Thread Alexander Korotkov
On Thu, Feb 13, 2014 at 1:54 AM, Marti Raudsepp ma...@juffo.org wrote:

 I think the 1st patch now has a bug in initial_cost_mergejoin; you
 still pass the presorted_keys argument to cost_sort, making it
 calculate a partial sort cost, but generated plans never use partial
 sort. I think 0 should be passed instead. Patch attached, needs to be
 applied on top of partial-sort-basic-1 and then reverse-applied on
 partial-sort-merge-1.


It doesn't look that way to me. Merge join doesn't look for partial sort
specifically, but if a path with some presorted pathkeys happens to be
selected, then partial sort will be used. See the create_mergejoin_plan
function. So I think this cost_sort call is relevant to
create_mergejoin_plan. If we don't want partial sort to be used in such
rare cases, then we should revert it from both places. However, I doubt
that it adds any overhead, so we can leave it as is.

--
With best regards,
Alexander Korotkov.