Re: [HACKERS] synchronous commit vs. hint bits
hi,

> On Wed, Nov 30, 2011 at 1:37 AM, YAMAMOTO Takashi <y...@mwd.biglobe.ne.jp> wrote:
>>> Yes, I would expect that. What kind of increase are you seeing? Is it
>>> causing a problem for you, or are you just making an observation?
>> i was curious because my application uses async commits mainly to avoid
>> frequent fsync. i have no numbers right now.
> Oh, that's interesting. Why do you want to avoid frequent fsyncs?

simply because it was expensive on my environment.

> I thought the point of synchronous_commit=off was to move the fsyncs to
> the background, but not necessarily to decrease the frequency.

it makes sense. but it's normal for users to abuse features. :)

YAMAMOTO Takashi
Re: [HACKERS] synchronous commit vs. hint bits
Hi Robert,

On Wednesday, November 30, 2011 02:10:00 PM Robert Haas wrote:
> On Wed, Nov 30, 2011 at 1:37 AM, YAMAMOTO Takashi <y...@mwd.biglobe.ne.jp> wrote:
>>> Yes, I would expect that. What kind of increase are you seeing? Is it
>>> causing a problem for you, or are you just making an observation?
>> i was curious because my application uses async commits mainly to avoid
>> frequent fsync. i have no numbers right now.
> Oh, that's interesting. Why do you want to avoid frequent fsyncs? I
> thought the point of synchronous_commit=off was to move the fsyncs to
> the background, but not necessarily to decrease the frequency.

Is that so? If it wouldn't avoid fsyncs, how could you reach multiple
thousand TPS in a writing pgbench run on a pretty ordinary system with
fsync=on?

Andres
Re: [HACKERS] Accounting for toast in query planner. (gin/gist indexes).
> Jesper Krogh <jes...@krogh.cc> writes:
>> I have currently hit a problem which I dug into finding the cause for:
>> in particular, searching in GIN indices seems in some situations to
>> unfairly favor Sequential Scans. Googling a bit I found this page:
>> http://postgis.refractions.net/docs/ch06.html#id2635817
>> describing the exact problem. It seemed to be discussed back in the
>> pre-8.1 days and wasn't solved there; is there a chance someone may
>> address it in 9.2?
>> http://archives.postgresql.org/pgsql-performance/2005-02/msg00041.php
>
> Don't hold your breath. There's a huge amount of work to do there, and
> nobody's even thinking about it. The problem has actually gotten worse
> since 8.1, because the index access patterns are more complex and the
> planner has less, not more, info about them (because we pushed the
> determination of what's lossy to run time). Accumulating stats about
> how toasty the heap tuples are would be only the first step towards
> producing better cost estimates here --- you'd also need some way of
> estimating how lossy an index is, for instance, so you could guess how
> many heap tuples will be visited. And on top of that, for a lot of
> these complicated operators even our estimates of the final number of
> matches are pretty crummy. I'm not sure if having an explicit model of
> what's happening inside the index would help. Maybe it would, since
> the condition actually being tested by the index is probably simpler
> than the operator proper, but right now the planner has no clue that
> they're different.

I don't understand the details about lossyness, but even with the
changes for getting a cost estimate for GIN indexes that went into 9.1
(which made the situation quite a lot better) we still favor Sequential
Scans in situations where it is a really bad idea. The cost of doing
filtering on something that is huge and resides in TOAST is not
accounted for. Taking the same dataset, the worst situation is a query
like "this key exists in all of the dataset; how many are there?".
The alternative query plans are here:

2011-12-01 09:59:46.428 jk=# explain (analyze on, buffers on) select count(id) from ftstest where fts @@ to_tsquery('english','test1');
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=9591.79..9591.80 rows=1 width=4) (actual time=109.084..109.084 rows=1 loops=1)
   Buffers: shared hit=1 read=4595
   ->  Bitmap Heap Scan on ftstest  (cost=2149.59..9091.92 rows=199947 width=4) (actual time=38.613..87.784 rows=199939 loops=1)
         Recheck Cond: (fts @@ '''test1'''::tsquery)
         Buffers: shared hit=1 read=4595
         ->  Bitmap Index Scan on ftstest_fts_idx1  (cost=0.00..2099.60 rows=199947 width=0) (actual time=37.847..37.847 rows=199939 loops=1)
               Index Cond: (fts @@ '''test1'''::tsquery)
               Buffers: shared hit=1 read=152
 Total runtime: 109.129 ms
(9 rows)

Time: 109.842 ms

2011-12-01 10:02:19.047 jk=# set enable_seqscan to on;
SET
Time: 0.122 ms

2011-12-01 10:02:23.657 jk=# explain (analyze on, buffers on) select count(id) from ftstest where fts @@ to_tsquery('english','test1');
                                                                 QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=7442.87..7442.88 rows=1 width=4) (actual time=25266.047..25266.048 rows=1 loops=1)
   Buffers: shared hit=651448 read=561776
   ->  Seq Scan on ftstest  (cost=0.00..6943.00 rows=199947 width=4) (actual time=0.015..25177.959 rows=199939 loops=1)
         Filter: (fts @@ '''test1'''::tsquery)
         Rows Removed by Filter: 61
         Buffers: shared hit=651448 read=561776
 Total runtime: 25266.080 ms
(7 rows)

Time: 25266.740 ms
2011-12-01 10:02:49.936 jk=#

For FTS it hits double, since the tsvector column typically is in TOAST
(since it is large) and rarely is in memory (since it is rarely
returned to the user or used for anything except building the index).
As can be seen, it hits 651448 buffers in the latter (default) query
and only 4596 in the former. Would just naively accounting for the
needed TOAST accesses level out that x10 difference?

Secondly, I could bump the default cost of ts_match_vq/ts_match_qv a
bit up, since that computation is probably not as cheap as an ordinary
boolean function.

I'm not trying to argue that the solution is simple, since my insights
there are limited; only that the problem forces me to drop "set
enable_seqscan to off" in random places in the code (which I definitely
would prefer to go without).

Jesper

-- 
Jesper
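The cost-bump idea Jesper mentions needs no patch to experiment with,
since operator cost estimates come from the underlying support
functions; a sketch (the COST value of 100 is an arbitrary assumption
for experimentation, not a recommendation):

ALTER FUNCTION ts_match_vq(tsvector, tsquery) COST 100;
ALTER FUNCTION ts_match_qv(tsquery, tsvector) COST 100;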
Re: [HACKERS] Allow substitute allocators for PGresult.
Hello,

me> I'll put further thought into dblink-plus taking it into
me> account.
..
me> Please let me have a little more time.

I've asked the developer of dblink-plus about
RegisterResourceReleaseCallback(). He said that the function does not
fit well with the current implementation.

In addition to this, storing into a tuplestore directly seems to me a
better idea than a palloc'ed PGresult. So I tried to make
libpq/PGresult able to handle an alternative tuple store, selected by
hinting the PGconn, and modified dblink to use the mechanism as the
first sample code.

I will show it as a series of patches in the next message.

regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center
Re: [HACKERS] Allow substitute allocators for PGresult.
Hello,

This is the next version of "Allow substitute allocators for PGresult".

Totally changing the concept from the previous one, this patch allows
libpq to handle an alternative tuple store for received tuples. Design
guidelines are shown below.

- No need to modify existing client code of libpq.
- Existing libpq clients run with roughly the same performance, and
  dblink with the modification runs faster to some extent and requires
  less memory.

I have roughly measured run time and memory requirements for three
configurations on CentOS 6 on VirtualBox with 2GB mem / 4 cores,
running on Win7-Core i7, transferring (30 bytes * 2 cols) * 200 tuples
(120MB net) within this virtual machine. The results are below.

                                  xfer time   Peak RSS
 Original                       : 6.02s       850MB
 libpq patch + original dblink  : 6.11s       850MB
 full patch                     : 4.44s       643MB

xfer time here is the mean of five 'real time's measured by running an
SQL script like this after a warmup run.

=== test.sql
select dblink_connect('c', 'host=localhost port=5432 dbname=test');
select * from dblink('c', 'select a,c from foo limit 200') as (a text, b bytea) limit 1;
select dblink_disconnect('c');
===

$ for i in $(seq 1 10); do time psql test -f t.sql; done

Peak RSS is measured by picking up heap Rss in /proc/[pid]/smaps.

It seems somewhat slow using patched libpq and original dblink, but it
seems within error range too. If this amount of slowdown is not
permissible, it might be improved by restoring the previous static call
route, at the cost of some extra redundancy in the code. On the other
hand, the full patch version is obviously fast and requires less
memory. Isn't it nice?

This patch consists of two sub-patches. The first is a patch for libpq
to allow rewiring the tuple storage mechanism; the default behavior is
not changed, so existing libpq clients should run with it. The second
modifies dblink to store received tuples directly into a tuplestore
using the mechanism above.
regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index a360d78..1af8df6 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,7 +160,3 @@ PQconnectStartParams      157
 PQping                    158
 PQpingParams              159
 PQlibVersion              160
-PQregisterTupleAdder      161
-PQgetAsCstring            162
-PQgetAddTupleParam        163
-PQsetAddTupleErrMes       164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 437be26..50f3f83 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2692,7 +2692,6 @@ makeEmptyPGconn(void)
 	conn->allow_ssl_try = true;
 	conn->wait_ssl_try = false;
 #endif
-	conn->addTupleFunc = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
@@ -5065,10 +5064,3 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
-
-void
-PQregisterTupleAdder(PGconn *conn, addTupleFunction func, void *param)
-{
-	conn->addTupleFunc = func;
-	conn->addTupleFuncParam = param;
-}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index c8ec9bd..113aab0 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -48,6 +48,7 @@ char *const pgresStatus[] = {
 static int	static_client_encoding = PG_SQL_ASCII;
 static bool static_std_strings = false;
 
+
 static PGEvent *dupEvents(PGEvent *events, int count);
 static bool PQsendQueryStart(PGconn *conn);
 static int	PQsendQueryGuts(PGconn *conn,
@@ -65,9 +66,7 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int	PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
-static void *pqDefaultAddTupleFunc(PGresult *res, AddTupFunc func,
-				   int id, size_t len);
-static void *pqAddTuple(PGresult *res, PGresAttValue *tup);
+
 
 /*
  * Space management for PGresult.
@@ -161,9 +160,6 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result->curBlock = NULL;
 	result->curOffset = 0;
 	result->spaceLeft = 0;
-	result->addTupleFunc = pqDefaultAddTupleFunc;
-	result->addTupleFuncParam = NULL;
-	result->addTupleFuncErrMes = NULL;
 
 	if (conn)
 	{
@@ -198,12 +194,6 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 			}
 			result->nEvents = conn->nEvents;
 		}
-
-		if (conn->addTupleFunc)
-		{
-			result->addTupleFunc = conn->addTupleFunc;
-			result->addTupleFuncParam = conn->addTupleFuncParam;
-		}
 	}
 	else
 	{
@@ -497,33 +487,6 @@ PQresultAlloc(PGresult *res, size_t nBytes)
 	return pqResultAlloc(res, nBytes, TRUE);
 }
 
-void *
-pqDefaultAddTupleFunc(PGresult *res, AddTupFunc func, int id, size_t len)
-{
-	void	   *p;
-
-	switch (func)
-	{
-		case ADDTUP_ALLOC_TEXT:
-			return pqResultAlloc(res, len, TRUE);
-
-		case ADDTUP_ALLOC_BINARY:
-
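For readers skimming the thread, client usage of the proposed hook
would look roughly like this. This is a sketch only: the callback
signature is inferred from the patch's pqDefaultAddTupleFunc(),
AddTupFunc/ADDTUP_* come from the patch, and my_alloc()/my_state are
hypothetical names.

#include "libpq-fe.h"

/* Hypothetical allocator over some caller-owned store. */
extern void *my_alloc(void *state, size_t len);

/*
 * Tuple-adder callback: asked by libpq to provide space for each
 * incoming datum instead of allocating it inside the PGresult.
 */
static void *
my_add_tuple(PGresult *res, AddTupFunc func, int id, size_t len)
{
	void	   *state = PQgetAddTupleParam(res);	/* exported by the patch */

	switch (func)
	{
		case ADDTUP_ALLOC_TEXT:
		case ADDTUP_ALLOC_BINARY:
			return my_alloc(state, len);	/* space for datum 'id' */
		default:
			return NULL;
	}
}

/* After connecting, route all result tuples through the callback: */
/*     PQregisterTupleAdder(conn, my_add_tuple, my_state);         */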
Re: [HACKERS] Allow substitute allocators for PGresult.
Ouch! I'm sorry for making a reverse patch for the first modification.
This is an amendment of the message below. The body text is copied into
this message.

http://archives.postgresql.org/message-id/20111201.192419.103527179.horiguchi.kyot...@oss.ntt.co.jp

===

Hello,

This is the next version of "Allow substitute allocators for PGresult".

Totally changing the concept from the previous one, this patch allows
libpq to handle an alternative tuple store for received tuples. Design
guidelines are shown below.

- No need to modify existing client code of libpq.
- Existing libpq clients run with roughly the same performance, and
  dblink with the modification runs faster to some extent and requires
  less memory.

I have roughly measured run time and memory requirements for three
configurations on CentOS 6 on VirtualBox with 2GB mem / 4 cores,
running on Win7-Core i7, transferring (30 bytes * 2 cols) * 200 tuples
(120MB net) within this virtual machine. The results are below.

                                  xfer time   Peak RSS
 Original                       : 6.02s       850MB
 libpq patch + original dblink  : 6.11s       850MB
 full patch                     : 4.44s       643MB

xfer time here is the mean of five 'real time's measured by running an
SQL script like this after a warmup run.

=== test.sql
select dblink_connect('c', 'host=localhost port=5432 dbname=test');
select * from dblink('c', 'select a,c from foo limit 200') as (a text, b bytea) limit 1;
select dblink_disconnect('c');
===

$ for i in $(seq 1 10); do time psql test -f t.sql; done

Peak RSS is measured by picking up heap Rss in /proc/[pid]/smaps.

It seems somewhat slow using patched libpq and original dblink, but it
seems within error range too. If this amount of slowdown is not
permissible, it might be improved by restoring the previous static call
route, at the cost of some extra redundancy in the code. On the other
hand, the full patch version is obviously fast and requires less
memory. Isn't it nice?

This patch consists of two sub-patches. The first is a patch for libpq
to allow rewiring the tuple storage mechanism; the default behavior is
not changed, so existing libpq clients should run with it. The second
modifies dblink to store received tuples directly into a tuplestore
using the mechanism above.
regards,
-- 
Kyotaro Horiguchi
NTT Open Source Software Center

diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 1af8df6..a360d78 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -160,3 +160,7 @@ PQconnectStartParams      157
 PQping                    158
 PQpingParams              159
 PQlibVersion              160
+PQregisterTupleAdder      161
+PQgetAsCstring            162
+PQgetAddTupleParam        163
+PQsetAddTupleErrMes       164
diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index 50f3f83..437be26 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -2692,6 +2692,7 @@ makeEmptyPGconn(void)
 	conn->allow_ssl_try = true;
 	conn->wait_ssl_try = false;
 #endif
+	conn->addTupleFunc = NULL;
 
 	/*
 	 * We try to send at least 8K at a time, which is the usual size of pipe
@@ -5064,3 +5065,10 @@ PQregisterThreadLock(pgthreadlock_t newhandler)
 
 	return prev;
 }
+
+void
+PQregisterTupleAdder(PGconn *conn, addTupleFunction func, void *param)
+{
+	conn->addTupleFunc = func;
+	conn->addTupleFuncParam = param;
+}
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index 113aab0..c8ec9bd 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -48,7 +48,6 @@ char *const pgresStatus[] = {
 static int	static_client_encoding = PG_SQL_ASCII;
 static bool static_std_strings = false;
 
-
 static PGEvent *dupEvents(PGEvent *events, int count);
 static bool PQsendQueryStart(PGconn *conn);
 static int	PQsendQueryGuts(PGconn *conn,
@@ -66,7 +65,9 @@ static PGresult *PQexecFinish(PGconn *conn);
 static int	PQsendDescribe(PGconn *conn, char desc_type,
 			   const char *desc_target);
 static int	check_field_number(const PGresult *res, int field_num);
-
+static void *pqDefaultAddTupleFunc(PGresult *res, AddTupFunc func,
+				   int id, size_t len);
+static void *pqAddTuple(PGresult *res, PGresAttValue *tup);
 
 /*
  * Space management for PGresult.
@@ -160,6 +161,9 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 	result->curBlock = NULL;
 	result->curOffset = 0;
 	result->spaceLeft = 0;
+	result->addTupleFunc = pqDefaultAddTupleFunc;
+	result->addTupleFuncParam = NULL;
+	result->addTupleFuncErrMes = NULL;
 
 	if (conn)
 	{
@@ -194,6 +198,12 @@ PQmakeEmptyPGresult(PGconn *conn, ExecStatusType status)
 			}
 			result->nEvents = conn->nEvents;
 		}
+
+		if (conn->addTupleFunc)
+		{
+			result->addTupleFunc = conn->addTupleFunc;
+			result->addTupleFuncParam = conn->addTupleFuncParam;
+		}
 	}
 	else
 	{
@@ -487,6 +497,33 @@ PQresultAlloc(PGresult *res, size_t nBytes)
Re: [HACKERS] [PATCH] optional cleaning queries stored in pg_stat_statements
On 11/25/2011 10:52 AM, Jeff Janes wrote:
> Given Peter's patch on the same subject, should we now mark this one
> as rejected in the commitfest app?
> https://commitfest.postgresql.org/action/patch_view?id=681

That may be premature. While the testing I've done so far suggests
Peter's idea is the better route forward, the much higher complexity
level does mean it's surely possible for it to be rejected. We have a
reviewer signed up for Peter's submission now; I think what to do with
Tomas's entry will be an easier decision to make once that report is
in.

-- 
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
Re: [HACKERS] Patch to allow users to kill their own queries
On 11/16/2011 01:28 PM, Daniel Farina wrote:
> As it would turn out, a patch for this has already been submitted:
> http://archives.postgresql.org/pgsql-hackers/2011-10/msg1.php
> There was some wrangling on whether it needs to be extended to be
> useful, but for our purposes the formulation already posted already
> captures vital value for us, and in that form appears to be fairly
> uncontentious. I have moved it to the current commitfest, with a
> comment linking to the 'please revive this patch' thread whereby a
> second glance at what to do about this was conducted.

Yeah, that one got a raw deal; it should be in the *current*
CommitFest--you had it in the next one. I'll join the chorus to just
allow people to fire the pg_cancel_backend pea shooter foot gun
targeted only at their own feet, make most users happy, and punt
anything more complicated off as troublesome relative to its benefit.
That's what Daniel has said, what his co-worker Edward also
implemented, what Noah thought was good enough given other security
mechanisms in place, and what Tom thinks is reasonable too. This I feel
is important, so I'm going to add myself as the next reviewer and
include it in my testing run tomorrow.

-- 
Greg Smith   2ndQuadrant US    g...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us
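For context, the end state being argued for would let an unprivileged
user do something like the following against only their own sessions
(a sketch; note that pg_stat_activity still used the procpid column
name in the 9.1 era, before it was renamed to pid):

-- Cancel all of the current user's other running queries.
SELECT pg_cancel_backend(procpid)
FROM pg_stat_activity
WHERE usename = current_user
  AND procpid <> pg_backend_pid();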
Re: [HACKERS] Refactoring on DROP/ALTER SET SCHEMA/ALTER RENAME TO statement
2011/11/30 Robert Haas <robertmh...@gmail.com>:
> On Sun, Nov 27, 2011 at 3:14 PM, Kohei KaiGai <kai...@kaigai.gr.jp> wrote:
>> If we add new properties that are required by AlterObjectNamespace, as
>> you suggested, it will allow us to reduce the number of callers of
>> this routine mechanically with small changes. Should I try reworking
>> it this way?
> Yeah, let's try that for starters, and then see if anything further
> suggests itself.

OK. I marked this patch as Returned with Feedback. I will try it again.
In the timeframe of this commitfest, I'll focus on my remaining patches
and on reviewing pgsql-fdw.

Thanks,
-- 
KaiGai Kohei <kai...@kaigai.gr.jp>
Re: [HACKERS] Inverse convertion for pg_mb2wchar
On Thu, Dec 1, 2011 at 12:30 AM, Robert Haas <robertmh...@gmail.com> wrote:
> On Mon, Nov 21, 2011 at 11:49 AM, Alexander Korotkov
> <aekorot...@gmail.com> wrote:
>> I've a question about the pg_mb2wchar function. Is there any way to do
>> the inverse conversion, pg_wchar* to char*? I've looked at the
>> pg_wchar_tbl table definition, and I didn't find anything about an
>> inverse transformation. So, any chance to get an inverse conversion?
>> I'm experimenting with index support for regexp search, and I'm
>> trying to get some characters back from the color map.
> Well, any char can presumably also be represented as a wchar, but the
> reverse isn't necessarily true...

So, if a wchar can't be represented as a sequence of chars, it means
that it can't occur in a string in the server encoding. In this
situation it's enough for me to know that this is so for a particular
wchar. I found that for the UTF8 encoding the wchar is the Unicode code
point. For single-byte encodings the wchar just holds the original
value in its lower byte. And there are some conversions for PG_EUC_JP,
PG_EUC_CN, PG_EUC_KR, PG_EUC_TW, and PG_EUC_JIS_2004 which are not
clear to me, but it seems feasible to write inverse conversion code
using the existing code of the direct conversion.

> (What's a color map?)

The regexp engine translates a regexp into a finite state automaton. So
that the automaton doesn't have too many arcs, characters are grouped
into colors. The color map maps wchar -> color number. Since I analyze
the automaton produced by the regexp engine, I need to convert a wchar
back to the original character.

-- 
With best regards,
Alexander Korotkov.
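For the two easy cases described above, a minimal sketch of what an
inverse conversion might look like. The function name is made up, the
EUC encodings are deliberately left unhandled, and it assumes the
existing unicode_to_utf8() and pg_utf_mblen() helpers from the
multibyte code:

#include "mb/pg_wchar.h"

/*
 * Hypothetical sketch: convert one pg_wchar back to multibyte form.
 * Returns the number of bytes written, or -1 if this sketch does not
 * handle the encoding.
 */
static int
wchar2mb_sketch(pg_wchar c, unsigned char *to, int encoding)
{
	if (encoding == PG_UTF8)
	{
		unicode_to_utf8(c, to);		/* wchar is the Unicode code point */
		return pg_utf_mblen(to);	/* byte length of what was emitted */
	}
	if (pg_encoding_max_length(encoding) == 1)
	{
		*to = (unsigned char) (c & 0xFF);	/* low byte holds the char */
		return 1;
	}
	return -1;						/* PG_EUC_* need per-encoding code */
}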
Re: [HACKERS] WIP: index support for regexp search
On Thu, Dec 1, 2011 at 12:29 AM, Robert Haas <robertmh...@gmail.com> wrote:
> Please add this patch here so it does not get lost in the shuffle:
> https://commitfest.postgresql.org/action/commitfest_view/open

Done.

-- 
With best regards,
Alexander Korotkov.
Re: [HACKERS] patch for type privileges
On 2011-11-29 18:47, Peter Eisentraut wrote:
> On tis, 2011-11-29 at 07:07 +0200, Peter Eisentraut wrote:
>> On mån, 2011-11-28 at 11:41 +0100, Yeb Havinga wrote:
>>> On 2011-11-15 21:50, Peter Eisentraut wrote:
>>>> Patch attached.
>>> I cannot get the patch to apply; this is the output of patch -p1
>>> --dry-run on HEAD.
>> I need to remerge it against concurrent range type activity.
> New patch attached.

I'm looking at your patch. One thing that puzzled me for a while was
that I could not restrict access to base types (either built-in or
user-defined). Is this intentional?

regards,
Yeb Havinga
Re: [HACKERS] Why so few built-in range types?
* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Nov 30, 2011 at 3:58 PM, Stephen Frost <sfr...@snowman.net> wrote:
>> Erm, isn't there a contrib type that already does all that for you..?
>> ip4r or whatever? Just saying, if you're looking for that capability..
> Oh, huh, good to know. Still, I'm not sure why you need to load a
> separate type to get this... there's no reason why the built-in CIDR
> type couldn't support it.

The semantics of that type aren't what people actually want, and
there's been push-back about changing it due to backwards
compatibility, etc. That's my recollection of the situation, anyway.
I'm sure there's all kinds of fun talk in the archives about it.

	Thanks,

		Stephen
Re: [HACKERS] synchronous commit vs. hint bits
On Thu, Dec 1, 2011 at 4:09 AM, Andres Freund <and...@anarazel.de> wrote:
>> Oh, that's interesting. Why do you want to avoid frequent fsyncs? I
>> thought the point of synchronous_commit=off was to move the fsyncs to
>> the background, but not necessarily to decrease the frequency.
> Is that so? If it wouldn't avoid fsyncs, how could you reach multiple
> thousand TPS in a writing pgbench run on a pretty ordinary system with
> fsync=on?

Eh, well, what would stop you from achieving that? An fsync operation
that occurs in the background doesn't block further transactions from
completing. Meanwhile, getting the WAL records on disk faster allows us
to set hint bits sooner, which is a significant win, as shown by the
numbers I posted upthread.

One possible downside of trying to kick off the fsync more quickly is
that if there is a continuous stream of background fsyncs going on, a
process that needs to do an XLogFlush in the foreground (i.e. a
synchronous_commit=on transaction in the middle of many
synchronous_commit=off transactions) might be more likely to find an
fsync already in progress, and therefore need to wait until it
completes before starting the next one, slowing things down. But I'm a
bit reluctant to believe that is a real effect without some data.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
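The mixed-durability scenario Robert describes is easy to set up, since
synchronous_commit can be changed per session and per transaction; a
minimal illustration (the payments table is a hypothetical example):

-- Session default: asynchronous commits, fsyncs happen in the background.
SET synchronous_commit TO off;

-- A transaction that must be durable opts back in, forcing a
-- foreground XLogFlush at commit time.
BEGIN;
SET LOCAL synchronous_commit TO on;
INSERT INTO payments(id, amount) VALUES (1, 42.00);
COMMIT;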
Re: [HACKERS] Why so few built-in range types?
----- Quote from Stephen Frost (sfr...@snowman.net), on 01.12.2011 at 15:56 -----
> * Robert Haas (robertmh...@gmail.com) wrote:
>> On Wed, Nov 30, 2011 at 3:58 PM, Stephen Frost <sfr...@snowman.net> wrote:
>>> Erm, isn't there a contrib type that already does all that for
>>> you..? ip4r or whatever? Just saying, if you're looking for that
>>> capability..
>> Oh, huh, good to know. Still, I'm not sure why you need to load a
>> separate type to get this... there's no reason why the built-in CIDR
>> type couldn't support it.
> The semantics of that type aren't what people actually want, and
> there's been push-back about changing it due to backwards
> compatibility, etc. That's my recollection of the situation, anyway.
> I'm sure there's all kinds of fun talk in the archives about it.

I have reached for the built-in inet/cidr types once or twice, but the
lack of indexing support for the contains operator was stopping me - so
I have used the ip4r extension. I do not think that adding index
support to a datatype classifies as a semantic change that will break
backward compatibility.

Best regards

-- 
Luben Karavelov
Re: [HACKERS] synchronous commit vs. hint bits
On Thursday, December 01, 2011 03:11:43 PM Robert Haas wrote:
> On Thu, Dec 1, 2011 at 4:09 AM, Andres Freund <and...@anarazel.de> wrote:
>> Is that so? If it wouldn't avoid fsyncs, how could you reach multiple
>> thousand TPS in a writing pgbench run on a pretty ordinary system
>> with fsync=on?
> Eh, well, what would stop you from achieving that? An fsync operation
> that occurs in the background doesn't block further transactions from
> completing.

But it will slow down overall system I/O. For one, an fsync() on Linux
will cause a queue drain on the I/O submit queue. For another, it
counts against the total available random I/O ops a device can do,
which in turn will cause a slowdown for anything else doing synchronous
random I/O, i.e. read(2).

> Meanwhile, getting the WAL records on disk faster allows us to set
> hint bits sooner, which is a significant win, as shown by the numbers
> I posted upthread.

Oh, that part I don't doubt. Sorry for that.

Andres
Re: [HACKERS] Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On 11/30/2011 06:52 PM, Merlin Moncure wrote:
> On Mon, Nov 28, 2011 at 9:18 AM, Mikko Tiihonen
> <mikko.tiiho...@nitorcreations.com> wrote:
>> Hi,
>> As discussed few days ago, it would be beneficial if we could change
>> the v3 backend<->client protocol without breaking backwards
>> compatibility.
>> Here is a working patch that exports a supported_binary_minor
>> constant and a binary_minor session variable that can be used by
>> clients to enable newer features. I also added an example usage where
>> the array encoding for constant-size elements is optimized if the
>> binary_minor version is new enough.
>> I have coded the client side support of binary_minor for the JDBC
>> driver and tested that it worked. But let's first discuss if this is
>> an acceptable path forward.
> Regarding your TODO in the code comments about interactions with
> replication: I think it should be removed. WAL streaming depends on
> more things being identical than what is guaranteed here, which is
> basically the protocol + data formats.

OK. I'll remove the comments about replication.

> I think all references to 'protocol' should be removed; Binary wire
> formats != protocol: the protocol could bump to v4 but the wire
> formats (by happenstance) could all still continue to work --
> therefore the version isn't minor (it's not dependent on protocol
> version but only on itself). Therefore, don't much like the name
> 'supported_binary_minor'. How about binary_format_version?

I was thinking that it would be possible to use the new minor version
to introduce also new protocol messages, such as streaming large data
into the database without knowing its size beforehand.

> Also, is a non granular approach really buying us anything here? ISTM
> *something* is likely to change format on most releases of the server
> so I'm wondering what's the point (if you don't happen to be on the
> same x.x release of the server, you might as well assume to not match
> or at least 'go on your own risk'). The value added to the client
> version query is quite low.

You have a very good point about changes in every new postgres version.
Either the text or the binary encoding is likely to change for some
types in each new release.

There needs to be a decision on official policy about breaking
backwards compatibility of postgresql clients. Is it:

A) Every x.y postgres release is free to break any parts of the
   * protocol
   * text encoding
   * binary encoding
   as long as it is documented
   -> all client libraries should be coded so that they refuse to
      support version x.y+1 or newer (they might have an option to
      override this, but there are no guarantees that it would work)
   Good: no random bugs when using an old client library
   Bad: initial complaints from users before they understand that this
        is the best option for everyone

B) Every x.y postgres release must guarantee that no client-visible
   parts of the protocol, text encoding or binary encoding will change
   from the previous release in the v3 protocol. If any changes are
   needed then a new protocol version must be created.
   -> very high barrier for new features
   Good: can upgrade the server without updating clients
   Bad: new features are only introduced very rarely, after enough have
        accumulated
   Bad: many feature/change patches will rot while waiting for the
        upcoming new version

C) There is effort to try avoiding incompatible changes. Some changes
   are blocked because it is detected that they can break backwards
   compatibility, while others are let through (often with some
   compatibility option on the server side to fall back to previous
   functionality, f.ex. bytea hex encoding).
   -> As far as I understand, this is the current situation.
   Good: has worked so far
   Bad: accumulates compatibility flags in the server

D) My proposed compromise, where there is one minor_version setting and
   code in the server to support all different minor versions. The
   client requests the minor version, so all releases default to the
   backwards-compatible version. When the combinations start to create
   too much maintenance overhead, a new clean v4 version of the
   protocol is specified.
   Good: keeps full backwards compatibility
   Good: allows introducing changes at any time
   Bad: accumulates conditional code in the server and clients until a
        new version of the protocol is released

I'd like the official policy to be A; otherwise I'll push for D.

-Mikko
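Under proposal D as implemented in the posted patch, a client would
negotiate roughly like this (a sketch; the names come from the patch,
but the value 1 is an assumed example):

-- Ask the server for the newest binary format revision it supports.
SHOW supported_binary_minor;

-- Opt this session in to the newest revision both sides understand.
SET binary_minor = 1;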
Re: [HACKERS] synchronous commit vs. hint bits
On Thu, Dec 1, 2011 at 6:11 AM, Robert Haas <robertmh...@gmail.com> wrote:
> One possible downside of trying to kick off the fsync more quickly is
> that if there is a continuous stream of background fsyncs going on, a
> process that needs to do an XLogFlush in the foreground (i.e. a
> synchronous_commit=on transaction in the middle of many
> synchronous_commit=off transactions) might be more likely to find an
> fsync already in progress, and therefore need to wait until it
> completes before starting the next one, slowing things down.

Waiting until the other one completes is how it is currently
implemented, but is it necessary from a correctness view? It seems like
the WALWriteLock only needs to protect the write, and not the sync
(assuming the sync method allows those to be separate actions), and
there could be multiple fsync requests from different processes pending
at the same time without a correctness problem. After dropping the
WALWriteLock and doing the fsync, it would then have to take the lock
again, or maybe just a spinlock, to update the accounting for how far
the log has been flushed.

So rather than one committing process blocking on fsync and a bunch of
others blocking on WALWriteLock, you could have all of them blocking on
different fsyncs and let the kernel deal with waking them up.

I don't know at all whether this would actually be an improvement,
assuming it would even be safe. Reading the xlog.c code, it is hard to
tell which designs/features are there for safety and which ones are
there for suspected performance reasons.

Sorry for hijacking your topic; it is just something I had been
thinking about for a while.

Cheers,

Jeff
Re: [HACKERS] [PATCH] PostgreSQL fails to build with 32bit MinGW-w64
On 11/27/2011 09:18 AM, NISHIYAMA Tomoaki wrote:
> Hi,
>
>>>> +/* __MINGW64_VERSION_MAJOR is related to both 32/64 bit gcc compiles by
>>>> + * mingw-w64, however it gets defined only after
>>> Why not use __MINGW32__, which is defined without including any
>>> headers?
>> Because it's defined by other than mingw-w64 compilers.
> I see. That's because mingw (not -w64). Should it be ok if mingw is ok
> with that condition? This really breaks mingw gcc 4.6.1 :( (it does
> not have crtdefs.h.) If moving downward does not break MSVC, perhaps
> that's the good way. Otherwise, we might check for the presence of
> crtdefs.h with configure?

I have looked at this a bit. It's fairly ugly, and the only moderately
clean way I see forward is a configure test to check for a mingw-w64
compiler, e.g. by running gcc -E -dM over a file which just includes
stdio.h and checking for the definedness of __MINGW64_VERSION_MAJOR, or
something similar.

cheers

andrew
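The probe Andrew sketches in words could look something like this (an
illustration, not the final configure test):

# Feed a file that includes stdio.h through the preprocessor and dump
# all macro definitions; mingw-w64 (both 32- and 64-bit targets)
# defines __MINGW64_VERSION_MAJOR once a system header is pulled in.
echo '#include <stdio.h>' > conftest.c
if gcc -E -dM conftest.c | grep -q '__MINGW64_VERSION_MAJOR'; then
    echo "compiler is mingw-w64"
fi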
Re: [HACKERS] Command Triggers
Hi Dimitri,
Hi all,

On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> As proposed by Jan Wieck on hackers and discussed in Ottawa at the
> Clusters Hackers Meeting, the most (only?) workable way to ever have
> DDL triggers is to have command triggers instead. Rather than talking
> about catalogs and the like, it's all about firing a registered
> user-defined function before or after processing the utility command,
> or instead of processing it.
> This naturally involves a new catalog (pg_cmdtrigger) and some new
> subcommands of CREATE and DROP TRIGGER, and some support code for
> calling the function at the right time.

Great ;) FWIW, I think that's the way forward as well.

The patch obviously isn't thought to be committable yet, so I am not
going to do a very detailed code-level review. Attached is a patch with
low-level targeted comments and such I noticed while reading it. At
this state the important things are high-level, so I will try to
concentrate on those:

== the trigger function ==

I don't like the set of arguments passed to the trigger functions very
much. To cite an example of yours, those currently are:

* cmd_string     text
* cmd_nodestring text
* schemaname     text
* relname        text

I think at least a

* command_tag    text

is missing. Sure, you can disambiguate it by creating a function for
every command type, but that seems pointlessly complex for many
applications. And adding it should be trivial as it's already
tracked ;)

Currently the examples refer to the object name as relname, which I
dislike as that seems to be too restricted to one use case. The code is
already doing it correctly (as objectname) though ;)

Why is there explicit documentation of not checking the arguments of
said trigger function? That seems a bit strange to me.

schemaname currently is mighty unusable, because whether it's sent or
not depends on whether the table was created with a schema name
specified or not. That doesn't seem to be a good approach. That brings
me to the next point:

== normalization of statements ==

Currently the normalization is a bit confusing. E.g. for:

CREATE SCHEMA barf;
SET search_path = barf;
CREATE TYPE bart AS ENUM ('a', 'b');
CREATE TABLE bar(a int, b bigint, c serial, d text, e varchar, F text, g bart, h int4);

one gets

CREATE TABLE bar (a pg_catalog.int4,b pg_catalog.int8,c serial,d text,e pg_catalog.varchar,F text,g bart,h int4);

Which points out two problems:
* inconsistent schema prefixing
* no quoting

Imo the output should fully schema-qualify anything, including
operators. Yes, that's rather ugly, but anything else seems too likely
to lead to problems. As an alternative it would be possible to pass the
current search_path, but that doesn't seem to be the best approach to
me.

Currently CHECK() constraints are not decodable, but inside those the
same quoting/qualifying rules should apply.

Then there is nice stuff like CREATE TABLE (LIKE ...); What shall that
return in the future? I vote for showing it as the plain CREATE TABLE
where all columns are specified.

That touches another related topic. Dim's work in adding support for
utility commands in nodeToString and friends possibly will make the
code somewhat command-trigger specific. Is there any other usage we
envision?

== grammar ==

* multiple actions: I think it would be sensible to allow multiple
  actions on which to trigger to be specified, just as you can for
  normal triggers. I also would like an ALL option.
* options: Currently the grammar allows options to be passed to command
  triggers. Do we want to keep that? If yes, with the same arcane set
  of datatypes as in data triggers? If yes, it needs to be wired up.
* schema qualification: An option to add triggers only to a specific
  schema would be rather neat for many scenarios. I am not totally sure
  if it's worth the hassle of defining what happens in the edge cases
  of e.g. moving from one schema into another, though.

== locking ==

Currently there is no locking strategy at all. Which e.g. means that
there is no protection against two sessions changing stuff concurrently
or such. Was that just left out - which is ok - or did you miss that? I
think it would be ok to just always grab row-level locks there.

== permissions ==

Command triggers should only be allowed for the database owner.

== recursion ==

In contrast to data triggers, command triggers cannot change what is
done. They can either prevent it from running or not. Which imo is
fine. The question I have is whether it would be a good idea to make it
easy to prevent recursion, especially if the triggers won't be per
schema.

== calling the trigger ==

Imo the current call site in ProcessUtility isn't that good. I think it
would make more sense moving it into standard_ProcessUtility. It should
be *after* the check_xact_readonly check in there, in my opinion. I
also don't think it's a good idea to run the
ExecBeforeOrInsteadOfCommandTriggers stuff there, because that's also
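To make the argument-list discussion concrete, a trigger function under
this proposal would presumably look something like the following
(hypothetical sketch matching the four arguments listed above; the
patch's actual registration syntax and required return type may
differ):

CREATE FUNCTION log_command(cmd_string text, cmd_nodestring text,
                            schemaname text, objectname text)
RETURNS void LANGUAGE plpgsql AS $$
BEGIN
    -- Just log what fired; a real trigger might audit or veto.
    RAISE NOTICE 'command: % on %.%', cmd_string, schemaname, objectname;
END;
$$;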
Re: [HACKERS] Inlining comparators as a performance optimisation
Attached is a revision of my patch with some clean-ups. In particular,
I'm now using switch statements for greater readability, plus
supporting fast path sorting of the time datatype. I've also updated
the documentation on Date/Time Types to note the additional
disadvantage of using the deprecated "store timestamps (and friends) as
double precision floating-point numbers" compile-time option.

There is one aspect to this optimisation that I haven't touched on,
which is the effect on memory consumption. I think that much of the
value that this patch will deliver will come from being able to release
sort memory earlier. Consider that the substantial improvements in raw
sorting speed (far more substantial than the improvements in query
runtime) will sometimes result in a concomitant reduction in the time
that the executor holds onto memory allocated for sorting. Maybe the
effect will only be really noticeable for plans with a sort node as
their root node, but that isn't exactly a rare occurrence, particularly
among large, expensive sorts.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

From cffad91578edd698e89ef2bf7384d82aee26b57e Mon Sep 17 00:00:00 2001
From: peter2ndQuadrant <pe...@2ndquadrant.com>
Date: Sun, 20 Nov 2011 20:36:25 +0000
Subject: [PATCH] Initial commit of optimization

Stop directly using oid

Added int8 quicksort fast path specialisation, which can also be used
in place of F_TIMESTAMP_CMP for HAVE_INT64_TIMESTAMP builds.

Rebased, revised patch for -hackers, with timestamp and int8 fast path
sorting using the same int8 specialization.

Remove unneeded line

Rebase for -hackers

Discriminate against cases where nKeys > 1 when selecting sort
specialisation. We now support fast path sorting with multiple
scanKeys. We now inline for one scanKey, but do not inline our
comparetup_heap replacement when multiple scanKeys are used (although
this is at the compiler's discretion). However, when multiple scanKeys
are used, we still use a specialisation that will elide the
SQL-function-call machinery for the first scanKey, which is almost as
good for most cases.

Further formatting clean-ups

Move specialization selection logic to tuplesort_begin_heap(), per
suggestion of Robert Haas

Add simple switch for testing optimization.

Improvements to template qsort_arg comments.

Support fast path sorting of dates.

Added float fast path sorting

Patch minimization. Additional clean-up.

Indentation issue.

Specialization clean-ups

Use switch statement for sort specialization selection

Further comment clean-up.

Additional clean-up

Make comparators consistent

Added date fast path support, documentation updates

Fix template_qsort_arg.h assertion

Assert nScanKeys == 1 for inlining case
---
 doc/src/sgml/datatype.sgml             |    6 +
 src/backend/utils/sort/tuplesort.c     |  223 +++++++++-
 src/include/utils/template_qsort_arg.h |  288 ++++++++++++++
 src/port/qsort.c                       |    3 +-
 4 files changed, 512 insertions(+), 8 deletions(-)
 create mode 100644 src/include/utils/template_qsort_arg.h

diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index fe59a1c..f5dfbbb 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -1619,6 +1619,12 @@ SELECT E'\\xDEADBEEF';
      floating-point case, large <type>interval</type> values degrade in
      precision as the size of the interval increases.
     </para>
+    <para>
+     Note that if the server is built to represent these types as
+     floating point numbers, that will prevent it from using fast path
+     sort specializations for the types, and sorting columns of those
+     types will be noticeably slower.
+    </para>
    </note>

    <para>
diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3505236..0a14a90 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -107,11 +107,13 @@
 #include "miscadmin.h"
 #include "pg_trace.h"
 #include "utils/datum.h"
+#include "utils/fmgroids.h"
 #include "utils/logtape.h"
 #include "utils/lsyscache.h"
 #include "utils/memutils.h"
 #include "utils/pg_rusage.h"
 #include "utils/rel.h"
+#include "utils/template_qsort_arg.h"
 #include "utils/tuplesort.h"
 
 
@@ -226,6 +228,13 @@ struct Tuplesortstate
 							Tuplesortstate *state);
 
 	/*
+	 * This specialization function pointer is sometimes used as an
+	 * alternative to the standard qsort_arg, when it has been determined
+	 * that we can benefit from various per-type performance optimizations.
+	 */
+	void (*qsort_arg_spec)(void *a, size_t n, size_t es, void *arg);
+
+	/*
 	 * Function to copy a supplied input tuple into palloc'd space and set up
 	 * its SortTuple representation (ie, set tuple/datum1/isnull1).  Also,
 	 * state->availMem must be decreased by the amount of space used for the
@@ -457,6 +466,11 @@ static void tuplesort_heap_insert(Tuplesortstate *state, SortTuple
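For readers unfamiliar with the technique, the template header works
along these lines: a macro stamps out a complete sort routine whose
comparator is known at compile time, so the compiler can inline it
instead of paying an indirect call per comparison. A simplified,
self-contained illustration (not the patch's actual code; it uses an
insertion sort for brevity where the real template_qsort_arg.h
generates a quicksort):

#include <stddef.h>

#define DECLARE_INLINE_SORT(NAME, TYPE, CMP)				\
static void								\
NAME##_sort(TYPE *a, size_t n)						\
{									\
	/* insertion sort; CMP is inlined at compile time */		\
	for (size_t i = 1; i < n; i++)					\
	{								\
		TYPE v = a[i];						\
		size_t j = i;						\
		while (j > 0 && CMP(v, a[j - 1]) < 0)			\
		{							\
			a[j] = a[j - 1];				\
			j--;						\
		}							\
		a[j] = v;						\
	}								\
}

#define INT4_CMP(x, y) (((x) < (y)) ? -1 : ((x) > (y)) ? 1 : 0)
DECLARE_INLINE_SORT(int4, int, INT4_CMP)

/* usage: int4_sort(array, nelems); */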
Re: [HACKERS] synchronous commit vs. hint bits
On Thu, Dec 1, 2011 at 9:58 AM, Jeff Janes <jeff.ja...@gmail.com> wrote:
> Waiting until the other one completes is how it is currently
> implemented, but is it necessary from a correctness view? It seems
> like the WALWriteLock only needs to protect the write, and not the
> sync (assuming the sync method allows those to be separate actions),
> and there could be multiple fsync requests from different processes
> pending at the same time without a correctness problem.

I've wondered about that, too. At least on Linux, the overhead of a
system call seems to be pretty low - e.g. the ridiculous number of
lseek calls we do on a pgbench -S run doesn't seem to create much
overhead until the inode mutex starts to become contended, and that
problem should be fixed in Linux 3.2. But I'm not sure whether system
calls are similarly cheap on all platforms, or even whether that's true
on Linux for fsync() in particular.

There's another possible approach here, too: instead of waiting to set
hint bits until the commit record hits the disk, we could allow the
hint bits to be set immediately, on the condition that we don't write
the page out until the commit record hits the disk. Bumping the page
LSN would do that, but I think that might be problematic since setting
hint bits isn't WAL-logged. If so, we could possibly fix that by
storing a second LSN for the page out of line, e.g. in the buffer
descriptor. That might be even faster than speeding up the WAL flush.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Add minor version to v3 protocol to allow changes without breaking backwards compatibility
On Thu, Dec 1, 2011 at 8:42 AM, Mikko Tiihonen
<mikko.tiiho...@nitorcreations.com> wrote:
> On 11/30/2011 06:52 PM, Merlin Moncure wrote:
>> On Mon, Nov 28, 2011 at 9:18 AM, Mikko Tiihonen
>> <mikko.tiiho...@nitorcreations.com> wrote:
>>> Hi,
>>> As discussed few days ago, it would be beneficial if we could change
>>> the v3 backend<->client protocol without breaking backwards
>>> compatibility.
>>> Here is a working patch that exports a supported_binary_minor
>>> constant and a binary_minor session variable that can be used by
>>> clients to enable newer features. I also added an example usage
>>> where the array encoding for constant-size elements is optimized if
>>> the binary_minor version is new enough.
>>> I have coded the client side support of binary_minor for the JDBC
>>> driver and tested that it worked. But let's first discuss if this is
>>> an acceptable path forward.
>> Regarding your TODO in the code comments about interactions with
>> replication: I think it should be removed. WAL streaming depends on
>> more things being identical than what is guaranteed here, which is
>> basically the protocol + data formats.
>
> OK. I'll remove the comments about replication.
>
>> I think all references to 'protocol' should be removed; Binary wire
>> formats != protocol: the protocol could bump to v4 but the wire
>> formats (by happenstance) could all still continue to work --
>> therefore the version isn't minor (it's not dependent on protocol
>> version but only on itself). Therefore, don't much like the name
>> 'supported_binary_minor'. How about binary_format_version?
>
> I was thinking that it would be possible to use the new minor version
> to introduce also new protocol messages, such as streaming large data
> into the database without knowing its size beforehand.
>
>> Also, is a non granular approach really buying us anything here? ISTM
>> *something* is likely to change format on most releases of the server
>> so I'm wondering what's the point (if you don't happen to be on the
>> same x.x release of the server, you might as well assume to not match
>> or at least 'go on your own risk'). The value added to the client
>> version query is quite low.
>
> You have a very good point about changes in every new postgres
> version. Either the text or the binary encoding is likely to change
> for some types in each new release.
>
> There needs to be a decision on official policy about breaking
> backwards compatibility of postgresql clients. Is it:
>
> A) Every x.y postgres release is free to break any parts of the
>    * protocol
>    * text encoding
>    * binary encoding
>    as long as it is documented
>    -> all client libraries should be coded so that they refuse to
>       support version x.y+1 or newer (they might have an option to
>       override this, but there are no guarantees that it would work)
>    Good: no random bugs when using an old client library
>    Bad: initial complaints from users before they understand that
>         this is the best option for everyone
>
> B) Every x.y postgres release must guarantee that no client-visible
>    parts of the protocol, text encoding or binary encoding will change
>    from the previous release in the v3 protocol. If any changes are
>    needed then a new protocol version must be created.
>    -> very high barrier for new features
>    Good: can upgrade the server without updating clients
>    Bad: new features are only introduced very rarely, after enough
>         have accumulated
>    Bad: many feature/change patches will rot while waiting for the
>         upcoming new version
>
> C) There is effort to try avoiding incompatible changes. Some changes
>    are blocked because it is detected that they can break backwards
>    compatibility, while others are let through (often with some
>    compatibility option on the server side to fall back to previous
>    functionality, f.ex. bytea hex encoding).
>    -> As far as I understand, this is the current situation.
>    Good: has worked so far
>    Bad: accumulates compatibility flags in the server
>
> D) My proposed compromise, where there is one minor_version setting
>    and code in the server to support all different minor versions.
>    The client requests the minor version, so all releases default to
>    the backwards-compatible version. When the combinations start to
>    create too much maintenance overhead, a new clean v4 version of the
>    protocol is specified.
>    Good: keeps full backwards compatibility
>    Good: allows introducing changes at any time
>    Bad: accumulates conditional code in the server and clients until
>         a new version of the protocol is released
>
> I'd like the official policy to be A; otherwise I'll push for D.

In issue A, you mentioned that client libraries should refuse to
support version x+1 or newer. If by client libraries you mean libpq,
then this is absolutely not going to fly -- libpq has no such
restriction now, and adding it isn't doing users any favors IMO. The
problem is not libpq/jdbc etc., but application code. I'll say again:
wire format compatibility issues are fundamentally different from the
protocol issues; decades of user application
Re: [HACKERS] Inlining comparators as a performance optimisation
On Thu, Dec 1, 2011 at 11:44 AM, Peter Geoghegan <pe...@2ndquadrant.com> wrote:
> Attached is a revision of my patch with some clean-ups. In particular,
> I'm now using switch statements for greater readability, plus
> supporting fast path sorting of the time datatype. I've also updated
> the documentation on Date/Time Types to note the additional
> disadvantage of using the deprecated "store timestamps (and friends)
> as double precision floating-point numbers" compile-time option.

One thing I'm starting to get a bit concerned about is the effect of
this patch on the size of the resulting binary. The performance results
you've posted are getting steadily more impressive as you get into
this, which is cool, but it seems like the number of copies of the
qsort logic that you're proposing to include in the resulting binary is
also steadily climbing. On my system, a stripped postgres binary built
with my usual compile options (except --enable-cassert, which I took
out) is 49336 bytes bigger with this patch applied, an increase of
close to 1%. We might need to be a little bit choosy about this,
because I don't think that we want to end up in a situation where some
noticeable percentage of the final binary consists of
differently-inlined versions of the quicksort algorithm - especially
because there may be other places where we want to do similar kinds of
inlining.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Why so few built-in range types?
On Thu, Dec 1, 2011 at 9:12 AM,  <karave...@mail.bg> wrote:
> I do not think that adding index support to a datatype classifies as a
> semantic change that will break backward compatibility.

Me neither. The ip4r type also supports ranges that aren't on
CIDR-block boundaries, which probably isn't something that makes sense
to incorporate into cidr. But not everyone needs that, and some people
might also need support for IPv6 CIDR blocks, which ip4r doesn't
support. So I don't necessarily see the existence of ip4r as a reason
why cidr shouldn't have better indexing support.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] FlexLocks
On Wed, Nov 30, 2011 at 7:01 PM, Kevin Grittner
<kevin.gritt...@wicourts.gov> wrote:
> Kevin Grittner <kevin.gritt...@wicourts.gov> wrote:
>> OK. There are a few things I found in this pass which I missed in the
>> last. One contrib module was missed, I found another typo in a
>> comment, and I think we can reduce the include files a bit. Rather
>> than describe it, I'm attaching a patch file over the top of your
>> patches with what I think might be a good idea.
> This time with it really attached.

Thanks, I've merged those in.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: [HACKERS] Command Triggers
Hi,

On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
> exception to that rule would be SELECT INTO and CREATE TABLE AS
> SELECT, and my proposal would be to add specific call sites to the
> functions I've provided in the attached patch rather than try to
> contort them into being a good citizen as a utility command.

I would very much prefer to make them utility commands and rip out the
executor support for that. It doesn't look that complicated, and would
allow us to get rid of the partially duplicated DefineRelation and
related stuff from the executor, where it's violating my aesthetic
feelings ;)

Is there something making that especially hard that I overlooked? The
infrastructure for doing so seems to be there.

Andres
Re: [HACKERS] Command Triggers
Andres Freund <and...@anarazel.de> writes:
> On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote:
>> exception to that rule would be SELECT INTO and CREATE TABLE AS
>> SELECT, and my proposal would be to add specific call sites to the
>> functions I've provided in the attached patch rather than try to
>> contort them into being a good citizen as a utility command.
> I would very much prefer to make them utility commands and rip out the
> executor support for that.
> Is there something making that especially hard that I overlooked? The
> infrastructure for doing so seems to be there.

Well, I think the main problem is going to be shunting the query down
the right parsing track (SELECT versus utility statement) early enough.
Making this work cleanly would be a bigger deal than I think you're
thinking. But yeah, it would be nicer if the executor didn't have to
worry about this. +1 if you're willing to take on the work.

			regards, tom lane
[HACKERS] Postgresql 9.1 replication failing
All, I have a large PG 9.1.1 server (over 1TB of data) and a replica using log shipping. I had some hardware issues on the replica system and now I am getting the following in my pg_log/* files, the same 2 lines over and over since yesterday:

2011-12-01 07:46:30 EST LOG: restored log file "0001028E00E5" from archive
2011-12-01 07:46:30 EST LOG: incorrect resource manager data checksum in record at 28E/E555E1B8

Anything I can do on the replica or do I have to start over? Finally, I know this is not the correct list; I tried general with no answer. Thanks, Jim ___ Jim Buttafuoco j...@contacttelecom.com 603-647-7170 ext. - Office 603-490-3409 - Cell jimbuttafuoco - Skype
Re: [HACKERS] Postgresql 9.1 replication failing
On Thu, Dec 1, 2011 at 1:41 PM, Jim Buttafuoco j...@contacttelecom.com wrote: 2011-12-01 07:46:30 EST LOG: restored log file 0001028E00E5 from archive 2011-12-01 07:46:30 EST LOG: incorrect resource manager data checksum in record at 28E/E555E1B8 Anything I can do on the replica or do I have to start over? I think you want to rebuild the standby. Even if you could repair the damaged WAL record, how can you have any confidence that there is no other corruption? Note that rsync has some options to only copy the changed data, which might greatly accelerate resyncing the standby from the master. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql 9.1 replication failing
Jim Buttafuoco j...@contacttelecom.com writes: All, I have a large PG 9.1.1 server (over 1TB of data) and a replica using log shipping. I had some hardware issues on the replica system and now I am getting the following in my pg_log/* files. Same 2 lines over and over since yesterday. 2011-12-01 07:46:30 EST LOG: restored log file 0001028E00E5 from archive 2011-12-01 07:46:30 EST LOG: incorrect resource manager data checksum in record at 28E/E555E1B8 Anything I can do on the replica or do I have to start over? Inspect that WAL segment, or possibly the one immediately following it, in comparison to another copy if you still have it on the master or a central WAL repository. A standby crashing while copying in a WAL segment and/or syncing one to disk could result in random corruption. If you have another copy of the segment and it does not compare equal to the one your standby is trying to read, try the other copy. Finally, I know this is not the correct list, I tried general with no answer. The admin list is the right one for such a post, probably. HTH Thanks Jim ___ Jim Buttafuoco j...@contacttelecom.com 603-647-7170 ext. - Office 603-490-3409 - Cell jimbuttafuoco - Skype -- Jerry Sievers Postgres DBA/Development Consulting e: postgres.consult...@comcast.net p: 305.321.1144 -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql 9.1 replication failing
the WAL file on the master is long gone, how would one inspect the WAL segment? Any way to have PG "move" on? ___ Jim Buttafuoco j...@contacttelecom.com 603-647-7170 ext. - Office 603-490-3409 - Cell jimbuttafuoco - Skype
Re: [HACKERS] Command Triggers
On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote: Andres Freund and...@anarazel.de writes: On Tuesday, November 08, 2011 06:47:13 PM Dimitri Fontaine wrote: exception to that rule would be SELECT INTO and CREATE TABLE AS SELECT, and my proposal would be to add specific call sites to the functions I've provided in the attached patch rather than try to contort them into being a good citizen as a utility command. I would very much prefer to make them utility commands and rip out the executor support for that. Is there something making that especially hard that I overlooked? The infrastructure for doing so seems to be there. Well, I think the main problem is going to be shunting the query down the right parsing track (SELECT versus utility-statement) early enough. Making this work cleanly would be a bigger deal than I think you're thinking. Yes. I forgot how ugly SELECT INTO is... But yeah, it would be nicer if the executor didn't have to worry about this. +1 if you're willing to take on the work. I won't promise anything but I will play around with the grammar and see if I find a nice enough way. Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote: Well, I think the main problem is going to be shunting the query down the right parsing track (SELECT versus utility-statement) early enough. Making this work cleanly would be a bigger deal than I think you're thinking. Obviously that depends on the definition of clean... Changing the grammar to make that explicit seems, at first glance, to involve a few too many changes. The cheap way out seems to be to make the decision in analyze.c:transformQuery. Would that be an acceptable way forward? Andres -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
On Sat, Nov 26, 2011 at 7:20 PM, Dimitri Fontaine dimi...@2ndquadrant.fr wrote: FWIW (scheduling mainly), I don't think I'm going to spend more time on this work until I get some comments, because all that remains to be done is about building on top of what I've already been doing here. +1 overall. One of the main use cases for me is the ability to:

* prevent all commands
* prevent extension commands

to maintain a stable execution environment. Those are especially important because in 9.2 DDL commands will cause additional locking overheads, so preventing DDL will be essential to keeping performance stable in high-txn-rate databases. So I'd like to see a few more triggers that work out of the box for those cases (perhaps wrapped by a function?). It would also allow a more useful man page example of how to use this. On the patch, the header comment for cmdtrigger.c needs updating. Comments for DropCmdTrigger etc. look like you forgot to update them after cut and paste. Comments say "For any given command tag, you can have either Before and After triggers, or Instead Of triggers, not both", which seems like a great thing to put in the docs. Looks good. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql 9.1 replication failing
On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco j...@contacttelecom.comwrote: the WAL file on the master is long gone, how would one inspect the web segment? Any way to have PG move on? Regenerate the master. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services
Re: [HACKERS] Postgresql 9.1 replication failing
On Thu, Dec 1, 2011 at 9:08 PM, Simon Riggs si...@2ndquadrant.com wrote: On Thu, Dec 1, 2011 at 7:09 PM, Jim Buttafuoco j...@contacttelecom.com wrote: the WAL file on the master is long gone, how would one inspect the web segment? Any way to have PG move on? Regenerate the master. typo: regenerate *from* the master -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
[HACKERS] TupleDescInitEntry failing to initialize varlen members
Hi, I just noticed in gdb that TupleDescInitEntry does not initialize attacl, attoptions, attfdwoptions. This is probably not very serious (otherwise we'd have bugs about it), but it is noticeable in tupdescs constructed by the executor, at least ExecTypeFromTL:

(gdb) print *tupDesc->attrs[2]
$6 = {attrelid = 0, attname = {data = "c", '\000' <repeats 62 times>},
  atttypid = 1114, attstattarget = -1, attlen = 8, attnum = 3,
  attphysnum = 3, attlognum = 3, attndims = 0, attcacheoff = -1,
  atttypmod = -1, attbyval = 1 '\001', attstorage = 112 'p',
  attalign = 100 'd', attnotnull = 0 '\000', atthasdef = 0 '\000',
  attisdropped = 0 '\000', attislocal = 1 '\001', attinhcount = 0,
  attcollation = 0, attacl = {2139062142},
  attoptions = {{vl_len_ = "\177\177\177\177", vl_dat = "\177"}},
  attfdwoptions = {{vl_len_ = "\177\177\177\177", vl_dat = "\177"}}}

Does anybody think this is worth fixing? -- Álvaro Herrera alvhe...@alvh.no-ip.org -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql 9.1 replication failing
Simon, what do you mean, start over with a base backup? Jim ___ Jim Buttafuoco j...@contacttelecom.com 603-647-7170 ext. - Office 603-490-3409 - Cell jimbuttafuoco - Skype
Re: [HACKERS] patch for type privileges
On tor, 2011-12-01 at 14:37 +0100, Yeb Havinga wrote: On 2011-11-29 18:47, Peter Eisentraut wrote: On tis, 2011-11-29 at 07:07 +0200, Peter Eisentraut wrote: On mån, 2011-11-28 at 11:41 +0100, Yeb Havinga wrote: On 2011-11-15 21:50, Peter Eisentraut wrote: Patch attached. I cannot get the patch to apply, this is the output of patch -p1 --dry-run on HEAD. I need to remerge it against concurrent range type activity. New patch attached. I'm looking at your patch. One thing that puzzled me for a while was that I could not restrict access to base types (either built-in or user defined). Is this intentional? Works for me:

=# create user foo;
=# revoke usage on type int8 from public;
=# \c - foo
=> create table test1 (a int4, b int8);
ERROR:  permission denied for type bigint

-- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] patch for type privileges
On mån, 2011-11-28 at 14:25 -0600, Merlin Moncure wrote: On Tue, Nov 15, 2011 at 2:23 PM, Peter Eisentraut pete...@gmx.net wrote: The basics here are mainly informed by the SQL standard. One thing from there I did not implement is checking for permission of a type used in CAST (foo AS type). This would be doable but relatively complicated, and in practice someone who is not supposed to be able to use the type wouldn't be able to create the cast or the underlying cast function anyway for lack of access to the type. I'm not quite following that: with your patch are you or are you not prohibited from utilizing casts? In other words, if you didn't have USAGE priv, what would happen if you tried this: CREATE VIEW v AS SELECT null::restricted_type::text; ? This is not affected by my patch, so it would do whatever it did before. -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] TupleDescInitEntry failing to initialize varlen members
Alvaro Herrera alvhe...@alvh.no-ip.org writes: I just noticed in gdb that TupleDescInitEntry does not initialize attacl, attoptions, attfdwoptions. Indeed not, because it's not building a tuple. It's building an array of C structs, and there's nothing useful to do with those fields. (Robert's proposal to not even have such fields be C-visible might make you feel better.) regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] to_date() marked stable?
Are there people using to_date in indexes or partition functions where changing it to immutable would be useful? I do, but I also create a shell function which sets timezone etc. locally so that to_date is *really* immutable. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Postgresql 9.1 replication failing
Hello Jim, I do not think you have any other option: if the archives are corrupted and there is no way to restore them, you need to recreate the standby starting from a base backup. Kind Regards 2011/12/1 Jim Buttafuoco j...@contacttelecom.com Simon, what do you mean, start over with a base backup? Jim
Re: [HACKERS] backup_label during crash recovery: do we know how to solve it?
On Tue, Nov 29, 2011 at 9:10 PM, Daniel Farina dan...@heroku.com wrote: Reviving a thread that has hit its second birthday: http://archives.postgresql.org/pgsql-hackers/2009-11/msg00024.php In our case not being able to restart Postgres when it has been taken down in the middle of a base backup is starting to manifest as a serious source of downtime: basically, any backend crash or machine restart will cause postgres not to start without human intervention. The message delivered is sufficiently scary and indirect (because of the plausible scenarios that could cause corruption if postgres were to make a decision automatically in the most general case) that it's not all that attractive to train a general operator rotation to assess what to do, as it involves reading and then, effectively, ignoring some awfully scary error messages and removing the backup label file. Even if the error messages weren't scary (itself a problem if one comes to the wrong conclusion as a result), the time spent digging around on short notice to confirm what's going on is a high pole in the tent for improving uptime for us, taking an extra five to ten minutes per common encounter. Our problem is compounded by having a lot of databases that take base backups at attenuated rates in an unattended way, and therefore a human who may have been woken up from a sound sleep will have to figure out what was going on before they've reached consciousness, rather than a person with prior knowledge of having started a backup. Also, fairly unremarkable databases can take so long to back up that they may well have a greater than 20% chance of encountering this problem at any particular time: 20% of a day is less than 5 hours per day taken to do on-line backups. Basically, we -- and anyone else with unattended physical backup schemes -- are punished rather severely by the current design. This issue has some more recent related incarnations, even if for different reasons: http://archives.postgresql.org/pgsql-hackers/2011-01/msg00764.php Because "backup_label coming or going?" confusion in Postgres can have serious consequences, I wanted to post to the list first to solicit a minimal design to solve this problem. If it's fairly small in its mechanics then it may yet be feasible for the January CF. In some ways, I feel like this problem is unsolvable by definition. If a backup is designed to be an exact copy of the data directory taken between pg_start_backup() and pg_stop_backup(), then by definition you can't distinguish between the original and the copy. That's what a copy *is*. Now, we could fix this by requiring an additional step when creating a backup. For example, we could create backup_label.not_really on the master and require the person taking the backup to rename it to backup_label on the slave before starting the postmaster. This could be an optional behavior, to preserve backward compatibility. Now the slave *isn't* an exact copy of the master, so PostgreSQL can distinguish. But it seems that this could also be worked around outside the database. We don't have built-in clusterware, so there must be something in the external environment that knows which server is supposed to be the master and which is supposed to be the standby. So, if you're on the master, remove the backup label file before starting the postmaster. If you're on the standby, don't. 
-- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Why so few built-in range types?
* Robert Haas (robertmh...@gmail.com) wrote: Me neither. The ip4r type also supports ranges that aren't on CIDR-block boundaries, which probably isn't something that makes sense to incorporate into cidr. But not everyone needs that, and some people might also need support for ipv6 CIDR blocks, which ip4r doesn't support. So I don't necessarily see the existence of ip4r as a reason why cidr shouldn't have better indexing support. Seems I wasn't clear. The semantic changes were why ip4r was *created* (instead of just using cidr..). The fact that it's got index support is independent from that (though, in my view, shows that people who actually care about this data type use ip4r and don't use cidr, or we'd hear much more complaining..). I don't have any particular care about if cidr has indexing support or not. I'm certainly not *against* it, except insofar as it encourages use of a data type that really could probably be better (by being more like ip4r..). Thanks, Stephen
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 1, 2011 at 11:44 AM, Peter Geoghegan pe...@2ndquadrant.com wrote: Attached is revision of my patch with some clean-ups. One thing I'm starting to get a bit concerned about is the effect of this patch on the size of the resulting binary. Regardless of that, I'm still of the opinion that this patch is fundamentally misdesigned. The way it's set up, it is only possible to have a fast path for types that are hard-wired into tuplesort.c, and that flies in the face of twenty years' worth of effort to make Postgres an extensible system. I really don't care about performance measurements: this is not an acceptable design, period. What I want to see is some mechanism that pushes this out to the individual datatypes, so that both core and plug-in datatypes have access to the benefits. There are a number of ways that could be attacked, but the most obvious one is to invent additional, optional support function(s) for btree opclasses. I also still think that we should pursue the idea of a lower-overhead API for comparison functions. It may be that it's worth the code bulk to have an inlined copy of qsort for a small number of high-usage datatypes, but I think there are going to be a lot for which we aren't going to want to pay that price, and yet we could get some benefit from cutting the call overhead. A lower-overhead API would also save cycles in usage of the comparison functions from btree itself (remember that?). I think in particular that the proposed approach to multiple sort keys is bogus and should be replaced by just using lower-overhead comparators. The gain per added code byte in doing it as proposed has got to be too low to be reasonable. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On 1 December 2011 17:15, Robert Haas robertmh...@gmail.com wrote: One thing I'm starting to get a bit concerned about is the effect of this patch on the size of the resulting binary. The performance results you've posted are getting steadily more impressive as you get into this, which is cool, but it seems like the number of copies of the qsort logic that you're proposing to include in the resulting binary is also steadily climbing. On my system, a stripped postgres binary built with my usual compile options (except --enable-cassert, which I took out) is 49336 bytes bigger with this patch applied, an increase of close to 1%. Do your usual compile options include debug symbols? I've been using standard compile options for development of this patch, for obvious reasons. I get 36690 bytes (just under 36 KiB, or a 0.644% increase). Binary bloat is a legitimate concern. I thought that I was conservative in my choice of specialisations though. We might need to be a little bit choosy about this, because I don't think that we want to end up with a situation where some noticeable percentage of the final binary consists of differently-inlined versions of the quicksort algorithm - especially because there may be other places where we want to do similar kinds of inlining. Thoughts? A simple caveat in a comment along the lines of "this mechanism instantiates 2 copies of qsort_arg per type, please use judiciously" sounds like the right balance. It could also be possible to be penny wise and pound foolish here though. -- Peter Geoghegan http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training and Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
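To illustrate the trade-off being discussed, here is the shape of the comparator-inlining trick in a self-contained form. This is not code from the patch: an insertion sort stands in for the real qsort_arg template to keep the sketch short, and all names are invented. The point is that each use of the macro stamps one more full copy of the sort loop into the binary, which is exactly where the size increase comes from.

    #include <stdint.h>

    typedef uintptr_t Datum;    /* stand-in for the real Datum typedef */

    /*
     * Stamp out a sort routine specialized to one comparator, so the
     * comparison expression is inlined into the sort loop itself
     * instead of being reached through a function pointer.
     */
    #define MAKE_INLINE_SORT(NAME, CMP)                     \
    static void                                             \
    NAME(Datum *a, int n)                                   \
    {                                                       \
        for (int i = 1; i < n; i++)                         \
        {                                                   \
            Datum   key = a[i];                             \
            int     j = i - 1;                              \
                                                            \
            while (j >= 0 && CMP(a[j], key) > 0)            \
            {                                               \
                a[j + 1] = a[j];                            \
                j--;                                        \
            }                                               \
            a[j + 1] = key;                                 \
        }                                                   \
    }

    #define INT4_CMP(x, y) \
        (((int32_t) (x) < (int32_t) (y)) ? -1 : \
         ((int32_t) (x) > (int32_t) (y)) ? 1 : 0)

    /* each such line adds another specialized copy to the binary */
    MAKE_INLINE_SORT(sort_int4_datums, INT4_CMP)

The compiler can fold INT4_CMP directly into the loop, eliminating both the call overhead and the FunctionCallInfo setup, at the cost of one extra copy of the sort per comparator.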
Re: [HACKERS] Inlining comparators as a performance optimisation
On Thu, Dec 1, 2011 at 8:29 PM, Peter Geoghegan pe...@2ndquadrant.com wrote: Do your usual compile options include debug symbols? I've been using standard compile options for development of this patch, for obvious reasons. I get 36690 bytes (just under 36 KiB, or a 0.644% increase). They do, but I ran strip on the resulting binary before taking those measurements, so I thought that would undo it... -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Command Triggers
Andres Freund and...@anarazel.de writes: On Thursday, December 01, 2011 07:21:25 PM Tom Lane wrote: Making this work cleanly would be a bigger deal than I think you're thinking. Obviously that depends on the definition of clean... Changing the grammar to make that explicit seems to involve a bit too many changes on a first glance. The cheap way out seems to be to make the decision in analyze.c:transformQuery. Would that be an acceptable way forward? ITYM transformStmt, but yeah, somewhere around there is probably reasonable. The way I'm imagining this would work is that IntoClause disappears from Query altogether: analyze.c would build a utility statement CreateTableAs, pull the IntoClause out of the SelectStmt structure and put it into the utility statement, and then proceed much as we do for ExplainStmt (and for the same reasons). regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
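To visualize the suggestion, a sketch of how analyze.c might divert SELECT ... INTO onto the utility track. The CreateTableAsStmt node and its fields are invented here for illustration (no such node exists at this point); only Query, CMD_UTILITY, and makeNode follow the real parse-node conventions, and the pattern mirrors what is done for ExplainStmt:

    /*
     * Hypothetical handling in analyze.c: pull the IntoClause out of
     * the SelectStmt and wrap the now-plain SELECT in a CMD_UTILITY
     * Query, as described above.  CreateTableAsStmt is an invented
     * node for this sketch.
     */
    static Query *
    transform_select_into(ParseState *pstate, SelectStmt *stmt)
    {
        CreateTableAsStmt *ctas = makeNode(CreateTableAsStmt);
        Query      *result;

        ctas->into = stmt->intoClause;  /* pull the IntoClause out... */
        stmt->intoClause = NULL;        /* ...leaving a plain SELECT */
        ctas->query = (Node *) stmt;

        result = makeNode(Query);
        result->commandType = CMD_UTILITY;
        result->utilityStmt = (Node *) ctas;
        return result;
    }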
Re: [HACKERS] Inlining comparators as a performance optimisation
On Thu, Dec 1, 2011 at 8:13 PM, Tom Lane t...@sss.pgh.pa.us wrote: What I want to see is some mechanism that pushes this out to the individual datatypes, so that both core and plug-in datatypes have access to the benefits. There are a number of ways that could be attacked, but the most obvious one is to invent additional, optional support function(s) for btree opclasses. I feel like what's missing here is a way for the catalogs to refer to a function that doesn't take FunctionCallInfo as an argument. But it strikes me that it wouldn't be very hard to design such a mechanism. It seems like all we really need to do is decide on a nomenclature. The simplest possible approach would probably be to just refer to functions by name. In other words, we add a text or name column to pg_amproc, and then inside the backend there's a function whose job it is to map the string to a function address. However, that would have the annoying effect of causing catalog bloat, so I'm inclined to instead propose an 8-byte integer. (A 4-byte integer is enough, but would increase the risk of collisions between values that might be chosen by third party extensions.) So, the API to this would look something like this: void register_arbitrary_function_pointer(int64 handle, void *funcptr); void *get_arbitrary_function_pointer(int64 handle); So the idea is that if you need to refer to an arbitrary function in a system catalog (such as pg_amproc), you generate a random non-zero 64-bit integer, stuff it into the catalog, and then register the same 64-bit integer with the appropriate function pointer. To avoid increasing backend startup time, we could have a Perl script generate a prefab hash table for the built-in functions; loadable modules would add their entries at PG_init() time using register_arbitrary_function_pointer. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
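A minimal, self-contained sketch of what such a registry could look like. The two entry points follow the names proposed above; everything else (the fixed-size table, linear probing, treating zero as an empty slot, which matches the rule that handles must be non-zero) is illustrative only, and a real implementation would presumably use dynahash and cope with a full table:

    #include <stdint.h>
    #include <stddef.h>

    #define REG_SLOTS 1024      /* power of two; no growth handling in this sketch */

    typedef struct RegEntry
    {
        int64_t     handle;     /* 0 marks an empty slot; handles must be non-zero */
        void       *fn;
    } RegEntry;

    static RegEntry reg_table[REG_SLOTS];

    void
    register_arbitrary_function_pointer(int64_t handle, void *funcptr)
    {
        size_t      i = (size_t) handle & (REG_SLOTS - 1);

        /* linear probing; assumes the table never fills up */
        while (reg_table[i].handle != 0 && reg_table[i].handle != handle)
            i = (i + 1) & (REG_SLOTS - 1);
        reg_table[i].handle = handle;
        reg_table[i].fn = funcptr;
    }

    void *
    get_arbitrary_function_pointer(int64_t handle)
    {
        size_t      i = (size_t) handle & (REG_SLOTS - 1);

        while (reg_table[i].handle != 0)
        {
            if (reg_table[i].handle == handle)
                return reg_table[i].fn;
            i = (i + 1) & (REG_SLOTS - 1);
        }
        return NULL;            /* unknown handle */
    }

A loadable module would call register_arbitrary_function_pointer at library load time, with built-in entries coming from the prefab table mentioned above.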
Re: [HACKERS] Why so few built-in range types?
On Thu, Dec 1, 2011 at 7:56 PM, Stephen Frost sfr...@snowman.net wrote: I don't have any particular care about if cidr has indexing support or not. I'm certainly not *against* it, except insofar as it encourages use of a data type that really could probably be better (by being more like ip4r..). Not that you're biased or anything! :-p -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 1, 2011 at 8:13 PM, Tom Lane t...@sss.pgh.pa.us wrote: What I want to see is some mechanism that pushes this out to the individual datatypes, so that both core and plug-in datatypes have access to the benefits. There are a number of ways that could be attacked, but the most obvious one is to invent additional, optional support function(s) for btree opclasses. I feel like what's missing here is a way for the catalogs to refer to a function that doesn't take FunctionCallInfo as an argument. But it strikes me that it wouldn't be very hard to design such a mechanism. IMO, entries in pg_proc ought to refer to functions that are callable by the standard interface: breaking that basic contract is not going to lead anywhere pleasant. Nor do I really want yet more columns in pg_proc. Nor does the "register a pointer" scheme you suggest make any sense to me: you still need to connect these things to catalog entries, pg_opclass entries in particular, and the intermediate handle adds nothing to the situation except for creating a risk of collisions. The scheme that was rolling around in my mind was about like this:

* Define btree opclasses to have an optional support function, amprocnum 2, that has a SQL signature of func(internal) returns void.

* The actual argument represented by the internal parameter would be a pointer to a struct which is to be filled in by the support function. We'd call the support function once, during tuplesort setup or btree index relcache entry setup, and save the struct somewhere.

* The struct contents would be pointers to functions that have a non-FunctionCallInfo interface. We know we need at least two: a full qsort replacement, and a non-FCI comparator. We might want more in future, if someone convinces us that additional specializations of sorting are worth the trouble.

So I imagine a struct like this:

typedef struct SortSupportFunctions
{
    void    (*inline_qsort) (Datum *elements, int nelements);
    int     (*comparator) (Datum a, Datum b);
} SortSupportFunctions;

with the agreement that the caller must zero out the struct before call, and then the btree support function sets the function pointers for any specializations it's going to offer. If we later add a third or fourth function pointer, datatypes that know about that could fill in those pointers, but datatypes that haven't been updated aren't broken. One thing I'm not too certain about is whether to define the APIs just as above, or to support a passthrough argument of some sort (and if so, what does it reference)? Possibly any datatype that we'd actually care about this for is going to be simple enough to not need any state data. Or possibly not. And what about collations? regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
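For instance, a hypothetical amprocnum-2 support function for int4 under this scheme might look about like the following. The fmgr macros (PG_FUNCTION_ARGS, PG_GETARG_POINTER, DatumGetInt32, PG_RETURN_VOID) and the Datum/int32 typedefs are the standard ones from postgres.h; the function names themselves are invented:

    /* bare-metal comparator: no FunctionCallInfo involved */
    static int
    btint4fastcmp(Datum a, Datum b)
    {
        int32       aa = DatumGetInt32(a);
        int32       bb = DatumGetInt32(b);

        return (aa < bb) ? -1 : (aa > bb) ? 1 : 0;
    }

    /* the SQL-callable func(internal) returns void support function */
    Datum
    btint4sortsupport(PG_FUNCTION_ARGS)
    {
        SortSupportFunctions *ssup = (SortSupportFunctions *) PG_GETARG_POINTER(0);

        /* caller zeroed the struct; fill in only what we can offer */
        ssup->comparator = btint4fastcmp;
        /* inline_qsort left NULL: no specialized qsort in this sketch */

        PG_RETURN_VOID();
    }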
Re: [HACKERS] Why so few built-in range types?
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 1, 2011 at 7:56 PM, Stephen Frost sfr...@snowman.net wrote: I don't have any particular care about if cidr has indexing support or not. I'm certainly not *against* it, except insofar as it encourages use of a data type that really could probably be better (by being more like ip4r..). Not that you're biased or anything! :-p IIRC, a lot of the basic behavior of the inet/cidr types was designed by Paul Vixie (though he's not to blame for their I/O presentation). So I'm inclined to doubt that they're as broken as Stephen claims. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Thu, Dec 1, 2011 at 9:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: IMO, entries in pg_proc ought to refer to functions that are callable by the standard interface: breaking that basic contract is not going to lead anywhere pleasant. Nor do I really want yet more columns in pg_proc. I wasn't proposing to create pg_proc entries for this. Nor does the "register a pointer" scheme you suggest make any sense to me: you still need to connect these things to catalog entries, pg_opclass entries in particular, and the intermediate handle adds nothing to the situation except for creating a risk of collisions. I think you might be misinterpreting what I had in mind. Right now, pg_amproc entries have an amproc column that points to a pg_proc entry that in turn points to a function that takes FunctionCallInfoData as an argument. What I'm proposing to do is add an additional column to that catalog that points more or less directly to a non-SQL-callable function, but it can't actually just be the address of the function because that's not stable. So what I'm proposing is that we interpose the thinnest possible shim layer between the catalog and a function pointer, and an int64 -> function pointer mapping seemed to me like something that would fit the bill. The scheme that was rolling around in my mind was about like this: * Define btree opclasses to have an optional support function, amprocnum 2, that has a SQL signature of func(internal) returns void. * The actual argument represented by the internal parameter would be a pointer to a struct which is to be filled in by the support function. We'd call the support function once, during tuplesort setup or btree index relcache entry setup, and save the struct somewhere. * The struct contents would be pointers to functions that have a non-FunctionCallInfo interface. We know we need at least two: a full qsort replacement, and a non-FCI comparator. We might want more in future, if someone convinces us that additional specializations of sorting are worth the trouble. So I imagine a struct like this: typedef struct SortSupportFunctions { void (*inline_qsort) (Datum *elements, int nelements); int (*comparator) (Datum a, Datum b); } SortSupportFunctions; with the agreement that the caller must zero out the struct before call, and then the btree support function sets the function pointers for any specializations it's going to offer. If we later add a third or fourth function pointer, datatypes that know about that could fill in those pointers, but datatypes that haven't been updated aren't broken. One thing I'm not too certain about is whether to define the APIs just as above, or to support a passthrough argument of some sort (and if so, what does it reference)? Possibly any datatype that we'd actually care about this for is going to be simple enough to not need any state data. Or possibly not. And what about collations? Maybe there should be a comparator_setup function that gets the collation OID and returns void *, and then that void * value gets passed as a third argument to each call to the comparator function. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 1, 2011 at 9:46 PM, Tom Lane t...@sss.pgh.pa.us wrote: Nor does the "register a pointer" scheme you suggest make any sense to me: you still need to connect these things to catalog entries, pg_opclass entries in particular, and the intermediate handle adds nothing to the situation except for creating a risk of collisions. I think you might be misinterpreting what I had in mind. Right now, pg_amproc entries have an amproc column that points to a pg_proc entry that in turn points to a function that takes FunctionCallInfoData as an argument. What I'm proposing to do is add an additional column to that catalog that points more or less directly to a non-SQL-callable function, but it can't actually just be the address of the function because that's not stable. So what I'm proposing is that we interpose the thinnest possible shim layer between the catalog and a function pointer, and an int64 -> function pointer mapping seemed to me like something that would fit the bill. But you still need a lot of mechanism to do that mapping, including an initialization function that has noplace to be executed (unless every .so that defines a btree opclass has to be listed in preload_libraries), so it doesn't seem like the thinnest possible shim to me. One thing I'm not too certain about is whether to define the APIs just as above, or to support a passthrough argument of some sort (and if so, what does it reference)? Possibly any datatype that we'd actually care about this for is going to be simple enough to not need any state data. Or possibly not. And what about collations? Maybe there should be a comparator_setup function that gets the collation OID and returns void *, and then that void * value gets passed as a third argument to each call to the comparator function. Maybe. Or perhaps we could merge that work into the function-pointer-setup function --- that is, give it the collation and some extra workspace to fool with. We would always know the collation at the time we're doing that setup. At this point the struct filled in by the setup function is starting to feel a lot like FmgrInfo, and history teaches us a lot about what needs to be in there. So maybe

typedef struct SortSupportInfoData *SortSupportInfo;

typedef struct SortSupportInfoData
{
    MemoryContext info_cxt;     /* where to allocate subsidiary data */
    void       *extra;          /* any data needed by functions */
    Oid         collation;      /* provided by caller */
    void        (*inline_qsort) (Datum *elements, int nelements,
                                 SortSupportInfo info);
    int         (*comparator) (Datum a, Datum b, SortSupportInfo info);
    /* possibly other function pointers in future */
} SortSupportInfoData;

I am thinking that the btree code, at least, would want to just unconditionally do colsortinfo->comparator(datum1, datum2, colsortinfo) so for an opclass that fails to supply the low-overhead comparator, it would insert into the comparator pointer a shim function that calls the opclass' old-style FCI-using comparator. (Anybody who complains about the added overhead would be told to get busy and supply a low-overhead comparator for their datatype...) But to do that, we have to have enough infrastructure here to cover all cases, so omitting collation or not having a place to stash an FmgrInfo won't do. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
On Thu, Dec 1, 2011 at 10:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: But you still need a lot of mechanism to do that mapping, including an initialization function that has noplace to be executed (unless every .so that defines a btree opclass has to be listed in preload_libraries), so it doesn't seem like the thinnest possible shim to me. PG_init? One thing I'm not too certain about is whether to define the APIs just as above, or to support a passthrough argument of some sort (and if so, what does it reference)? Possibly any datatype that we'd actually care about this for is going to be simple enough to not need any state data. Or possibly not. And what about collations? Maybe there should be a comparator_setup function that gets the collation OID and returns void *, and then that void * value gets passed as a third argument to each call to the comparator function. Maybe. Or perhaps we could merge that work into the function-pointer-setup function --- that is, give it the collation and some extra workspace to fool with. We would always know the collation at the time we're doing that setup. Ah! That seems quite nice. At this point the struct filled in by the setup function is starting to feel a lot like FmgrInfo, and history teaches us a lot about what needs to be in there. So maybe typedef struct SortSupportInfoData *SortSupportInfo; typedef struct SortSupportInfoData { MemoryContext info_cxt; /* where to allocate subsidiary data */ void *extra; /* any data needed by functions */ Oid collation; /* provided by caller */ void (*inline_qsort) (Datum *elements, int nelements, SortSupportInfo info); int (*comparator) (Datum a, Datum b, SortSupportInfo info); /* possibly other function pointers in future */ } SortSupportInfoData; I am thinking that the btree code, at least, would want to just unconditionally do colsortinfo->comparator(datum1, datum2, colsortinfo) so for an opclass that fails to supply the low-overhead comparator, it would insert into the comparator pointer a shim function that calls the opclass' old-style FCI-using comparator. (Anybody who complains about the added overhead would be told to get busy and supply a low-overhead comparator for their datatype...) But to do that, we have to have enough infrastructure here to cover all cases, so omitting collation or not having a place to stash an FmgrInfo won't do. I'm slightly worried about whether that'll be adding too much overhead to the case where there is no non-FCI comparator. But it may be no worse than what we're doing now. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Re: [HACKERS] Inlining comparators as a performance optimisation
Robert Haas robertmh...@gmail.com writes: On Thu, Dec 1, 2011 at 10:48 PM, Tom Lane t...@sss.pgh.pa.us wrote: I am thinking that the btree code, at least, would want to just unconditionally do colsortinfo-comparator(datum1, datum2, colsortinfo) so for an opclass that fails to supply the low-overhead comparator, it would insert into the comparator pointer a shim function that calls the opclass' old-style FCI-using comparator. (Anybody who complains about the added overhead would be told to get busy and supply a low-overhead comparator for their datatype...) But to do that, we have to have enough infrastructure here to cover all cases, so omitting collation or not having a place to stash an FmgrInfo won't do. I'm slightly worried about whether that'll be adding too much overhead to the case where there is no non-FCI comparator. But it may be no worse than what we're doing now. It should be the same or better. Right now, we are going through FunctionCall2Coll to reach the FCI-style comparator. The shim function would be more or less equivalent to that, and since it's quite special purpose I would hope we could shave a cycle or two. For instance, we could probably afford to set up a dedicated FunctionCallInfo struct associated with the SortSupportInfo struct, and not have to reinitialize one each time. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
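A sketch of what that shim could look like, assuming the SortSupportInfoData layout proposed upthread and the 9.1-era fmgr conventions; both function names are invented. The FunctionCallInfoData is allocated and initialized once, at setup time, so each comparison only stores the two arguments and jumps through the fmgr pointer:

    /*
     * Hypothetical shim installed when an opclass offers only the
     * old-style FCI comparator.  The FunctionCallInfoData lives in
     * info->extra, set up once in info->info_cxt.
     */
    static int
    shim_comparator(Datum a, Datum b, SortSupportInfo info)
    {
        FunctionCallInfo fcinfo = (FunctionCallInfo) info->extra;
        Datum       result;

        fcinfo->arg[0] = a;
        fcinfo->arg[1] = b;
        fcinfo->argnull[0] = false;
        fcinfo->argnull[1] = false;
        fcinfo->isnull = false;
        result = FunctionCallInvoke(fcinfo);
        /* btree comparison functions are strict, so a null result is a bug */
        if (fcinfo->isnull)
            elog(ERROR, "comparison function returned NULL");
        return DatumGetInt32(result);
    }

    static void
    install_comparator_shim(SortSupportInfo info, FmgrInfo *flinfo)
    {
        FunctionCallInfo fcinfo = (FunctionCallInfo)
            MemoryContextAlloc(info->info_cxt, sizeof(FunctionCallInfoData));

        /* one-time setup; per-call work is just the argument stores above */
        InitFunctionCallInfoData(*fcinfo, flinfo, 2, info->collation,
                                 NULL, NULL);
        info->extra = (void *) fcinfo;
        info->comparator = shim_comparator;
    }

This is essentially what FunctionCall2Coll does on every call, minus the per-call InitFunctionCallInfoData overhead, which is where the "same or better" expectation comes from.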
Re: [HACKERS] Accounting for toast in query planner. (gin/gist indexes).
jes...@krogh.cc writes: Secondly I could bump the default cost of ts_match_vq/ts_match_qv a bit up, since the cost of doing that computation is probably not as cheap as an ordinary boolean function. Actually, you could try bumping their costs up by more than a bit. It's a tad unfair to blame the toast access costs on the functions, but it might get close enough to what you need. If memory serves, we don't charge those costs against index scans. regards, tom lane -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers