Re: [HACKERS] WIP: Join push-down for foreign tables

2011-09-21 Thread Shigeru Hanada
Hi Michael,

(2011/09/21 12:52), Michael Paquier wrote:
 I am interested in the development you are doing regarding join push down
 and fdw stuff for remote postgreSQL servers.
 Is there a way to get the postgres fdw you are providing here for common
 9.1?
 I saw that the tar you are providing here is adapted only for your patch.

As you say, the pgsql_fdw I posted requires my join-push-down patch.
But, at least in the current revision, using PG_VERSION_NUM would make it
compilable for both 9.1 and 9.2.  I'm not sure, though, that the changes
required during the 9.2 development cycle are small enough for this
workaround.

Anyway, I'm going to publish a recent pgsql_fdw for 9.1 on PGXN or
elsewhere, though it will need some (hopefully little) time.

Regards,
-- 
Shigeru Hanada

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Online base backup from the hot-standby

2011-09-21 Thread Fujii Masao
On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Sep 21, 2011 at 04:50, Fujii Masao masao.fu...@gmail.com wrote:
 3. Copy the pg_control file from the cluster directory on the standby to
    the backup as follows:

    cp $PGDATA/global/pg_control /mnt/server/backupdir/global

 But this is done as part of step 2 already. I assume what this really
 means is that the pg_control file must be the last file backed up?

Yes.

When we perform an archive recovery from a backup taken during
normal processing, we get the backup end location from the backup-end
WAL record which was written by pg_stop_backup(). But since no WAL
writing is allowed during recovery, pg_stop_backup() on the standby
cannot write a backup-end WAL record. So, in his patch, instead of
a backup-end WAL record, the startup process uses the minimum
recovery point recorded in pg_control which has been included in the
backup, as the backup end location. BTW, the backup end location is
used to check whether recovery has reached a consistent state
(i.e., end-of-backup).

To use the minimum recovery point in pg_control as the backup end
location safely, pg_control must be backed up last. Otherwise, a data
page with a newer LSN than the minimum recovery point
might be included in the backup.

 (Since there are certainly a lot other ways to do the backup than just
 cp to a mounted directory..)

Yes. The above command I described is just an example.

 4. Execute pg_stop_backup on the standby.

 The backup taken by the above procedure is available for an archive
 recovery or standby server.

 If the standby is promoted during a backup, pg_stop_backup() detects
 the change of the server status and fails. The data backed up before the
 promotion is invalid and not available for recovery.

 Taking a backup from the standby by using pg_basebackup is still not
 possible. But we can relax that restriction after applying this patch.

 I think that this is going to be very important, particularly given
 the requirements on pt 3 above. (But yes, it certainly doesn't have to
 be done as part of this patch, but it really should be the plan to
 have this included in the same version)

Agreed.

 To take a base backup during recovery safely, some sort of parameters
 must be set properly. Hot standby must be enabled on the standby, i.e.,
 wal_level and hot_standby must be enabled on the master and the standby,
 respectively. FPW (full page writes) is required for a base backup,
 so full_page_writes must be enabled on the master.

 Presumably pg_start_backup() will check this. And we'll somehow track
 this before pg_stop_backup() as well? (for such evil things as
 the user changing FPW from on to off and then back to on again during
 a backup, which will make it look correct both during start and stop,
 but incorrect in the middle - pg_stop_backup needs to fail in that
 case as well)

Right. As I suggested upthread, to address that problem, we need to log
the change of FPW on the master, and then we need to check whether
such a WAL record has been replayed on the standby during the backup.
If so, pg_stop_backup() should emit an error.

Regards,

-- 
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Heikki Linnakangas

On 21.09.2011 02:53, Peter Geoghegan wrote:

C stdlib quick-sort time elapsed: 2.092451 seconds
Inline quick-sort time elapsed: 1.587651 seconds

Does *that* look attractive to you?


Not really, to be honest. That's a 25% speedup in pure qsorting speed. 
How much of a gain in a real query do you expect to get from that, in 
the best case? There's so many other sources of overhead that I'm afraid 
this will be lost in the noise. If you find a query that spends, say, 
50% of its time in qsort(), you will only get a 12.5% speedup on that 
query. And even 50% is really pushing it - I challenge you to find a 
query that spends any significant amount of time qsorting integers.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Simon Riggs
On Tue, Sep 20, 2011 at 3:51 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 This performance patch differs from most in that it's difficult in
 principle to imagine a performance regression occurring.

 Really?  N copies of the same code could lead to performance loss just
 due to code bloat (ie, less of a query's inner loops fitting in CPU
 cache).  Not to mention the clear regression in maintainability.  So
 I'm disinclined to consider this sort of change without a significantly
 bigger win than you're suggesting above (no, I don't even consider the
 -O0 number attractive, let alone what you're finding at -O2).

More copies of the code are somewhat annoying, but it's only 100 lines
of code in one module, and we can easily have specific tests for each.
The extra code size is minor in comparison to the reams of code we add
elsewhere.

It's a surprisingly good win for such a common use case. Well done, Peter.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Simon Riggs
On Wed, Sep 21, 2011 at 7:51 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 21.09.2011 02:53, Peter Geoghegan wrote:

 C stdlib quick-sort time elapsed: 2.092451 seconds
 Inline quick-sort time elapsed: 1.587651 seconds

 Does *that* look attractive to you?

 Not really, to be honest. That's a 25% speedup in pure qsorting speed. How
 much of a gain in a real query do you expect to get from that, in the best
 case? There's so many other sources of overhead that I'm afraid this will be
 lost in the noise. If you find a query that spends, say, 50% of its time in
 qsort(), you will only get a 12.5% speedup on that query. And even 50% is
 really pushing it - I challenge you to find a query that spends any
 significant amount of time qsorting integers.

How about almost every primary index creation?

Don't really see a reason for the negativity here. If you use that
argument, no performance gain is worth it, because all workloads are
mixed.

This is a marvellous win, a huge gain from a small, isolated and
easily tested change. By far the smallest amount of additional code we
will have added to sorting, and yet one of the best gains.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Heikki Linnakangas

On 21.09.2011 10:01, Simon Riggs wrote:

On Wed, Sep 21, 2011 at 7:51 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com  wrote:

On 21.09.2011 02:53, Peter Geoghegan wrote:


C stdlib quick-sort time elapsed: 2.092451 seconds
Inline quick-sort time elapsed: 1.587651 seconds

Does *that* look attractive to you?


Not really, to be honest. That's a 25% speedup in pure qsorting speed. How
much of a gain in a real query do you expect to get from that, in the best
case? There's so many other sources of overhead that I'm afraid this will be
lost in the noise. If you find a query that spends, say, 50% of its time in
qsort(), you will only get a 12.5% speedup on that query. And even 50% is
really pushing it - I challenge you to find a query that spends any
significant amount of time qsorting integers.


How about almost every primary index creation?


Nope. Swamped by everything else.

Also note that as soon as the sort grows big enough to not fit in 
maintenance_work_mem, you switch to the external sort algorithm, which 
doesn't use qsort. Perhaps you could do similar inlining in the heap
sort & merge passes done in the external sort, but it's unlikely to be
as big a win there.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] WIP: Join push-down for foreign tables

2011-09-21 Thread Michael Paquier
2011/9/21 Shigeru Hanada shigeru.han...@gmail.com

 Hi Michael,

 (2011/09/21 12:52), Michael Paquier wrote:
  I am interested in the development you are doing regarding join push down
  and fdw stuff for remote postgreSQL servers.
  Is there a way to get the postgres fdw you are providing here for common
  9.1?
  I saw that the tar you are providing here is adapted only for your patch.

 As you say, the pgsql_fdw I posted requires my join-push-down patch.
 But, at least in the current revision, using PG_VERSION_NUM would make it
 compilable for both 9.1 and 9.2.  I'm not sure, though, that the changes
 required during the 9.2 development cycle are small enough for this
 workaround.

OK, don't worry. I may be able to change that if necessary.

 Anyway, I'm going to publish a recent pgsql_fdw for 9.1 on PGXN or
 elsewhere, though it will need some (hopefully little) time.

Thanks, that will be helpful. I think this work is very valuable for
PostgreSQL, and I will keep an eye on it. Btw, please don't feel that you
have to rush in doing that. I am not in a hurry at all, so of course work
at your own rhythm.

Regards,
-- 
Michael Paquier
http://michael.otacoo.com


Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
 No, but the more similar the formats are, the easier it gets to at least
 factor the hairy parts of such a parser into a common subroutine. Assume
 that we don't support NULL as an alias for INF. What would then be the result of
 
   '[A,NULL)'::textrange?

I think that the range input should *parse* NULL in a similar way, but
reject it. So, to make it the range between two definite strings, you'd
do:

  '[A,NULL)'::textrange

which would be equal to textrange('A','NULL','[)'). Without the quotes,
it would detect the NULL, and give an error. Open to suggestion here,
though.

 Presumably, it'd be the same as textrange('A','NULL','[)'). Which I think
 is a bit surprising, since '[A,NULL]'::text[] produces ARRAY['A',NULL],
 *NOT* ARRAY['A','NULL'].
 
 BTW, we currently represent infinity for floating point values as
 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
 
   int4range(0,NULL,'[)')::text
 
 return 
 
   '[0,Infinity)'?

I'm open to that, if you think it's an improvement I'll do it (but we
should probably pick one identifiable string and stick with it). What
I'd like to avoid is adding to the NULL/infinity confusion.

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
 What I really
 care about is that we don't talk ourselves into needing a zillion
 constructor functions.  Making things work with a single constructor
 function seems to me to simplify life quite a bit, and allowing there
 seems essential for that.

I think we pretty much all agree on that. However, you did see the note
about the difficulty of using default parameters in built-in functions,
right?

I ultimately ended up with 4 constructors, each with the same name but
0, 1, 2, and 3 parameters. Suggestions welcome.

 (I am also vaguely wondering what happens if you have a text
 range - is (nubile, null) ambiguous?)

There are a few ways to handle that. I would lean toward parsing the
NULL as a special keyword, and then rejecting it (does it matter if it's
upper case?).

Regards,
Jeff Davis





Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Peter Geoghegan
On 21 September 2011 01:48,  karave...@mail.bg wrote:
 All -O2 versions show a 42% speedup with the inlined qsort.
 -O0 showed a 25% speedup.

Thanks. Looks like the figures I posted last night were fairly
conservative. Does anyone else care to report results?

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



[HACKERS] Shared sequence-like objects in PostgreSQL

2011-09-21 Thread Vlad Arkhipov

Hello all,

I'm writing a C-language function that is similar to nextval() but 
should return the next member of the recurrent sequence:
T(n+1) = f(T(n), T(n-1), ..., T(n-k)), where f is some function and k is 
a constant.
The state of this object should be persistent between database restarts 
and should be easily recovered if the database crashes.


So the first problem I encountered was where to store the current state 
of this object (n and values T(n), T(n-1), ... T(n-k)). I believe that 
TopMemoryContext is not shared between processes, therefore I must use 
shmem functions from backend/storage/ipc/shmem.c to create a structure 
in shared memory.


The next issue is how to synchronize backends' reads/writes to this 
chunk of shared memory. I suppose there must be something in the 
Postgres code for handling this with semaphores.


Then I periodically need to persist the state of this object to the 
database, for example for every 100 generated values, as well as on the 
postmaster's shutdown. What is the best method for doing that?


Please let me know if this problem has been solved before. Thanks for 
your help.




Re: [HACKERS] Online base backup from the hot-standby

2011-09-21 Thread Magnus Hagander
On Wed, Sep 21, 2011 at 08:23, Fujii Masao masao.fu...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 2:13 PM, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Sep 21, 2011 at 04:50, Fujii Masao masao.fu...@gmail.com wrote:
 3. Copy the pg_control file from the cluster directory on the standby to
    the backup as follows:

    cp $PGDATA/global/pg_control /mnt/server/backupdir/global

 But this is done as part of step 2 already. I assume what this really
 means is that the pg_control file must be the last file backed up?

 Yes.

 When we perform an archive recovery from a backup taken during
 normal processing, we get the backup end location from the backup-end
 WAL record which was written by pg_stop_backup(). But since no WAL
 writing is allowed during recovery, pg_stop_backup() on the standby
 cannot write a backup-end WAL record. So, in his patch, instead of
 a backup-end WAL record, the startup process uses the minimum
 recovery point recorded in pg_control which has been included in the
 backup, as the backup end location. BTW, the backup end location is
 used to check whether recovery has reached a consistent state
 (i.e., end-of-backup).

 To use the minimum recovery point in pg_control as the backup end
 location safely, pg_control must be backed up last. Otherwise, a data
 page with a newer LSN than the minimum recovery point
 might be included in the backup.

Ah, check.


 (Since there are certainly a lot other ways to do the backup than just
 cp to a mounted directory..)

 Yes. The above command I described is just an example.

ok.


 4. Execute pg_stop_backup on the standby.

 The backup taken by the above procedure is available for an archive
 recovery or standby server.

 If the standby is promoted during a backup, pg_stop_backup() detects
 the change of the server status and fails. The data backed up before the
 promotion is invalid and not available for recovery.

 Taking a backup from the standby by using pg_basebackup is still not
 possible. But we can relax that restriction after applying this patch.

 I think that this is going to be very important, particularly given
 the requirements on pt 3 above. (But yes, it certainly doesn't have to
 be done as part of this patch, but it really should be the plan to
 have this included in the same version)

 Agreed.

 To take a base backup during recovery safely, some sort of parameters
 must be set properly. Hot standby must be enabled on the standby, i.e.,
 wal_level and hot_standby must be enabled on the master and the standby,
 respectively. FPW (full page writes) is required for a base backup,
 so full_page_writes must be enabled on the master.

 Presumably pg_start_backup() will check this. And we'll somehow track
 this before pg_stop_backup() as well? (for such evil things as
 the user changing FPW from on to off and then back to on again during
 a backup, which will make it look correct both during start and stop,
 but incorrect in the middle - pg_stop_backup needs to fail in that
 case as well)

 Right. As I suggested upthread, to address that problem, we need to log
 the change of FPW on the master, and then we need to check whether
 such a WAL record has been replayed on the standby during the backup.
 If so, pg_stop_backup() should emit an error.

I somehow missed this thread completely, so I didn't catch your
previous comments - oops, sorry. The important point being that we
need to track when this happens, even if the setting has been reset to
a valid value. So we can't just check the state of the variable at the
beginning and at the end.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/



Re: [HACKERS] Improve lseek scalability v3

2011-09-21 Thread Marco Stornelli
2011/9/19 Matthew Wilcox matt...@wil.cx:
 On Mon, Sep 19, 2011 at 08:31:00AM -0400, Stephen Frost wrote:
 * Benjamin LaHaise (b...@kvack.org) wrote:
  For such tables, can't Postgres track the size of the file internally?  I'm
  assuming it's keeping file descriptors open on the tables it manages, in
  which case when it writes to a file to extend it, the internally stored size
  could be updated.  Not making a syscall at all would scale far better than
  even a modified lseek() will perform.

 We'd have to have it in shared memory and have a lock around it, it
 wouldn't be cheap at all.

 Yep, that makes perfect sense.  After all, the kernel does basically the
 same thing to maintain this information; why should we have userspace
 duplicating the same infrastructure?

 I must admit, I'd never heard of this usage of lseek to get the current
 size of a file before; I'd assumed everybody used fstat.  Given this
 legitimate reason for a high-frequency calling of lseek, I withdraw my
 earlier objection to the patch series.

 --
 Matthew Wilcox                          Intel Open Source Technology Centre
 Bill, look, we understand that you're interested in selling us this
 operating system, but compare it to ours.  We can't possibly take such
 a retrograde step.


I really don't understand the approach here. An improvement is an
improvement; do we need a use case to add an improvement to the
kernel? We are not talking about adding a new syscall or making an ABI
change in this case. So my absolute ack to these patches.

Marco



[HACKERS] [PATCH] POC: inline int4 comparison in tuplesort

2011-09-21 Thread Dan McGee
This attempts to be as simple as it gets while reducing function call
depth, and should be viewed as a proof of concept. It is also untested
as of now, but I will try to do that and report back.

I'm hoping I followed the rabbit hole correctly and am correctly
comparing the right pointers to each other in order to short-circuit the
case where we are using the int4 comparison operator.

Peter, if you want to compare stock vs. your patch vs. this patch, we might
be able to get some sort of read on where the maintainability vs. performance
curve lies. Note that this version should still allow sorting of anything,
and simply shifts gears for int4 tuples...

---
 src/backend/utils/sort/tuplesort.c |   23 +--
 1 files changed, 21 insertions(+), 2 deletions(-)

diff --git a/src/backend/utils/sort/tuplesort.c b/src/backend/utils/sort/tuplesort.c
index 3505236..ddd5ced 100644
--- a/src/backend/utils/sort/tuplesort.c
+++ b/src/backend/utils/sort/tuplesort.c
@@ -2652,6 +2652,22 @@ myFunctionCall2Coll(FmgrInfo *flinfo, Oid collation, Datum arg1, Datum arg2)
 	return result;
 }
 
+static inline
+int int4cmp(Datum first, Datum second)
+{
+	int32		a = DatumGetInt32(first);
+	int32		b = DatumGetInt32(second);
+
+	if (a > b)
+		return 1;
+	else if (a == b)
+		return 0;
+	else
+		return -1;
+}
+
+extern Datum btint4cmp(PG_FUNCTION_ARGS);
+
 /*
  * Apply a sort function (by now converted to fmgr lookup form)
  * and return a 3-way comparison result.  This takes care of handling
@@ -2683,8 +2699,11 @@ inlineApplySortFunction(FmgrInfo *sortFunction, int sk_flags, Oid collation,
 	}
 	else
 	{
-		compare = DatumGetInt32(myFunctionCall2Coll(sortFunction, collation,
-													datum1, datum2));
+		if (sortFunction->fn_addr == btint4cmp)
+			compare = int4cmp(datum1, datum2);
+		else
+			compare = DatumGetInt32(myFunctionCall2Coll(sortFunction, collation,
+														datum1, datum2));
 
 		if (sk_flags & SK_BT_DESC)
 			compare = -compare;
-- 
1.7.6.3




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Amit Kapila
 Recent discussions on the threads Double sorting split patch and
 CUDA sorting raised the possibility that there could be significant
 performance optimisation low-hanging fruit picked by having the
 executor treat integers and floats as a special case during sorting,
 avoiding going to the trouble of calling a comparator using the
 built-in SQL function machinery

Why only for integers and floats, and why not for char/varchar?

But I believe this can make the code less maintainable, as similar things
can be done at other places to avoid the SQL function machinery.

 Once the cache has been warmed, explain analyze very consistently
 reports a runtime of 123ms for this query on master/HEAD, which varies
 +/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply
 this patch, that goes down to 107ms +/- 1ms at -O0.

The 123ms time (without your change) is measured at which optimization
level, -O2 or -O0?

-Original Message-
From: pgsql-hackers-ow...@postgresql.org
[mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
Sent: Tuesday, September 20, 2011 7:26 AM
To: PG Hackers
Subject: [HACKERS] Inlining comparators as a performance optimisation

 

Recent discussions on the threads Double sorting split patch and
CUDA sorting raised the possibility that there could be significant
performance optimisation low-hanging fruit picked by having the
executor treat integers and floats as a special case during sorting,
avoiding going to the trouble of calling a comparator using the
built-in SQL function machinery, and taking advantage of inlining of
the comparator, which has been shown to have a considerable
performance advantage (at least compared to a general purpose c stdlib
qsort(), that takes a function pointer as its comparator, much like
tuplesort).

I've hacked together a sloppy POC implementation in a hurry
(basically, some code is shifted around), which is attached - I felt
that it would be useful to determine if the community feels that this
is a worth-while undertaking in advance of a business trip that I'm
leaving on tomorrow lasting until Friday, during which I will be
mostly unavailable. The patch breaks the Postgres sorting executor
node (at least when it quicksorts) for any type other than int4. I
apologise for how rough the patch is, but the code itself isn't
important right now - the idea is. I anticipate that the value
state->datumType or something similar will be set in a near future
revision, so that tuplesort_performsort will know which type-specific
optimisation it can use for the type, while falling back on the
existing generic qsort_arg + qsort_arg_comparator, and sorting won't
actually be broken.

I've been doing some preliminary testing using the dell store 2 sample
database. I increase work_mem to '50MB', to ensure that a quicksort
will be performed for sorting (otherwise, I'm using the
postgresql.conf that initdb gave me). The query is:

explain analyze select * from orderlines order by prod_id;

Once the cache has been warmed, explain analyze very consistently
reports a runtime of 123ms for this query on master/HEAD, which varies
+/- 1 ms, with a few outliers of maybe +/- 2ms. However, when I apply
this patch, that goes down to 107ms +/- 1ms at -O0. I think that
that's a pretty good start. Funnily enough, the difference/advantage
vanishes at -O2 (I'm guessing that the higher optimisation level of
GCC 4.5 hyper-corrects away the inlining, but I don't have time to
check that right now).

I imagine the version that I actually submit for patch review will
have a macro-based infrastructure for inlining the sorting of various
built-in types, initially integers and floats. It will most likely
have some other optimisations - I haven't even used a profiler yet.

This performance patch differs from most in that it's difficult in
principle to imagine a performance regression occurring.

Thoughts?

-- 
Peter Geoghegan   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 09:23 , Jeff Davis wrote:
 On Mon, 2011-09-19 at 18:32 +0200, Florian Pflug wrote:
 No, but the more similar the formats are, the easier it gets to at least
 factor the hairy parts of such a parser into a common subroutine. Assume
 that we don't support NULL as an alias for INF. What would then be the result of
 
  '[A,NULL)'::textrange?
 
 I think that the range input should *parse* NULL in a similar way, but
 reject it. So, to make it the range between two definite strings, you'd
 do:
 
  '[A,NULL)'::textrange
 
 which would be equal to textrange('A','NULL','[)'). Without the quotes,
 it would detect the NULL, and give an error. Open to suggestion here,
 though.

Hm, that seems like a reasonable compromise. As long as range types and
arrays agree on the same basic lexical rules regarding quoting and whitespace
(i.e. that spaces outside of double-quotes are non-significant, that keywords
like NULL and INF/Infinity are case-insensitive, ...) I'm happy I guess.

 BTW, we currently represent infinity for floating point values as
 'Infinity', not 'INF'. Shouldn't we do the same for ranges, i.e. make
 
  int4range(0,NULL,'[)')::text
 
 return 
 
  '[0,Infinity)'?
 
 I'm open to that, if you think it's an improvement I'll do it (but we
 should probably pick one identifiable string and stick with it). What
 I'd like to avoid is adding to the NULL/infinity confusion.

I've thought about this some more, and came to realize that the question
here really is whether

  floatrange(0, 'Infinity'::float, '[)')

and

  floatrange(0, NULL, '[)')

are the same thing or not. If they're not, then obviously using Infinity
to represent omitted bounds is going to be very confusing. If they are,
then using Infinity seems preferable. Maybe boundaries should be restricted
to numeric float values (i.e. +/-Infinity and NaN should be rejected), though
I dunno if the range type infrastructure supports that. Thoughts?

best regards,
Florian Pflug






Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
 What I really
 care about is that we don't talk ourselves into needing a zillion
 constructor functions.  Making things work with a single constructor
 function seems to me to simplify life quite a bit, and allowing there
 seems essential for that.

 I think we pretty much all agree on that. However, you did see the note
 about the difficulty of using default parameters in built-in functions,
 right?

 I ultimately ended up with 4 constructors, each with the same name but
 0, 1, 2, and 3 parameters. Suggestions welcome.

 (I am also vaguely wondering what happens if you have a text
 range - is (nubile, null) ambiguous?)

 There are a few ways to handle that. I would lean toward parsing the
 NULL as a special keyword, and then rejecting it (does it matter if it's
 upper case?).

Boy, that seems really weird to me.  If you're going to do it, it
ought to be case-insensitive, but I think detecting the case only for
the purpose of rejecting it is probably a mistake.  I mean, if
(nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
really seem like they ought to be any different.  Otherwise, anyone
who wants to construct these strings programmatically is going to need
to escape everything and always write ("cat","dog") or however you do
that, and that seems like an unnecessary imposition.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Shared sequence-like objects in PostgreSQL

2011-09-21 Thread Greg Stark
On Wed, Sep 21, 2011 at 8:19 AM, Vlad Arkhipov arhi...@dc.baikal.ru wrote:
 I'm writing a C-language function that is similar to nextval() but should
 return the next member of the recurrent sequence:
 T(n+1) = f(T(n), T(n-1), ..., T(n-k)), where f is some function and k is a
 constant.
 The state of this object should be persistent between database restarts and
 should be easily recovered if the database crashes.

The purpose of nextval() is to provide an escape hatch from the normal
transactional guarantees which would normally serialize everything
using it. Avoiding the performance impact of that is the only reason
it needs to use shared memory and so on.

If this function isn't performance critical and doesn't need to be
highly concurrent then you would be better off storing this
information in a table and updating the table using regular database
updates. The way you've defined it also makes me wonder whether you
can afford to skip values. If not then you don't really get an option
of avoiding the serialization.

If you can, one short-cut you could consider would be to populate a
table with the values of the sequence, and periodically populate more
values when you run short of unused values. Then you can use a regular
postgres sequence to generate indexes into that table. That would not
perform quite as well as a shared memory native implementation like
you describe but wouldn't require nearly as much Postgres-specific C
code.
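That short-cut can be sketched in a few lines (all names hypothetical; a plain counter stands in for the Postgres sequence and a list stands in for the values table):

```python
import itertools

class PrefilledSequence:
    # Sketch of the suggestion above: precompute values of a recurrent
    # sequence T(n+1) = f(T(n), ..., T(n-k)) into a values "table", and
    # use an ordinary counter to hand out indexes into it.
    def __init__(self, f, seed, batch=100):
        self.f = f                  # recurrence over the last k terms
        self.k = len(seed)
        self.values = list(seed)    # the precomputed values "table"
        self.batch = batch
        self.counter = itertools.count()   # plays the role of nextval()

    def nextval(self):
        i = next(self.counter)
        # Refill the table when we run short of unused values.
        while len(self.values) <= i:
            for _ in range(self.batch):
                self.values.append(self.f(self.values[-self.k:]))
        return self.values[i]

# Fibonacci as the recurrence: T(n+1) = T(n) + T(n-1)
fib = PrefilledSequence(lambda last: last[0] + last[1], seed=[0, 1])
print([fib.nextval() for _ in range(8)])  # [0, 1, 1, 2, 3, 5, 8, 13]
```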

Perhaps if you can explain what the problem you're actually trying to
solve is it might be clearer whether it justifies working at such a
low level.



-- 
greg



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 14:00 , Robert Haas wrote:
 On Wed, Sep 21, 2011 at 3:29 AM, Jeff Davis pg...@j-davis.com wrote:
 On Mon, 2011-09-19 at 12:26 -0400, Robert Haas wrote:
 (I am also vaguely wondering what happens if you have a text
 range: is (nubile, null) ambiguous?)
 
 There are a few ways to handle that. I would lean toward parsing the
 NULL as a special keyword, and then rejecting it (does it matter if it's
 upper case?).
 
 Boy, that seems really weird to me.  If you're going to do it, it
 ought to be case-insensitive, but I think detecting the case only for
 the purpose of rejecting it is probably a mistake.  I mean, if
 (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
 really seem like they ought to be any different.

But that's exactly how arrays behave too. '{null,nutty}' is interpreted
as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as
ARRAY['nubile','nutty'].

 Otherwise, anyone
 who wants to construct these strings programmatically is going to need
 to escape everything and always write ("cat","dog") or however you do
 that, and that seems like an unnecessary imposition.

Unless you fully depart from what arrays do, you'll have to do that anyway
because leading and trailing spaces aren't considered to be significant in
non-quoted elements. In other words, '( cat , dog )' represents
textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
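A toy parser makes the whitespace rule visible (illustrative only, not actual PostgreSQL parsing code):

```python
import re

def parse_range_literal(s):
    # Split a literal like '( cat , dog )' into brackets and elements.
    # Unquoted elements lose their surrounding whitespace; quoted
    # elements keep it.
    m = re.fullmatch(r'\s*([\[(])(.*),(.*)([\])])\s*', s)
    if not m:
        raise ValueError("bad range literal: %r" % s)
    lo_br, lo, hi, hi_br = m.groups()

    def element(e):
        e = e.strip()
        if e.startswith('"') and e.endswith('"') and len(e) >= 2:
            return e[1:-1]        # quoted: inner whitespace is significant
        return e                  # unquoted: surrounding whitespace dropped

    return lo_br, element(lo), element(hi), hi_br

print(parse_range_literal('( cat , dog )'))      # ('(', 'cat', 'dog', ')')
print(parse_range_literal('(" cat "," dog ")'))  # ('(', ' cat ', ' dog ', ')')
```

Quoting is the only way to keep the surrounding spaces.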

Also, as long as we need to recognize at least one special value meaning
a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
around the need for quotes in the general case. Well, except making the
representation of

  range(X, NULL, '[)') be '[X)',

the one of

  range(NULL, X, '(]') be '(X]'

and the one of

  range(NULL, NULL, '()') be '()',

but I'm not sure that's an improvement. And even if it was, you'd still
need to quote X if it contained one of '(', ')', '[', ']' or ','. So most
clients would probably still choose to quote unconditionally, instead of
detecting whether it was necessary or not.

best regards,
Florian Pflug




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 8:41 AM, Florian Pflug f...@phlo.org wrote:
 Boy, that seems really weird to me.  If you're going to do it, it
 ought to be case-insensitive, but I think detecting the case only for
 the purpose of rejecting it is probably a mistake.  I mean, if
 (nubile, nutty) is OK, then (nubile, null) and (null, nutty) don't
 really seem like they ought to be any different.

 But that's exactly how arrays behave too. '{null,nutty}' is interpreted
 as ARRAY[NULL,'nutty'] while '{nubile,nutty}' is interpreted as
 ARRAY['nubile','nutty'].

Oh.  Well, never mind then.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Greg Stark
On Wed, Sep 21, 2011 at 8:08 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 How about almost every primary index creation?

 Nope. Swamped by everything else.

Really? I think it's pretty common for shops to be able to dedicate
large amounts of RAM to building initial indexes on data loads or
reindex operations. Enough that they can cache the entire table for
the short time they're doing the index builds even if they're quite
large. Witness the recent pleas to allow maintenance_work_mem on the
order of tens of gigabytes. And it's also pretty common that shops can
dedicate very large I/O bandwidth, in many cases enough to saturate
the memory bandwidth, for doing these kinds of batch operations when
they get large enough to need to do an external sort.

There's still overhead of reading the pages, the tuples, finding the
sort keys in the tuple, etc. But I think the actual qsort or heap
operations in tapesort are pretty big portions of the work.

This is pretty easy to measure. Just run oprofile or gprof and see
what percentage of time for a big index build is spent in qsort.
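The same measurement can be mocked up in a few lines with any profiler; here Python's cProfile stands in for oprofile/gprof, and a made-up key-extraction-plus-sort loop stands in for the index build:

```python
import cProfile
import pstats
import random

def build_index(rows):
    # A crude stand-in for an index build: extract keys, then sort them.
    keys = [(key, i) for i, key in enumerate(rows)]
    keys.sort()
    return keys

rows = [random.random() for _ in range(200_000)]

prof = cProfile.Profile()
prof.enable()
index = build_index(rows)
prof.disable()

# Sum per-function self-times; the entries whose name contains 'sort'
# are the list.sort() work.
stats = pstats.Stats(prof)
total = sum(entry[2] for entry in stats.stats.values())
in_sort = sum(entry[2] for key, entry in stats.stats.items()
              if 'sort' in key[2])
print("time in sort: %.0f%% of the build" % (100.0 * in_sort / total))
```

On a real build the interesting number is the fraction the profiler attributes to qsort/tuplesort.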

-- 
greg



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 8:47 AM, Greg Stark st...@mit.edu wrote:
 On Wed, Sep 21, 2011 at 8:08 AM, Heikki Linnakangas
 heikki.linnakan...@enterprisedb.com wrote:
 How about almost every primary index creation?

 Nope. Swamped by everything else.

 Really? I think it's pretty common for shops to be able to dedicate
 large amounts of RAM to building initial indexes on data loads or
 reindex operations. Enough that they can cache the entire table for
 the short time they're doing the index builds even if they're quite
 large. Witness the recent pleas to allow maintenance_work_mem on the
 order of tens of gigabytes. And it's also pretty common that shops can
 dedicate very large I/O bandwidth, in many cases enough to saturate
 the memory bandwidth, for doing these kinds of batch operations when
 they get large enough to need to do an external sort.

 There's still overhead of reading the pages, the tuples, finding the
 sort keys in the tuple, etc. But I think the actual qsort or heap
 operations in tapesort are pretty big portions of the work.

 This is pretty easy to measure. Just run oprofile or gprof and see
 what percentage of time for a big index build is spent in qsort.

+1 for some actual measurements.

I don't think anyone on this thread is saying that if we can get big
performance gains from doing this we still shouldn't do it.  But at
this point it's unclear that we can get a consistent speedup that
isn't heavily dependent on the choice of compiler flags (to say
nothing of compiler and OS), and even if we can, it's not clear that
it will still be noticeable when you measure the run time of an entire
query rather than just the speed of qsort().  Like Tom and Heikki, I'm
a bit skeptical: it wouldn't surprise me to find out that qsort() is
5% of the runtime in any realistic test case and the average qsort()
speedup based on tests on a couple different platforms is 10% and so
on average we're looking at a 0.5% improvement, in which case it might
be more trouble than it's worth, especially if it turns out that there
are OS/platform combinations where the inlined version is (for some
crazy reason) slower.  I've seen performance differences of up to 3%
from minor code rearrangements that don't seem like they should matter
at all, just because code and data shifts around across cache-line
boundaries and the new arrangement is slightly better or worse than
the old one.  So if the performance improvement turns out to be very
small, then validating that it actually IS an improvement in general
is likely to be kind of a pain in the ass.

On the other hand, the performance improvement might turn out to be
large.  Maybe there's a test case where, as Heikki suggests, 50% of
the time is spent in qsort().  If we can reliably make that 25%
faster, I wouldn't dismiss that out of hand; I think that would be
pretty good, assuming it didn't require massive amounts of spaghetti
code to make it work.  I don't see that that would be any more
marginal than the sorts of things we've optimized in, say, commit
4fc115b2e981f8c63165ca86a23215380a3fda66, or commit
f4d242ef94730c447d87b9840a40b0ec3371fe0f.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Peter Geoghegan
On 21 September 2011 07:51, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 On 21.09.2011 02:53, Peter Geoghegan wrote:

 C stdlib quick-sort time elapsed: 2.092451 seconds
 Inline quick-sort time elapsed: 1.587651 seconds

 Does *that* look attractive to you?

 Not really, to be honest. That's a 25% speedup in pure qsorting speed. How
 much of a gain in a real query do you expect to get from that, in the best
 case? There's so many other sources of overhead that I'm afraid this will be
 lost in the noise.

I'm surprised that you're dismissive of this. After all, we have in
the past indulged in micro-optimisation of qsort, or so it would seem
from this comment:

 * We have modified their original by adding a check for already-sorted input,
 * which seems to be a win per discussions on pgsql-hackers around 2006-03-21.

"Makes affected queries radically faster" (in the best case, a speedup
somewhat greater than 12.5%) is an unreasonably high standard for a
performance optimisation of the executor in general (such a high
standard might be sensible if it was due to a particular
maintainability downside, but you didn't mention one). Even still, I
think that the 12.5% figure is pretty pessimistic - I've already sped
up the dell store query by almost that much, and that's with a patch
that was, due to circumstances, cobbled together.

Not only are we benefiting from the effects of inlining, we're also
benefiting from the removal of unnecessary indirection. As Tom said,
"In concrete terms, there would be no reason to have tuplesort.c's
myFunctionCall2Coll, and maybe not inlineApplySortFunction either, if
the datatype-specific comparison functions had APIs that were closer
to what sorting wants rather than following the general
SQL-callable-function API." He was just referring to the benefits of
removing indirection here, so ISTM that this is really two performance
optimisations rolled into one - it's conceivable that the total
performance improvement will even exceed the isolated inlining
comparator benchmark.

As I've said, I believe this patch can be committed without
compromising the maintainability of the tuplesort code to an extent
that is not clearly worth it, through the use of a clean, macro-based
abstraction. Concerns about bloated binaries are probably not well
founded, because what I'm proposing is to a certain extent emulating
C++ templates, while using a very common pattern used with C++
templates. In the C++ world, algorithms are often generalised as
templates, so that they can be used equally well with any datatype
(that supports the interface of the template), while availing of
compiler optimisations per template instantiation (instance of using a
given type with a given template). I actually got the idea for this
patch in part from a book that I read years ago that described the
fact that counter-intuitively, std::sort() consistently outperforms
qsort(), because the comparator is often inlined, and the compiler can
generally avail of optimisations from knowing the comparator at
compile-time.
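The shape of that effect is easy to reproduce outside C. In this sketch, sorting through a user-supplied comparator stands in for qsort() calling out through a function pointer; it demonstrates the indirection pattern only, not C inlining itself:

```python
import random
import time
from functools import cmp_to_key

data = [random.random() for _ in range(200_000)]

# "Inlined" path: the sort routine compares floats directly.
t0 = time.perf_counter()
direct = sorted(data)
t_direct = time.perf_counter() - t0

# Indirect path: every comparison goes through a user-supplied
# comparator, the analogue of qsort() calling through a function
# pointer (and, in Postgres, through the fmgr layer on top of that).
def compare(a, b):
    return (a > b) - (a < b)

t0 = time.perf_counter()
indirect = sorted(data, key=cmp_to_key(compare))
t_indirect = time.perf_counter() - t0

assert direct == indirect   # identical results; only the speed differs
print("direct %.3fs, via comparator %.3fs" % (t_direct, t_indirect))
```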

On 21 September 2011 13:47, Greg Stark st...@mit.edu wrote:
 This is pretty easy to measure. Just run oprofile or gprof and see
 what percentage of time for a big index build is spent in qsort.

I'll do so soon. I intend to get to this on Friday evening, and
perhaps have a proper patch to show next week.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Heikki Linnakangas

On 21.09.2011 17:20, Peter Geoghegan wrote:

Even still, I
think that the 12.5% figure is pretty pessimistic - I've already sped
up the dell store query by almost that much, and that's with a patch
that was, due to circumstances, cobbled together.


I'm not against making things faster, it's just that I haven't seen 
solid evidence yet that this will help. Just provide a best-case test 
case for this that shows a huge improvement, and I'll shut up. If the 
improvement is only modest, then let's discuss how big it is and whether 
it's worth the code ugliness this causes.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Tom Lane
Florian Pflug f...@phlo.org writes:
 On Sep21, 2011, at 14:00 , Robert Haas wrote:
 Otherwise, anyone
 who wants to construct these strings programmatically is going to need
 to escape everything and always write ("cat","dog") or however you do
 that, and that seems like an unnecessary imposition.

 Unless you fully depart from what arrays do, you'll have to do that anyway
 because leading and trailing spaces aren't considered to be significant in
 non-quoted elements. In other words, '( cat , dog )' represents
 textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').

Keep in mind that the array I/O behavior is widely considered to suck.
When we defined the record I/O behavior, we did not emulate that
whitespace weirdness, nor a number of other weirdnesses.  I would argue
that ranges ought to model their I/O behavior on records not arrays,
because that's not as much of a legacy syntax.

 Also, as long as we need to recognize at least one special value meaning
 a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
 around the need for quotes in the general case.

Right.  In the record case, we used an empty string for NULL, and then
had to insist on quotes for actual empty strings.
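A sketch of that record-field convention (a toy reader, not the actual record I/O code):

```python
def record_field(raw):
    # Unquoted empty field reads as NULL (None); a quoted empty
    # string '""' reads as a real empty string; doubled quotes
    # inside a quoted field collapse to one.
    raw = raw.strip()
    if raw == '':
        return None
    if raw.startswith('"') and raw.endswith('"') and len(raw) >= 2:
        return raw[1:-1].replace('""', '"')
    return raw

print(repr(record_field('')))     # None
print(repr(record_field('""')))   # ''
print(repr(record_field('cat')))  # 'cat'
```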

regards, tom lane



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21.09.2011 17:20, Peter Geoghegan wrote:
 Even still, I
 think that the 12.5% figure is pretty pessimistic - I've already sped
 up the dell store query by almost that much, and that's with a patch
 that was, due to circumstances, cobbled together.

 I'm not against making things faster, it's just that I haven't seen 
 solid evidence yet that this will help. Just provide a best-case test 
 case for this that shows a huge improvement, and I'll shut up. If the 
 improvement is only modest, then let's discuss how big it is and whether 
 it's worth the code ugliness this causes.

The other question that I'm going to be asking is whether it's not
possible to get most of the same improvement with a much smaller code
footprint.  I continue to suspect that getting rid of the SQL function
impedance-match layer (myFunctionCall2Coll etc) would provide most of
whatever gain is to be had here, without nearly as large a cost in code
size and maintainability, and with the extra benefit that the speedup
would also be available to non-core datatypes.

regards, tom lane



[HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-21 Thread Linas Virbalas
Hello,

* Context *

I'm observing problems with provisioning a standby from the master by
following a basic and documented Making a Base Backup [1] procedure with
rsync if, in the mean time, heavy load is applied on the master.

After searching the archives, the only similar issue I found was hit by
Daniel Farina in the thread "hot backups: am I doing it wrong, or do we
have a problem with pg_clog?" [2], but, it seems, the issue was
discarded because of the non-standard backup procedure Daniel used.
However, I'm observing the same error with a simple procedure, hence this
message.

* Details *

Procedure:

1. Start load generator on the master (WAL archiving enabled).
2. Prepare a Streaming Replication standby (accepting WAL files too):
2.1. pg_switch_xlog() on the master;
2.2. pg_start_backup('backup_under_load') on the master (this will take a
while as master is loaded up);
2.3. rsync data/global/pg_control to the standby;
2.4. rsync all other data/ (without pg_xlog) to the standby;
2.5. pg_stop_backup() on the master;
2.6. Wait to receive all WAL files, generated during the backup, on the
standby;
2.7. Start the standby PG instance.

The last step will, usually, fail with a similar error:

2011-09-21 13:41:05 CEST LOG:  database system was interrupted; last known
up at 2011-09-21 13:40:50 CEST
Restoring 0014.history
mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file
or directory
Restoring 0013.history
2011-09-21 13:41:05 CEST LOG:  restored log file 0013.history from
archive
2011-09-21 13:41:05 CEST LOG:  entering standby mode
Restoring 0013000600DC
2011-09-21 13:41:05 CEST LOG:  restored log file 0013000600DC
from archive
Restoring 0013000600DB
2011-09-21 13:41:05 CEST LOG:  restored log file 0013000600DB
from archive
2011-09-21 13:41:05 CEST FATAL:  could not access status of transaction
1188673
2011-09-21 13:41:05 CEST DETAIL:  Could not read from file pg_clog/0001 at
offset 32768: Success.
2011-09-21 13:41:05 CEST LOG:  startup process (PID 13819) exited with exit
code 1
2011-09-21 13:41:05 CEST LOG:  aborting startup due to startup process
failure

The procedure works very reliably if there is little or no load on the
master, but fails very often with the pg_clog error when the load generator
(a few thousand SELECTs, ~60 INSERTs, ~60 DELETEs and ~60 UPDATEs per second)
is started up.

I assumed that a file system backup taken during pg_start_backup and
pg_stop_backup is guaranteed to be consistent and that missing pieces will
be taken from the WAL files, generated and shipped during the backup, but is
it really?

Is this procedure missing some steps? Or maybe this is a known issue?

Thank you,
Linas

[1] http://www.postgresql.org/docs/current/static/continuous-archiving.html
[2] http://archives.postgresql.org/pgsql-hackers/2011-04/msg01132.php




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Andrew Dunstan



On 09/21/2011 10:50 AM, Tom Lane wrote:

The other question that I'm going to be asking is whether it's not
possible to get most of the same improvement with a much smaller code
footprint.  I continue to suspect that getting rid of the SQL function
impedance-match layer (myFunctionCall2Coll etc) would provide most of
whatever gain is to be had here, without nearly as large a cost in code
size and maintainability, and with the extra benefit that the speedup
would also be available to non-core datatypes.




Can we get a patch so we can do benchmarks on this?

cheers

andrew



[HACKERS] sequence locking

2011-09-21 Thread Andres Freund
Hi,

I find the current behaviour of locking of sequences rather problematic. 
Multiple things:

- First and foremost I find it highly dangerous that ALTER SEQUENCE ... is 
for the most part not transactional. I think about the only transactional 
parts are the name, owner and schema.
Sure, it's documented, but ...

The cited reasons for wanting that behaviour look a bit bogus to me? Why 
should concurrency be important when doing an ALTER SEQUENCE? 

- It's impossible to emulate proper locking yourself because locking is not 
allowed for sequences.

The first one looks rather hard to solve with my passing knowledge of 
the sequence code, but is probably worthy of a TODO entry.

The second one looks easier. Any arguments against allowing it again? It seems 
to have been allowed in prehistoric times.

Greetings,

Andres



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 16:41 , Tom Lane wrote:
 Florian Pflug f...@phlo.org writes:
 On Sep21, 2011, at 14:00 , Robert Haas wrote:
 Otherwise, anyone
 who wants to construct these strings programmatically is going to need
 to escape everything and always write ("cat","dog") or however you do
 that, and that seems like an unnecessary imposition.
 
 Unless you fully depart from what arrays do, you'll have to do that anyway
 because leading and trailing spaces aren't considered to be significant in
 non-quoted elements. In other words, '( cat , dog )' represents
 textrange('cat', 'dog', '()'), *not* textrange(' cat ', ' dog ', '()').
 
 Keep in mind that the array I/O behavior is widely considered to suck.
 When we defined the record I/O behavior, we did not emulate that
 whitespace weirdness, nor a number of other weirdnesses.  I would argue
 that ranges ought to model their I/O behavior on records not arrays,
 because that's not as much of a legacy syntax.

Interesting. Funnily enough, I always assumed it was the other way around.
Probably because I don't care much for the empty-unquoted-string-is-NULL
behaviour of records. But leaving that personal opinion aside, yeah, in that
case we should make range I/O follow record I/O.

 Also, as long as we need to recognize at least one special value meaning
 a non-existing bound ('INF' or 'Infinity' or whatever), I don't see a way
 around the need for quotes in the general case.
 
 Right.  In the record case, we used an empty string for NULL, and then
 had to insist on quotes for actual empty strings.

Hm, so we'd have

  '(X,)' for range(X, NULL, '()'),
  '(,X)' for range(NULL, X, '()') and
  '(,)' for range(NULL, NULL, '()').

We'd then have the choice of either declaring

  '(X,]' to mean '(X,)',
  '[,X)' to mean '(,X)' and
  '[,]' to mean '(,)'

or to forbid the use of '[' and ']' for unspecified bounds.

(Leaving out the ',' in the case of only one bound as in my reply to Robert's
mail somewhere else in this thread doesn't actually work, since it'd be 
ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').)

One nice property is that, apart from the different brackets used, this
representation is identical to the one used by records while still avoiding
the infinity vs. NULL confusion.
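A toy formatter for the proposed form (hypothetical names, not an actual output function):

```python
def format_range(lower, upper, bounds):
    # A NULL (None) bound becomes an empty slot, so
    # format_range('X', None, '()') yields '(X,)'.  Elements that need
    # it are double-quoted, record-style, with quotes doubled inside.
    def element(v):
        if v is None:
            return ''
        v = str(v)
        if v == '' or any(c in v for c in '(),[]" '):
            return '"%s"' % v.replace('"', '""')
        return v
    return bounds[0] + element(lower) + ',' + element(upper) + bounds[1]

print(format_range('X', None, '()'))    # (X,)
print(format_range(None, 'X', '()'))    # (,X)
print(format_range(None, None, '()'))   # (,)
print(format_range('a,b', 'c', '[)'))   # ["a,b",c)
```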

best regards,
Florian Pflug




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Simon Riggs si...@2ndquadrant.com writes:
 This is a marvellous win, a huge gain from a small, isolated and
 easily tested change. By far the smallest amount of additional code to
 sorting we will have added and yet one of the best gains.

I think you forgot your cheerleader uniform.  A patch along these lines
is not going to be small, isolated, easily maintained, nor beneficial
for any but a small number of predetermined datatypes.

regards, tom lane



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-21 Thread Susanne Ebrecht

Hello Alvaro,

On 16.09.2011 15:08, Alvaro Herrera wrote:

It's certainly possible to create a private mailing list to support this
idea.  How would the membership be approved, however, is not clear to
me.  Would we let only well-known names from other pgsql lists into it?

(I, for one, had no idea you were in the SQL committee.)


It will personally be me who gets the penalty when something gets 
outside the group that is supporting me.


Members should be people that I can trust.

Besides the core team, I trust everyone the core team trusts.

And of course there are people outside the core team whom I trust.

Susanne

--
Susanne Ebrecht - 2ndQuadrant
PostgreSQL Development, 24x7 Support, Training and Services
www.2ndQuadrant.com




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Peter Geoghegan
On 21 September 2011 15:50, Tom Lane t...@sss.pgh.pa.us wrote:
 Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 I'm not against making things faster, it's just that I haven't seen
 solid evidence yet that this will help. Just provide a best-case test
 case for this that shows a huge improvement, and I'll shut up. If the
 improvement is only modest, then let's discuss how big it is and whether
 it's worth the code ugliness this causes.

Fair enough.

 The other question that I'm going to be asking is whether it's not
 possible to get most of the same improvement with a much smaller code
 footprint.

That's a reasonable question, and I hope to be able to come up with a
good answer.

 I continue to suspect that getting rid of the SQL function
 impedance-match layer (myFunctionCall2Coll etc) would provide most of
 whatever gain is to be had here, without nearly as large a cost in code
 size and maintainability, and with the extra benefit that the speedup
 would also be available to non-core datatypes.

I'm fairly surprised that your view on that is mostly or entirely
unchanged, even after I've demonstrated a considerable performance
advantage from a macro-based qsort implementation over my OS vendor's
c std lib qsort(), using an isolated test-case, that does not have
anything to do with that impedance mismatch. I'm not sure why you
doubt that the same thing is happening within tuplesort.

-- 
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 09/21/2011 10:50 AM, Tom Lane wrote:
 The other question that I'm going to be asking is whether it's not
 possible to get most of the same improvement with a much smaller code
 footprint.  I continue to suspect that getting rid of the SQL function
 impedance-match layer (myFunctionCall2Coll etc) would provide most of
 whatever gain is to be had here, without nearly as large a cost in code
 size and maintainability, and with the extra benefit that the speedup
 would also be available to non-core datatypes.

 Can we get a patch so we can do benchmarks on this?

Well, we'd have to negotiate what the API ought to be.  What I'm
envisioning is that datatypes could provide alternate comparison
functions that are designed to be qsort-callable rather than
SQL-callable.  As such, they could not have entries in pg_proc, so
it seems like there's no ready way to represent them in the catalogs.

The idea that I was toying with was to allow the regular SQL-callable
comparison function to somehow return a function pointer to the
alternate comparison function, so that the first comparison in a given
sort run would be done the traditional way but then we'd notice the
provided function pointer and start using that.  It would not be too
hard to pass back the pointer using FunctionCallInfoData.context, say.
The downside is adding cycles to unoptimized cases to uselessly check
for a returned function pointer that's not there.  Perhaps it could be
hacked so that we only add cycles to the very first call, but I've not
looked closely at the code to see what would be involved.
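In sketch form, with a dict standing in for FunctionCallInfoData.context and all names made up:

```python
from functools import cmp_to_key

def fast_cmp(a, b):
    # The lean, "qsort-callable" variant: no fmgr, no null handling.
    return (a > b) - (a < b)

def sql_cmp(a, b, fcinfo):
    # The ordinary SQL-callable comparator.  As a side effect it hands
    # back a pointer to the fast variant through the context slot.
    fcinfo['context'] = fast_cmp
    return (a > b) - (a < b)

def sort_with_fast_path(items):
    fcinfo = {'context': None}
    if len(items) >= 2:
        sql_cmp(items[0], items[1], fcinfo)   # first comparison: slow path
    cmp = fcinfo['context']
    if cmp is None:
        # Nothing handed back: keep using the SQL-callable comparator.
        cmp = lambda a, b: sql_cmp(a, b, fcinfo)
    return sorted(items, key=cmp_to_key(cmp))

print(sort_with_fast_path([3, 1, 2]))  # [1, 2, 3]
```

The first comparison pays the full cost; everything after it goes through the lean comparator.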

Has anyone got a better idea for getting hold of the alternate function?

regards, tom lane



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Heikki Linnakangas

On 21.09.2011 18:46, Tom Lane wrote:

Andrew Dunstanand...@dunslane.net  writes:

On 09/21/2011 10:50 AM, Tom Lane wrote:

The other question that I'm going to be asking is whether it's not
possible to get most of the same improvement with a much smaller code
footprint.  I continue to suspect that getting rid of the SQL function
impedance-match layer (myFunctionCall2Coll etc) would provide most of
whatever gain is to be had here, without nearly as large a cost in code
size and maintainability, and with the extra benefit that the speedup
would also be available to non-core datatypes.



Can we get a patch so we can do benchmarks on this?


Well, we'd have to negotiate what the API ought to be.  What I'm
envisioning is that datatypes could provide alternate comparison
functions that are designed to be qsort-callable rather than
SQL-callable.  As such, they could not have entries in pg_proc, so
it seems like there's no ready way to represent them in the catalogs.

The idea that I was toying with was to allow the regular SQL-callable
comparison function to somehow return a function pointer to the
alternate comparison function, so that the first comparison in a given
sort run would be done the traditional way but then we'd notice the
provided function pointer and start using that.  It would not be too
hard to pass back the pointer using FunctionCallInfoData.context, say.
The downside is adding cycles to unoptimized cases to uselessly check
for a returned function pointer that's not there.  Perhaps it could be
hacked so that we only add cycles to the very first call, but I've not
looked closely at the code to see what would be involved.


You could have a new function with a pg_proc entry, that just returns a 
function pointer to the qsort-callback.


Or maybe the interface should be an even more radical replacement of 
qsort, not just the comparison function. Instead of calling qsort, 
tuplesort.c would call the new datatype-specific sort-function (which 
would be in pg_proc). The implementation could use an inlined version of 
qsort, like Peter is suggesting, or it could do something completely 
different, like a radix sort or a GPU-assisted sort or whatever.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Greg Stark
On Wed, Sep 21, 2011 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  As such, they could not have entries in pg_proc, so
 it seems like there's no ready way to represent them in the catalogs.

Why couldn't they be in pg_proc with a bunch of opaque arguments like
the GIST opclass support functions?

I'm a bit puzzled what the arguments would look like. They would still
need to know the collation, nulls first/last flags, etc.
And calling it would still not be inlinable.  So they would have to
check those flags on each invocation instead of having a piece of
straightline code that hard codes the behaviour with the right
behaviour inline.  ISTM the hope for a speedup from the inlining
mostly came from the idea that the compiler might be able to hoist
this logic outside the loop (and I suppose implement n specialized
loops depending on the behaviour needed).

-- 
greg



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21.09.2011 18:46, Tom Lane wrote:
 The idea that I was toying with was to allow the regular SQL-callable
 comparison function to somehow return a function pointer to the
 alternate comparison function,

 You could have a new function with a pg_proc entry, that just returns a 
 function pointer to the qsort-callback.

Yeah, possibly.  That would be a much more invasive change, but cleaner
in some sense.  I'm not really prepared to do all the legwork involved
in that just to get to a performance-testable patch though.

 Or maybe the interface should be an even more radical replacement of 
 qsort, not just the comparison function. Instead of calling qsort, 
 tuplesort.c would call the new datatype-specific sort-function (which 
 would be in pg_proc). The implementation could use an inlined version of 
 qsort, like Peter is suggesting, or it could do something completely 
 different, like a radix sort or a GPU-assisted sort or whatever.

No.  In the first place, that only helps for in-memory sorts.  In the
second, it would absolutely destroy our ability to change the behavior
of sorting ever again.  Considering that we've added ASC/DESC, NULLS
FIRST/LAST, and collation support over the years, are you really
prepared to bet that the sort code will never need any more feature
upgrades?  (This concern is in fact the source of my beef with the whole
inlining proposal to begin with, but allowing the inlining to occur into
third-party code that we don't control at all would be a hundred times
worse.)

regards, tom lane



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Heikki Linnakangas

On 21.09.2011 18:46, Tom Lane wrote:

Well, we'd have to negotiate what the API ought to be.  What I'm
envisioning is that datatypes could provide alternate comparison
functions that are designed to be qsort-callable rather than
SQL-callable.  As such, they could not have entries in pg_proc, so
it seems like there's no ready way to represent them in the catalogs.


Quite aside from this qsort-thing, it would be nice to have versions of 
all simple functions that could be called without the FunctionCall 
overhead. So instead of:


FunctionCall2(&flinfo_for_int4pl, 1, 2)

you could do simply

int4pl_fastpath(1,2)

I'm not sure how big an effect this would have, but it seems like it 
could shave some cycles across the system.


We could have an extended version of the PG_FUNCTION_INFO_V1 macro that 
would let you register the fastpath function:


PG_FUNCTION_INFO_V1(int4pl, int4pl_fastpath);

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



[HACKERS] DECLARE CURSOR must not contain data-modifying statements in WITH

2011-09-21 Thread Andres Freund
Hi all,

What's the reason for disallowing cursors on wCTEs? I am not sure I can follow 
the comment:

/*
 * We also disallow data-modifying WITH in a cursor.  (This could be
 * allowed, but the semantics of when the updates occur might be
 * surprising.)
 */
	if (result->hasModifyingCTE)
		ereport(ERROR,
				(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
				 errmsg("DECLARE CURSOR must not contain "
						"data-modifying statements in WITH")));

Given that cursors are about the only sensible way to return larger amounts of 
data, that behaviour reduces the usefulness of wCTEs a bit.

What's the exact cause of concern here? I personally don't think there is a 
problem documenting that you should fetch the cursor fully before relying on 
the updated tables to be in a sensible state. But that may be just me.


Thanks,

Andres




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Greg Stark st...@mit.edu writes:
 On Wed, Sep 21, 2011 at 4:46 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  As such, they could not have entries in pg_proc, so
 it seems like there's no ready way to represent them in the catalogs.

 Why couldn't they be in pg_proc with a bunch of opaque arguments like
 the GIST opclass support functions?

That does not mean the same thing at all.  Everything in pg_proc is
meant to be called through the V0 or V1 function call info protocols.

 I'm a bit puzzled what the arguments would look like. They would still
 need to know the collation, nulls first/last flags, etc.

No, I wasn't thinking that we should do that.  The datatype comparison
functions should have the exact same semantics they do now, just a
lower-overhead call mechanism.  If you try to push stuff like NULLS
FIRST/LAST into the per-datatype code, then you are up against a problem
when you want to add a new flag: you have to touch lots of code not all
of which you even control.

 And calling it would still not be inlinable.  So they would have to
 check those flags on each invocation instead of having a piece of
 straightline code that hard codes the behaviour with the right
 behaviour inline.  ISTM the hope for a speedup from the inlining
 mostly came from the idea that the compiler might be able to hoist
 this logic outside the loop (and I suppose implement n specialized
 loops depending on the behaviour needed).

None of that stuff is inlinable or constant-foldable today, nor would it
be with the patch that Peter was proposing AFAICS, because none of the
flags will ever be compile time constant values.

regards, tom lane



Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-21 Thread Euler Taveira de Oliveira

On 21-09-2011 11:44, Linas Virbalas wrote:

[This question doesn't belong to -hackers. Please post it in -general or -admin]


Procedure:

1. Start load generator on the master (WAL archiving enabled).
2. Prepare a Streaming Replication standby (accepting WAL files too):
2.1. pg_switch_xlog() on the master;

You don't need this.


2.2. pg_start_backup('backup_under_load') on the master (this will take a
while as master is loaded up);

No. If you use pg_start_backup('foo', true) it will be fast. Check the manual.


2.3. rsync data/global/pg_control to the standby;

Why are you doing this? If ...


2.4. rsync all other data/ (without pg_xlog) to the standby;

you will copy it again or no? Don't understand your point.


2.5. pg_stop_backup() on the master;
2.6. Wait to receive all WAL files, generated during the backup, on the
standby;
2.6. Start the standby PG instance.

The last step will, usually, fail with a similar error:

The problem could be that the minimum recovery point (step 2.3) is different 
from the end of rsync if you are under load.



--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Tom Lane
Heikki Linnakangas heikki.linnakan...@enterprisedb.com writes:
 On 21.09.2011 18:46, Tom Lane wrote:
 Well, we'd have to negotiate what the API ought to be.  What I'm
 envisioning is that datatypes could provide alternate comparison
 functions that are designed to be qsort-callable rather than
 SQL-callable.  As such, they could not have entries in pg_proc, so
 it seems like there's no ready way to represent them in the catalogs.

 Quite aside from this qsort-thing, it would be nice to have versions of 
 all simple functions that could be called without the FunctionCall 
 overhead.

Hmm, that's an interesting idea.  I think probably the important aspects
are (1) known number of arguments and (2) no null argument or result
values are allowed.  Not sure what we'd do with collations though.

 We could have an extended version of the PG_FUNCTION_INFO_V1 macro that 
 would let you register the fastpath function:
 PG_FUNCTION_INFO_V1(int4pl, int4pl_fastpath);

We don't use PG_FUNCTION_INFO_V1 for built-in functions ...

regards, tom lane



[HACKERS] unaccent contrib

2011-09-21 Thread Daniel Vázquez
Hi guys!

I know PostgreSQL 9.x includes the unaccent contrib module in its
distribution package.

unaccent is compatible with PostgreSQL 8.4 (but it is not in the 8.4
contrib distribution).

What's the best way to set up the unaccent module on a PostgreSQL 8.4
production server?

Could we copy contrib/unaccent from 9.x into the 8.4 installation and build it there?

Any other ideas?

Could the Global Development Group make some cumulative RPM of contrib modules
that are backward compatible?

thx!


Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 16:44 , Linas Virbalas wrote:
 After searching the archives, the only more discussed and similar issue I
 found hit was by Daniel Farina in a thread hot backups: am I doing it
 wrong, or do we have a problem with pg_clog? [2], but, it seems, the issue
 was discarded because of a non-standard backup procedure Daniel used.

That's not the way I read that thread. In fact, Robert Haas confirmed that
Daniel's backup procedure was sound in theory. The open question was whether the
error occurred because of a Bug in Daniel's backup code or postgresql's restore
code. The thread then petered out without that question being answered.

 Procedure:
 
 1. Start load generator on the master (WAL archiving enabled).
 2. Prepare a Streaming Replication standby (accepting WAL files too):
 2.1. pg_switch_xlog() on the master;
 2.2. pg_start_backup('backup_under_load') on the master (this will take a
 while as master is loaded up);
 2.3. rsync data/global/pg_control to the standby;
 2.4. rsync all other data/ (without pg_xlog) to the standby;
 2.5. pg_stop_backup() on the master;
 2.6. Wait to receive all WAL files, generated during the backup, on the
 standby;
 2.6. Start the standby PG instance.

Looks good. (2.1) and (2.3) seem redundant (as Euler already noticed),
but shouldn't cause any errors.

Could you provide us with the exact rsync version and parameters you use?

 The last step will, usually, fail with a similar error:
 
 2011-09-21 13:41:05 CEST LOG:  database system was interrupted; last known
 up at 2011-09-21 13:40:50 CEST
 Restoring 0014.history
 mv: cannot stat `/opt/PostgreSQL/9.1/archive/0014.history': No such file
 or directory
 Restoring 0013.history
 2011-09-21 13:41:05 CEST LOG:  restored log file 0013.history from
 archive
 2011-09-21 13:41:05 CEST LOG:  entering standby mode
 Restoring 0013000600DC
 2011-09-21 13:41:05 CEST LOG:  restored log file 0013000600DC
 from archive
 Restoring 0013000600DB
 2011-09-21 13:41:05 CEST LOG:  restored log file 0013000600DB
 from archive
 2011-09-21 13:41:05 CEST FATAL:  could not access status of transaction
 1188673
 2011-09-21 13:41:05 CEST DETAIL:  Could not read from file pg_clog/0001 at
 offset 32768: Success.

Whats the size of the file (pg_clog/0001) on both the master and the slave?

best regards,
Florian Pflug




Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Greg Stark
On Wed, Sep 21, 2011 at 5:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 None of that stuff is inlinable or constant-foldable today, nor would it
 be with the patch that Peter was proposing AFAICS, because none of the
 flags will ever be compile time constant values.

I was referring to transformations like this which I believe compilers
are already capable of doing:

v = ...;
while (...)
  if (v) {
    if (a < b) ... else ...;
  } else {
    if (a < b) ... else ...;
  }

turning it into code that looks like:

if (v) {
  while ()
    if (a < b) ... else ...;
} else {
  while ()
    if (a < b) ... else ...;
}

which may not look like much -- especially with branch prediction --
but then it's much more likely to be able to unroll the loop and do
clever instruction scheduling and so on than if there's an extra
branch in the middle of the loop. But if there's a function call to an
external function called through a function pointer in the middle of
the loop then the whole endeavour would be for naught.



-- 
greg



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-21 Thread Alvaro Herrera

Excerpts from Peter Eisentraut's message of Wed Sep 21 00:27:53 -0300 2011:
 
 On tis, 2011-09-20 at 11:12 -0300, Alvaro Herrera wrote:
+1 for a closed mailing list.  It's a bit annoying to have to do
such a thing, but it's not like we haven't got other closed
lists for appropriate purposes.
   
   Well, that much we've already decided a few years ago.  The
   blocking issues are: (1) do we have enough interest, and (2) where
   to put it (I'm looking at you, pgfoundry).
  
  I don't see why we wouldn't put it in @postgresql.org.
 
 One nice thing about pgfoundry would be the document manager.  Also, at
 least at some point in the past, a pgfoundry project was easier to
 manage than getting anything done about a @postgresql.org mailing list.

The document manager might be useful, true.  I cannot speak about past
administrators of the Majordomo installation that serves the
@postgresql.org lists, though.  For all intents and purposes, it seems
I'm in charge of it now.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Is there really no interest in SQL Standard?

2011-09-21 Thread Dave Page
On Wed, Sep 21, 2011 at 5:49 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Peter Eisentraut's message of Wed Sep 21 00:27:53 -0300 2011:

 On tis, 2011-09-20 at 11:12 -0300, Alvaro Herrera wrote:
+1 for a closed mailing list.  It's a bit annoying to have to do
such a thing, but it's not like we haven't got other closed
lists for appropriate purposes.
  
   Well, that much we've already decided a few years ago.  The
   blocking issues are: (1) do we have enough interest, and (2) where
   to put it (I'm looking at you, pgfoundry).
 
  I don't see why we wouldn't put it in @postgresql.org.

 One nice thing about pgfoundry would be the document manager.  Also, at
 least at some point in the past, a pgfoundry project was easier to
 manage than getting anything done about a @postgresql.org mailing list.

 The document manager might be useful, true.  I cannot speak about past
 administrators of the Majordomo installation that serves the
 @postgresql.org lists, though.  For all intents and purposes, it seems
 I'm in charge of it now.

Only seems? :-)



-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sequence locking

2011-09-21 Thread Kevin Grittner
Andres Freund and...@anarazel.de wrote:
 
 - It's impossible to emulate proper locking yourself because
 locking is not allowed for sequences
 
 Any arguments against allowing it again? It seems to have been
 allowed in prehistoric times.
 
It would be nice to allow it.  I've had to create a dummy table just
to use for locking a sequence (by convention).
 
-Kevin



[HACKERS] Remastering using streaming only replication?

2011-09-21 Thread Josh Berkus
Fujii,

I haven't really been following your latest patches about taking backups
from the standby and cascading replication, but I wanted to see if it
fulfills another TODO: the ability to remaster (that is, designate the
lead standby as the new master) without needing to copy WAL files.

Supporting remastering using streaming replication only was on your TODO
list when we closed 9.1.  It seems like this would get solved as a
side-effect, but I wanted to confirm that.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] sequence locking

2011-09-21 Thread Merlin Moncure
On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Andres Freund and...@anarazel.de wrote:

 - It's impossible to emulate proper locking yourself because
 locking is not allowed for sequences

 Any arguments against allowing it again? It seems to have been
 allowed in prehistoric times.

 It would be nice to allow it.  I've had to create a dummy table just
 to use for locking a sequence (by convention).

another (better?) way is advisory locks...

merlin



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Josh Berkus
Robert,

 Josh is arguing that we ought to use the term replication, but it

Actually, no. I'm arguing that we should use the term standby, since
that term is consistent with how we refer to replica servers throughout
the docs, and the term recovery is not.

 seems to me that's just as misleading - maybe moreso, since recovery
 is sufficiently a term of art to make you at least think about reading
 the manual, whereas you know (or think you know) what replication is.

Nope.  What it means is that users see stuff relating to recovery and
say oh, that's not right, the replication stuff must be somewhere else.

I've taught a half-dozen classes on PostgreSQL binary replication now,
and the recovery nomenclature *always* confuses students.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 12:55 PM, Josh Berkus j...@agliodbs.com wrote:
 Josh is arguing that we ought to use the term replication, but it

 Actually, no. I'm arguing that we should use the term standby, since
 that term is consistent with how we refer to replica servers throughout
 the docs, and the term recovery is not.

 seems to me that's just as misleading - maybe moreso, since recovery
 is sufficiently a term of art to make you at least think about reading
 the manual, whereas you know (or think you know) what replication is.

 Nope.  What it means is that users see stuff relating to recovery and
 say oh, that's not right, the replication stuff must be somewhere else.

 I've taught a half-dozen classes on PostgreSQL binary replication now,
 and the recovery nomenclature *always* confuses students.

Yeah, I get it.  But I think standby would confuse them, too, just in
a different set of situations.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Wed, 2011-09-21 at 17:20 +0200, Florian Pflug wrote:
 Hm, so we'd have
 
   '(X,)' for range(X, NULL, '()'),
   '(,X)' for range(NULL, X, '()') and
   '(,)' for range(NULL, NULL, '()').
 
 We'd then have the choice of either declaring
 
   '(X,]' to mean '(X,)',
   '[,X)' to mean '(,X)' and
   '[,]' to mean '(,)'
 
 or to forbid the use of '[' and ']' for unspecified bounds.

Right now, I just canonicalize it to round brackets if infinite. Seems
pointless to reject it, but I can if someone thinks it's better.

 (Leaving out the ',' in the case of only one bound as in my reply to Robert's
 mail somewhere else in this thread doesn't actually work, since it'd be 
 ambiguous whether '(X)' means range(X, NULL, '()') or range(NULL, X, '()').)
 
 One nice property is that, apart from the different brackets used, this
 representation is identical to the one used by records while still avoiding
 the infinity vs. NULL confusion.

OK, I like that. Slightly strange to require quoting empty strings, but
not stranger than the alternatives.

While we're at it, any suggestions on the text representation of an
empty range?

Regards,
Jeff Davis




Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Jeff Davis
On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
 I've thought about this some more, and came to realize that the question
 here really is whether
 
   floatrange(0, 'Infinity'::float, '[)')
 
 and
 
   floatrange(0, NULL, '[)')
 
 are the same thing or not.

The unbounded side of a range is never equal to a value in the data
type's domain, so no, it's not the same.

I think that we pretty much settled on just using an empty string for
infinity in the other thread, right? So that makes this a non-issue.

Regards,
Jeff Davis




Re: [HACKERS] sequence locking

2011-09-21 Thread Kevin Grittner
Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Andres Freund and...@anarazel.de wrote:

  - It's impossible to emulate proper locking yourself because
 locking is not allowed for sequences

 Any arguments against allowing it again? It seems to have been
 allowed in prehistoric times.

 It would be nice to allow it.  I've had to create a dummy table
 just to use for locking a sequence (by convention).
 
 another (better?) way is advisory locks...
 
Not under 9.0 or earlier if you want the lock to last until the end
of the transaction.  Also, the fact that advisory locks are only on
numbers, without any mechanism for mapping those to character
strings, makes them poorly suited to many tasks.
 
-Kevin



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Josh Berkus

 Yeah, I get it.  But I think standby would confuse them, too, just in
 a different set of situations.

Other than PITR, what situations?

PITR is used by a minority of our users.  Binary replication, if not
already used by a majority, will be in the future (it's certainly the
majority of my professional clients).  Further, PITR is usually
something which is either handled by vendor backup management software,
or by professional DBAs, whereas replication is used by developers with
little or no DBA support.

Why should we make terminology obscure for the majority usecase to make
it clear for the minority one?  Especially since the majority use-case
has almost all the newbies?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote:
 Yeah, I get it.  But I think standby would confuse them, too, just in
 a different set of situations.

 Other than PITR, what situations?

Hot backup?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Josh Berkus
On 9/21/11 10:07 AM, Robert Haas wrote:
 On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote:
 Yeah, I get it.  But I think standby would confuse them, too, just in
 a different set of situations.

 Other than PITR, what situations?
 
 Hot backup?

Hot backup == PITR.   You're just not bothering to accumulate WAL logs.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] [PERFORM] Constraint exclusion on UNION ALL subqueries with WHERE conditions

2011-09-21 Thread Tom Lane
Gunnlaugur Þór Briem gunnlau...@gmail.com writes:
 On Monday, September 19, 2011 3:59:30 AM UTC, Tom Lane wrote:
 Works for me in 8.4.8.  Do you have constraint_exclusion set to ON?

 I did try with constraint_exclusion set to on, though the docs suggest 
 "partition" should be enough ("examine constraints only for ... UNION ALL 
 subqueries")

 Here's a minimal test case (which I should have supplied in the original 
 post, sorry), tried just now in 8.4.8:

 CREATE OR REPLACE VIEW v_heavy_view
 AS SELECT (random()*1e5)::integer col
 FROM generate_series(1, 1e6::integer);

 CREATE OR REPLACE VIEW v_test_constraint_exclusion AS
 SELECT col FROM v_heavy_view WHERE col < 3
 UNION ALL SELECT col FROM v_heavy_view WHERE col = 3;

 EXPLAIN SELECT * FROM v_test_constraint_exclusion WHERE col=2;

Hmm.  The reason this particular case doesn't work is that we don't
apply relation_excluded_by_constraints() to functions-in-FROM.
It's only used for plain-table RTEs, not subqueries, functions,
etc.  I suspect the complainant's real case involved an unflattenable
subquery.

Probably the rationale for that coding was that only plain tables
could have CHECK constraints; but the portion of the logic that looks
for mutually contradictory scan constraints could apply to non-table
relations.

Should we change the code to make such checks in these cases?
The default behavior (with constraint_exclusion = partition) would
still be to do nothing extra, but it would add planning expense when
constraint_exclusion = on.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 1:08 PM, Josh Berkus j...@agliodbs.com wrote:
 On 9/21/11 10:07 AM, Robert Haas wrote:
 On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote:
 Yeah, I get it.  But I think standby would confuse them, too, just in
 a different set of situations.

 Other than PITR, what situations?

 Hot backup?

 Hot backup == PITR.   You're just not bothering to accumulate WAL logs.

Well, I don't think of it that way, but YMMV, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sequence locking

2011-09-21 Thread Andres Freund
On Wednesday 21 Sep 2011 19:03:17 Kevin Grittner wrote:
 Merlin Moncure mmonc...@gmail.com wrote:
  On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner
  
  kevin.gritt...@wicourts.gov wrote:
  Andres Freund and...@anarazel.de wrote:
  - It's impossible to emulate proper locking yourself because
  locking is not allowed for sequences
  
  Any arguments against allowing it again? It seems to have been
  allowed in prehistoric times.
  
  It would be nice to allow it.  I've had to create a dummy table
  just to use for locking a sequence (by convention).
  
  another (better?) way is advisory locks...
 
 Not under 9.0 or earlier if you want the lock to last until the end
 of the transaction.  Also, the fact that advisory locks are only on
 numbers, without any mechanism for mapping those to character
 strings, makes them poorly suited to many tasks.
The usual trick is to lock on the oid of some database object. But I agree, 
its a poor workaround for this specific problem.

Andres



Re: [HACKERS] Hot Backup with rsync fails at pg_clog if under load

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 12:22 PM, Euler Taveira de Oliveira
eu...@timbira.com wrote:
 [This question doesn't belong to -hackers. Please post it in -general or
 -admin]

-hackers or -bugs seems appropriate to me; I think this is a bug.

 2.2. pg_start_backup('backup_under_load') on the master (this will take a
 while as master is loaded up);

 No. if you use pg_start_backup('foo', true) it will be fast. Check the
 manual.

If the server is sufficiently heavily loaded that a checkpoint takes a
nontrivial amount of time, the OP is correct that this will be not
fast, regardless of whether you choose to force an immediate
checkpoint.

 2.3. rsync data/global/pg_control to the standby;

 Why are you doing this? If ...

 2.4. rsync all other data/ (without pg_xlog) to the standby;

 you will copy it again or no? Don't understand your point.

His point is that exercising the bug depends on doing the copying in a
certain order.  Any order of copying the data theoretically ought to
be OK, as long as it's all between starting the backup and stopping
the backup, but apparently it isn't.

 The problem could be that the minimum recovery point (step 2.3) is different
 from the end of rsync if you are under load.

It seems pretty clear that some relevant chunk of WAL isn't getting
replayed, but it's not at all clear to me why not.  It seems like it
would be useful to compare the LSN returned by pg_start_backup() with
the location at which replay begins when you fire up the clone.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] sequence locking

2011-09-21 Thread Tom Lane
Kevin Grittner kevin.gritt...@wicourts.gov writes:
 Andres Freund and...@anarazel.de wrote:
 - Its impossible to emulate proper locking yourself because
 locking is not allowed for sequences
 
 Any arguments against allowing it again? It seems to have been
 allowed in prehistoric times.

If you think that it used to be allowed, it'd be a good idea to see
if you can find the archived discussions about changing it.
 
 It would be nice to allow it.  I've had to create a dummy table just
 to use for locking a sequence (by convention).

One question is what you think the lock means.  I believe for example
that taking a non-exclusive regular table lock on a sequence would not
prevent other sessions from doing nextval(); even an exclusive one would
not prevent them from doing so if they had pre-cached values.

regards, tom lane



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Aidan Van Dyk
On Wed, Sep 21, 2011 at 1:13 PM, Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 1:08 PM, Josh Berkus j...@agliodbs.com wrote:
 On 9/21/11 10:07 AM, Robert Haas wrote:
 On Wed, Sep 21, 2011 at 1:03 PM, Josh Berkus j...@agliodbs.com wrote:
 Yeah, I get it.  But I think standby would confuse them, too, just in
 a different set of situations.

 Other than PITR, what situations?

 Hot backup?

 Hot backup == PITR.   You're just not bothering to accumulate WAL logs.

 Well, I don't think of it that way, but YMMV, of course.

I think that the major differentiating factor is the intended action
when caught up, and the definition of caught up, and trying to use a
single term for both of them is going to always cause confusion.

So I tend to think of the use cases by their continuation.  A
slave is intended to continually keep trying to get more once it's
retrieved and applied all the changes it can.  It can be hot, or cold,
streaming, or archive, etc...  And recovery is intended to stop
recovering and become normal once it's finished retrieving and
applying all the changes it can.  Again, it has multiple ways to retrieve
its WAL too.

And I think Tom touched on this point in the
recovery.conf/recovery.done thread a bit too.  Maybe we need to
really start talking about the different when done do ...
distinctions, and using that distinction to help our nomenclature.

Both recovery/slave (whether hot or cold) use the same retrieve/apply
machinery (and thus configuration options).  But because of the
different caught-up actions, they are different features.

a.

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] unite recovery.conf and postgresql.conf

2011-09-21 Thread Aidan Van Dyk
On Wed, Sep 21, 2011 at 1:34 PM, Aidan Van Dyk ai...@highrise.ca wrote:

 And I think Tom touched on this point in the
 recovery.conf/recovery.done thread a bit too.

Doh!  That's this thread

/me slinks away, ashamed for not even taking a close look at the to/cc list...

-- 
Aidan Van Dyk                                             Create like a god,
ai...@highrise.ca                                       command like a king,
http://www.highrise.ca/                                   work like a slave.



Re: [HACKERS] sequence locking

2011-09-21 Thread Andres Freund
On Wednesday 21 Sep 2011 19:24:55 Tom Lane wrote:
 Kevin Grittner kevin.gritt...@wicourts.gov writes:
  Andres Freund and...@anarazel.de wrote:
  - Its impossible to emulate proper locking yourself because
  locking is not allowed for sequences
  
  Any arguments against allowing it again? It seems to have been
  allowed in prehistoric times.
 
 If you think that it used to be allowed, it'd be a good idea to see
 if you can find the archived discussions about changing it.
The message I was thinking about was
http://archives.postgresql.org/pgsql-hackers/2001-10/msg00930.php

  It would be nice to allow it.  I've had to create a dummy table just
  to use for locking a sequence (by convention).
 One question is what you think the lock means.  I believe for example
 that taking a non-exclusive regular table lock on a sequence would not
 prevent other sessions from doing nextval(); even an exclusive one would
 not prevent them from doing so if they had pre-cached values.
I don't see what a non-exclusive lock on a sequence should sensibly do, so I 
don't see a problem with not supporting them.
That already cached values are not affected by the lock seems pretty 
logical to me - and not really problematic.
At least in my cases I would look at last_value from the sequence after 
locking it - which includes the cached values, so it's fine that they can be used. 
The case that somebody already acquired a sequence value that is not visible to 
other sessions has to be taken into account anyway.


Greetings,

Andres



Re: [HACKERS] unaccent contrib

2011-09-21 Thread Euler Taveira de Oliveira

On 21-09-2011 13:28, Daniel Vázquez wrote:

unaccent is compatible with postgresql 8.4 (but it is not in that version's
contrib distribution)


No, it is not. AFAICS it is necessary to add some backend code that is not in 
8.4.


--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



[HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Alvaro Herrera

Hi,

I notice that HeapTupleSatisfiesToast is not setting the
HEAP_XMIN_COMMITTED bit, though it is reading it.  Is there a reason for
this?  It seems to me that it'd make sense to have it set ... unless
it's set by some other routine, somehow?

-- 
Álvaro Herrera alvhe...@alvh.no-ip.org



Re: [HACKERS] sequence locking

2011-09-21 Thread Kevin Grittner
Andres Freund and...@anarazel.de wrote:
 On Wednesday 21 Sep 2011 19:24:55 Tom Lane wrote:
 
 One question is what you think the lock means.  I believe for
 example that taking a non-exclusive regular table lock on a
 sequence would not prevent other sessions from doing nextval();
 even an exclusive one would not prevent them from doing so if
 they had pre-cached values.
 I don't see what a non-exclusive lock on a sequence should
 sensibly do so I don't see a problem with not supporting them.
 That already cached values are not affected by the lock seems to
 be pretty logical to me - and not really problematic.
 At least in my cases I would look at last_value from the sequence
 after locking it - which includes the cached values, so it's fine
 that they can be used. 
 The case that somebody already acquired a sequence value that is not
 visible to other sessions has to be taken into account anyway.
 
I think all of that holds for us, as well.  Our only real use for
this (so far, anyway) is in our trigger-based replication -- a
deferred AFTER INSERT trigger assigns a strictly monotonically
increasing commit number which must match the order of commit.  I
don't see how getting an exclusive lock on the sequence itself could
introduce any bugs which we wouldn't have using a dummy table
created only to serve as a lock target.
 
Given that I can't think of any other uses for this feature, I guess
it would be pretty low on my list of priorities.  As I said earlier,
it would be nice.
 
-Kevin



Re: [HACKERS] sequence locking

2011-09-21 Thread Merlin Moncure
On Wed, Sep 21, 2011 at 12:03 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Merlin Moncure mmonc...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 11:51 AM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Andres Freund and...@anarazel.de wrote:

 - Its impossible to emulate proper locking yourself because
 locking is not allowed for sequences

 Any arguments against allowing it again? It seems to have been
 allowed in prehistoric times.

 It would be nice to allow it.  I've had to create a dummy table
 just to use for locking a sequence (by convention).

 another (better?) way is advisory locks...

 Not under 9.0 or earlier if you want the lock to last until the end
 of the transaction.  Also, the fact that advisory locks are only on
 numbers, without any mechanism for mapping those to character
 strings, makes them poorly suited to many tasks.

hm, this was one of the things I used advisory locks for -- in fact,
not having to hold the lock for the duration of the transaction was
useful for cases of sequence locking.

Basically, you steal the sequence oid for a lock id and wrap nextval()
with an advisory sharelock.  Then, if you need to do some type of
heavy duty operation, like reserve a contiguous block of identifiers,
you can full lock the same lock and block everyone.

If the locks were full transaction locks, that would essentially
serialize all transactions that sharelocked the sequence...no bueno.
So, considering all that, what are the actual use cases for proper
locking of sequence locks (outside of the metadata stuff you can lock
now)?

merlin



Re: [HACKERS] unaccent contrib

2011-09-21 Thread Daniel Vázquez
... ok
No alternatives for unaccent on 8.4?

2011/9/21 Euler Taveira de Oliveira eu...@timbira.com

 On 21-09-2011 13:28, Daniel Vázquez wrote:

 unaccent is compatible with postgresql 8.4 (but it is not in that version's
 contrib distribution)

  No, it is not. AFAICS it is necessary to add some backend code that is
 not in 8.4.


 --
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento





-- 
Daniel Vázquez
SICONET (A Bull Group Company)
Torre Agbar. Avda. Diagonal, 211 - planta 23
08018 - Barcelona
telf: + 34 93 2272727 (Ext. 2952)
fax: + 34 93 2272728
www.bull.es - www.siconet.es
daniel.vazq...@bull.es


Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Robert Haas
On Thu, Sep 15, 2011 at 11:57 AM, Heikki Linnakangas
heikki.linnakan...@enterprisedb.com wrote:
 s/atomic/barrier/

Oops.

 +/*
 + * A compiler barrier need not (and preferably should not) emit any
 actual
 + * machine code, but must act as an optimization fence: the compiler must
 not
 + * reorder loads or stores to main memory around the barrier.  However,
 the
 + * CPU may still reorder loads or stores at runtime, if the
 architecture's
 + * memory model permits this.
 + *
 + * A memory barrier must act as a compiler barrier, and in addition must
 + * guarantee that all loads and stores issued prior to the barrier are
 + * completed before any loads or stores issued after the barrier.  Unless
 + * loads and stores are totally ordered (which is not the case on most
 + * architectures) this requires issuing some sort of memory fencing
 + * instruction.

 This seems like a strange way to explain the problem. I would suggest
 structuring those paragraphs along the lines of:

 
 A PostgreSQL memory barrier guarantees that any loads/stores before the
 barrier are completely finished and visible to other CPUs, before the
 loads/stores after the barrier are performed.

 That involves two things: 1. We must stop the compiler from rearranging
 loads/stores across the barrier. 2. We must stop the CPU from reordering the
 loads/stores across the barrier.
 

That doesn't seem much different than I wrote?

One thing to keep in mind about whatever language we use here is that
memory barrier instructions need not (and often do not) compel the CPU
to completely finish anything before doing the next thing.
Execution is heavily pipelined, and on a sequence like STORE/WRITE
BARRIER/STORE the CPU is perfectly well entitled to begin the second
store before the first one is done.  It just can't let the second
STORE get far enough for other CPUs to see it.
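[Editor's note: a minimal fallback pair along the lines described might look like the sketch below. The macro names mirror the discussion, but the GCC-specific mappings and the publish() helper are illustrative assumptions, not the patch's actual code.]

```c
#include <assert.h>

/* Compiler barrier: emits no machine code but acts as an optimization
 * fence -- the compiler may not reorder loads or stores across it
 * (GCC-style inline asm; other compilers need a different spelling). */
#define pg_compiler_barrier()  __asm__ __volatile__("" ::: "memory")

/* Memory barrier: a compiler barrier plus a CPU fence, so loads and
 * stores issued before it complete before any issued after it.
 * __sync_synchronize() is the GCC/Clang builtin full fence. */
#define pg_memory_barrier()    __sync_synchronize()

static int shared_data;
static int shared_flag;

/* Publish a value: the fence keeps the store to shared_data from being
 * reordered (by compiler or CPU) after the store to shared_flag, so a
 * reader that sees shared_flag == 1 also sees the data. */
void publish(int v)
{
    shared_data = v;
    pg_memory_barrier();
    shared_flag = 1;
}

int read_flag(void) { return shared_flag; }
int read_data(void) { return shared_data; }
```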

 Do we have any use for compiler barriers that are not also memory barriers?
 If not, I would suggest not exposing the pg_compiler_barrier() macro, but
 keep that as an implementation detail in the implementations of
 pg_memory_barrier().

I think there might be some use for a pure compiler barrier, but I'm
not sure yet.  It's probably not worth having a fallback
implementation involving a spinlock, though, because odds are good
that any code that is performance-critical enough to be attempting to
safely use a compiler barrier can't survive having a spinlock
acquire-and-release shoved in there, so any such code should probably
be #ifdef'd.

 Some examples on the correct usage of these barriers would be nice, too.

That's a big topic.  A trivial example is that if you write:

a[*x] = i;
++*x;

...where a and x point into shared memory, another backend
might loop over a from 0 to *x - 1 and accidentally read off the end of
the array.
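[Editor's note: the fix for that fragment can be sketched as below. pg_memory_barrier is assumed to map to a GCC builtin here, and a and x are file-scope stand-ins for the shared-memory pointers.]

```c
#include <assert.h>

/* Stand-in full fence (GCC/Clang builtin); the real macro may differ. */
#define pg_memory_barrier() __sync_synchronize()

#define NELEM 16
int a[NELEM];   /* shared array */
int x;          /* shared count of valid entries */

/* Append i to the shared array.  Without the barrier, the compiler or
 * the CPU could make the incremented count visible before the store to
 * a[x], letting a concurrent reader index an uninitialized slot. */
void append(int i)
{
    assert(x < NELEM);
    a[x] = i;
    pg_memory_barrier();  /* a[x] must be visible before the count bump */
    ++x;
}
```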

But even a full explanation of that case seems like almost too much
for the comment of a header file, and there are certainly far more
complex cases.  I think anyone who is using these primitives is going
to have to do some independent reading...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 But even a full explanation of that case seems like almost too
 much for the comment of a header file, and there are certainly far
 more complex cases.  I think anyone who is using these primitives
 is going to have to do some independent reading...
 
Maybe a URL or two in the header comments, pointing to relevant
papers for the techniques used?  After all, years from now someone
might be familiar with other techniques from newer papers and wonder
what the techniques in the code are based on.
 
-Kevin



Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Merlin Moncure
On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org wrote:

 Hi,

 I notice that HeapTupleSatisfiesToast is not setting the
 HEAP_XMIN_COMMITTED bit, though it is reading it.  Is there a reason for
 this?  It seems to me that it'd make sense to have it set ... unless
 it's set by some other routine, somehow?

I think it's implied from the untoasted row.  Notice that it's not
checking visibility either except in binary upgrade cases.

merlin



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 But even a full explanation of that case seems like almost too
 much for the comment of a header file, and there are certainly far
 more complex cases.  I think anyone who is using these primitives
 is going to have to do some independent reading...

 Maybe a URL or two in the header comments, pointing to relevant
 papers for the techniques used?  After all, years from now someone
 might be familiar with other techniques from newer papers and wonder
 what the techniques in the code are based on.

If there are any academic papers on this topic, I haven't found them.
Mostly, I've found lots of articles written by people who were coding
for the Linux kernel, and a lot of those articles are extremely
Linux-specific (you could use the smp_rmb() macro here, but it's better
to instead use this RCU-related macro, because it'll do it for you,
blah blah).  I'm happy to link to any sources anyone thinks we ought
to link to, but I've mostly had to piece this together bit by bit from
blog posts and (sometimes buggy) examples.  It hasn't been a
particularly thrilling exercise.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Alvaro Herrera

Excerpts from Merlin Moncure's message of mié sep 21 16:02:34 -0300 2011:
 On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org 
 wrote:
 
  Hi,
 
  I notice that HeapTupleSatisfiesToast is not setting the
  HEAP_XMIN_COMMITTED bit, though it is reading it.  Is there a reason for
  this?  It seems to me that it'd make sense to have it set ... unless
  it's set by some other routine, somehow?
 
 I think it's implied from the untoasted row.  Notice that it's not
 checking visibility either except in binary upgrade cases.

Yeah, so toast visibility is highly dependent on the referencing row.
Which makes sense.  But then, if the XMIN_COMMITTED bit is not set,
you're forced to check the other bits every time.  I guess the tradeoff
is that if you set it, the page is dirtied and you're forced to write it
down, which is even worse.

More interesting, however, is the fact that the XMAX_COMMITTED bit is
never set either.  I guess the rows are deleted by a different mechanism
(tuptoaster probably) -- it isn't obvious how this works just by looking
at tqual.c.  It seems to do nothing at all.

-- 
Álvaro Herrera alvhe...@commandprompt.com
The PostgreSQL Company - Command Prompt, Inc.
PostgreSQL Replication, Consulting, Custom Development, 24x7 support



Re: [HACKERS] Inlining comparators as a performance optimisation

2011-09-21 Thread Simon Riggs
On Wed, Sep 21, 2011 at 4:22 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Simon Riggs si...@2ndquadrant.com writes:
 This is a marvellous win, a huge gain from a small, isolated and
 easily tested change. By far the smallest amount of additional code to
 sorting we will have added and yet one of the best gains.

 I think you forgot your cheerleader uniform.

LOL. I'm happy whoever and whenever we get large wins like that.

Go Postgres!

 A patch along these lines
 is not going to be small, isolated, easily maintained, nor beneficial
 for any but a small number of predetermined datatypes.

That was the starting premise.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Merlin Moncure
On Wed, Sep 21, 2011 at 2:14 PM, Alvaro Herrera
alvhe...@commandprompt.com wrote:

 Excerpts from Merlin Moncure's message of mié sep 21 16:02:34 -0300 2011:
 On Wed, Sep 21, 2011 at 1:03 PM, Alvaro Herrera alvhe...@alvh.no-ip.org 
 wrote:
 
  Hi,
 
  I notice that HeapTupleSatisfiesToast is not setting the
  HEAP_XMIN_COMMITTED bit, though it is reading it.  Is there a reason for
  this?  It seems to me that it'd make sense to have it set ... unless
  it's set by some other routine, somehow?

 I think it's implied from the untoasted row.  Notice that it's not
 checking visibility either except in binary upgrade cases.

 Yeah, so toast visibility is highly dependent on the referencing row.
 Which makes sense.  But then, if the XMIN_COMMITTED bit is not set,
 you're forced to check the other bits every time.  I guess the tradeoff
 is that if you set it, the page is dirtied and you're forced to write it
 down, which is even worse.

yeah -- there's no way it's worth the i/o in that case, since there's
no visibility check to protect yourself from.

 More interesting, however, is the fact that the XMAX_COMMITTED bit is
 never set either.  I guess the rows are deleted by a different mechanism
 (tuptoaster probably) -- it isn't obvious how this works just by looking
 at tqual.c.  It seems to do nothing at all.

yup -- probably the only reason it exists at all is vacuum issues
which all visibility routines have to handle. otherwise, it's a fancy
'return true'.

merlin



Re: [HACKERS] WIP: SP-GiST, Space-Partitioned GiST

2011-09-21 Thread Heikki Linnakangas

On 05.09.2011 09:39, Oleg Bartunov wrote:

I attached wrong patch in previous message, sorry ! Here is a last version.


One little detail caught my eye: In spgSplitNodeAction, you call 
SpGistGetBuffer() within a critical section. That should be avoided, 
SpGistGetBuffer() can easily fail if you e.g. run out of disk space.


--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 But even a full explanation of that case seems like almost too
 much for the comment of a header file, and there are certainly
 far more complex cases.  I think anyone who is using these
 primitives is going to have to do some independent reading...

 Maybe a URL or two in the header comments, pointing to relevant
 papers for the techniques used?  After all, years from now
 someone might be familiar with other techniques from newer papers
 and wonder what the techniques in the code are based on.
 
 If there are any academic papers on this topic, I haven't found
 them.  Mostly, I've found lots of articles written by people who
 were coding for the Linux kernel, and a lot of those articles are
 extremely Linux-specific (you could use the smp_rmb() macro here,
 but it's better to instead use this RCU-related macro, because
 it'll do it for you, blah blah).  I'm happy to link to any sources
 anyone thinks we ought to link to, but I've mostly had to piece
 this together bit by bit from blog posts and (sometimes buggy)
 examples.  It hasn't been a particularly thrilling exercise.
 
Well, if it really is that hard to piece together the relevant
techniques, it seems cruel not to check in the results of your
efforts to work it out somewhere.  Perhaps a README file?
 
On the other hand, a search turned up these two papers (which I
haven't yet read, but will):
 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf
 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf
 
On a quick scan, they both look promising in themselves, and
reference a lot of other promising-sounding papers.
 
-Kevin



Re: [HACKERS] unaccent contrib

2011-09-21 Thread Euler Taveira de Oliveira

On 21-09-2011 15:23, Daniel Vázquez wrote:

No alternatives for unaccent on 8.4?


Not that I know of.


--
   Euler Taveira de Oliveira - Timbira   http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento



Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Tom Lane
Alvaro Herrera alvhe...@alvh.no-ip.org writes:
 I notice that HeapTupleSatisfiesToast is not setting the
 HEAP_XMIN_COMMITTED bit, though it is reading it.  Is there a reason for
 this?  It seems to me that it'd make sense to have it set ... unless
 it's set by some other routine, somehow?

Hmm ... I wonder if that has something to do with the performance
complaints we sometimes get about TOAST.  VACUUM would presumably set
the bit, but in a never-vacuumed toast table this could be a problem.

regards, tom lane



Re: [HACKERS] HeapTupleSatisfiesToast not setting XMIN_COMMITTED?

2011-09-21 Thread Tom Lane
Alvaro Herrera alvhe...@commandprompt.com writes:
 More interesting, however, is the fact that the XMAX_COMMITTED bit is
 never set either.  I guess the rows are deleted by a different mechanism
 (tuptoaster probably) -- it isn't obvious how this works just by looking
 at tqual.c.  It seems to do nothing at all.

I have some vague recollection that the only reason we do any xmin/xmax
checking at all for a TOAST tuple is to support tuple motion caused by
old-style VACUUM FULL.  Jan might remember better.

regards, tom lane



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Gurjeet Singh
On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas robertmh...@gmail.com wrote:
  I've been thinking about this too and actually went so far as to do
  some research and put together something that I hope covers most of
  the interesting cases.  The attached patch is pretty much entirely
  untested, but reflects my present belief about how things ought to
  work.

 And, here's an updated version, with some of the more obviously broken
 things fixed.


You declare the dummy_spinlock variable as extern and use it, but it is not
defined anywhere. Wouldn't that be a linker error?

-- 
Gurjeet Singh
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 4:19 PM, Gurjeet Singh singh.gurj...@gmail.com wrote:
 On Wed, Sep 14, 2011 at 4:29 PM, Robert Haas robertmh...@gmail.com wrote:

 On Mon, Aug 8, 2011 at 7:47 AM, Robert Haas robertmh...@gmail.com wrote:
  I've been thinking about this too and actually went so far as to do
  some research and put together something that I hope covers most of
  the interesting cases.  The attached patch is pretty much entirely
  untested, but reflects my present belief about how things ought to
  work.

 And, here's an updated version, with some of the more obviously broken
 things fixed.

 You declare the dummy_spinlock variable as extern and use it, but it is not
 defined anywhere. Wouldn't that be a linker error?

Yeah, we need to add that somewhere, maybe s_lock.c
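[Editor's note: a minimal sketch of the issue and its fix. The slock_t type is simplified to a plain int here, and dummy_spinlock_addr() is a hypothetical helper for illustration.]

```c
#include <assert.h>

/* The patch declares, in a header:
 *
 *     extern volatile int dummy_spinlock;
 *
 * An extern declaration only promises the symbol exists somewhere;
 * unless exactly one translation unit provides the definition below,
 * every use fails at link time with an "undefined reference" error.
 * s_lock.c would be a natural home for the definition: */
volatile int dummy_spinlock;    /* the one definition */

/* Hypothetical accessor, e.g. for a fallback barrier implemented by
 * acquiring and releasing a dummy spinlock. */
volatile int *dummy_spinlock_addr(void)
{
    return &dummy_spinlock;
}
```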

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] unaccent contrib

2011-09-21 Thread Devrim GÜNDÜZ
On Wed, 2011-09-21 at 18:28 +0200, Daniel Vázquez wrote:

 Can the Global Development Group make a cumulative rpm for contrib
 modules that are backward compatible?

No (as the RPM maintainer).
-- 
Devrim GÜNDÜZ
Principal Systems Engineer @ EnterpriseDB: http://www.enterprisedb.com
PostgreSQL Danışmanı/Consultant, Red Hat Certified Engineer
Community: devrim~PostgreSQL.org, devrim.gunduz~linux.org.tr
http://www.gunduz.org  Twitter: http://twitter.com/devrimgunduz


signature.asc
Description: This is a digitally signed message part


Re: [HACKERS] unaccent contrib

2011-09-21 Thread Tom Lane
Euler Taveira de Oliveira eu...@timbira.com writes:
 On 21-09-2011 13:28, Daniel Vázquez wrote:
 unaccent is compatible with postgresql 8.4 (but it is not in that version's
 contrib distribution)

 No, it is not. AFAICS it is necessary to add some backend code that is not in 
 8.4.

[ pokes at it ]  Yeah, you are right.  The version of unaccent that is
in our source tree is a filtering dictionary, and therefore cannot
possibly work with backends older than 9.0 (when the filtering
dictionary feature was added).

So I'm wondering where the OP read that it was compatible with 8.4.
Our own documentation about it certainly does not say that.  It's
possible that Oleg and Teodor had some prototype version, different
from what got committed to our tree, that would work in 8.4.

regards, tom lane



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Robert Haas
On Wed, Sep 21, 2011 at 3:39 PM, Kevin Grittner
kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 On Wed, Sep 21, 2011 at 2:48 PM, Kevin Grittner
 kevin.gritt...@wicourts.gov wrote:
 Robert Haas robertmh...@gmail.com wrote:
 But even a full explanation of that case seems like almost too
 much for the comment of a header file, and there are certainly
 far more complex cases.  I think anyone who is using these
 primitives is going to have to do some independent reading...

 Maybe a URL or two in the header comments, pointing to relevant
 papers for the techniques used?  After all, years from now
 someone might be familiar with other techniques from newer papers
 and wonder what the techniques in the code are based on.

 If there are any academic papers on this topic, I haven't found
 them.  Mostly, I've found lots of articles written by people who
 were coding for the Linux kernel, and a lot of those articles are
extremely Linux-specific (you could use the smp_rmb() macro here,
 but it's better to instead use this RCU-related macro, because
 it'll do it for you, blah blah).  I'm happy to link to any sources
 anyone thinks we ought to link to, but I've mostly had to piece
 this together bit by bit from blog posts and (sometimes buggy)
 examples.  It hasn't been a particularly thrilling exercise.

 Well, if it really is that hard to piece together the relevant
 techniques, it seems cruel not to check in the results of your
 efforts to work it out somewhere.  Perhaps a README file?

I don't know if techniques is the right word.  I mean, there are three
questions that you might want to answer here:

1. I have a new architecture and I want barrier.h to support it.  What
do I need to do?
2. What is the behavior of the various constructs provided by barrier.h?
3. Why would I want that behavior and how can I use it to do cool stuff?

I intended the comment in that file to be enough to answer questions
#1 and #2.  What you and Heikki are asking about is really #3, and
that seems to me to be setting the bar awfully high.  I mean, lwlock.c
explains what a lightweight lock does, but it doesn't explain all of
the ways that a lightweight lock can be used to solve performance
problems, nor should it.  That would be recapitulating the
documentation that is hopefully present everywhere that LWLocks are
used as well as speculating about future applications.  It just
doesn't seem sensible to me to try to enumerate all the ways that a
fundamental primitive can potentially be used down the line.

What I found hard about memory barriers is basically this: I didn't
understand that the CPU is allowed to perform operations out of order.
 And I couldn't understand what the consequences of that fact were.  I
sort of understood - but hadn't really internalized - the idea that
execution is highly pipelined, so the single moment at which an
execution is performed is not well defined.  Before I really got my
head around it, I had to read the explanations of what a memory
barrier actually does over and over again.  I probably read ten
different explanations saying the same thing in different words about
twenty times a piece, and slowly the light dawned.  I did my best to
explain that in the existing comment; I'm happy to expand the comment
if people have suggestions for what to put in there; but basically I
think this is a hard concept and if you haven't done this stuff before
it's going to take a while to get up to speed.

 On the other hand, a search turned up these two papers (which I
 haven't yet read, but will):

 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.97.2397&rep=rep1&type=pdf

 http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.153.6657&rep=rep1&type=pdf

 On a quick scan, they both look promising in themselves, and
 reference a lot of other promising-sounding papers.

I'm not sure these are much help for learning how to program with
memory barriers, but if somebody really wants them (or something else)
included, I'm not going to fight too hard.  I don't expect this to be
perfect the first time through; I am just trying to get a basic
infrastructure in place.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] memory barriers (was: Yes, WaitLatch is vulnerable to weak-memory-ordering bugs)

2011-09-21 Thread Kevin Grittner
Robert Haas robertmh...@gmail.com wrote:
 
 there are three questions that you might want to answer here:
 
 1. I have a new architecture and I want barrier.h to support it. 
What do I need to do?
 2. What is the behavior of the various constructs provided by
barrier.h?
 3. Why would I want that behavior and how can I use it to do cool
stuff?
 
 I intended the comment in that file to be enough to answer
 questions #1 and #2.  What you and Heikki are asking about is
 really #3, and that seems to me to be setting the bar awfully
 high.
 
OK, put that way, I mostly agree.  A general overview of the known
usage patterns in a header or README file still doesn't seem out of
line to me.  With LW locks, for example, I've only seen two patterns
used in PostgreSQL:
 
(1) Get a shared lock for reads and an exclusive lock for writes, or
(2) get a shared lock to read or write your own data, but an
exclusive lock to read anyone else's data.
 
In both cases, there must be a defined order of lock acquisition to
avoid deadlock, with a strong recommendation that locks be released
in the reverse order.  Mentioning that much doesn't preclude other
uses of LW locks, but might help people new to the code.  That's the
level of summary *I* would like to see included.
 
 What I found hard about memory barriers is basically this: I
 didn't understand that the CPU is allowed to perform operations
 out of order.  And I couldn't understand what the consequences of
 that fact were.  I sort of understood - but hadn't really
 internalized - the idea that execution is highly pipelined, so the
 single moment at which an execution is performed is not well
 defined.  Before I really got my head around it, I had to read the
 explanations of what a memory barrier actually does over and over
 again.  I probably read ten different explanations saying the same
 thing in different words about twenty times a piece, and slowly
 the light dawned.  I did my best to explain that in the existing
 comment; I'm happy to expand the comment if people have
 suggestions for what to put in there; but basically I think this
 is a hard concept and if you haven't done this stuff before it's
 going to take a while to get up to speed.
 
That's the sort of thing where it would be helpful to provide one or
two URLs for cogent explanations of this.  Even if it takes repeated
readings and meditations on the explanations for it to sink in, this
is worth it.  (For SSI I had to read the paper many times, and then
go read several referenced papers, before I really had my head
around it, and I've had others say the same thing.  But having a
link to the material gives someone a chance to *do* that.)
 
-Kevin



Re: [HACKERS] EXPLAIN and nfiltered, take two

2011-09-21 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Sep 20, 2011 at 8:19 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I find it useless and probably confusing to put out Rows Removed
 by Recheck Cond: 0 unless we're dealing with a lossy index.

 I don't really see the point of this.  I think printing it always is
 both clear and appropriate; it would be even nicer if we also had a
 line for Rows Rechecked.

I think you'll be singing a different tune as soon as you find out that
every single indexscan in your plans has a generally-useless Rows
Removed line added to it.  The patch as I have it now is doing this:

regression=# explain analyze select * from tenk1 where unique1 = 42;
                                                      QUERY PLAN
----------------------------------------------------------------------------------------------------------------------
 Index Scan using tenk1_unique1 on tenk1  (cost=0.00..8.27 rows=1 width=244) (actual time=0.137..0.153 rows=1 loops=1)
   Index Cond: (unique1 = 42)
   Rows Removed by Index Cond: 0
 Total runtime: 0.635 ms
(4 rows)

and personally I object to a 50% increase in the vertical space occupied
by a plan item in order to tell me not a damn thing.  For an ordinary
btree indexscan, that line WILL NOT EVER be nonzero.  I do not own
monitors large enough to accept that sort of overhead without complaint.

Now the entry *is* useful when considering a lossy indexscan.  I just
don't want it there otherwise.  (Note that I'm not proposing suppressing
such entries from the machine-readable formats, just text format.)

 I think we are getting to the point where EXPLAIN is complex enough
 that it should really have its own chapter in the documentation.

Yeah, the lack of documentation in the submitted patch is another big
problem with it, but it's not especially relevant to arguing about what
the behavior should be.  I'll see about documentation after I'm done
with the code.

 Also, upthread it was argued that we shouldn't measure the effects of
 joinqual filtering.  I don't buy this for a minute, ...

 Yep, I agree.  We should measure everything we possibly can.  I don't
 have a clear idea how this ought to be represented, but leaving it out
 doesn't seem like the right answer.

After playing with it for awhile, I'm inclined to just put out the
measured count and let the user interpret it.  Here are some examples
using table tenk1 from the regression database, with most of the
planner's enable_ flags turned off to force it to produce the same shape
of plan for each case:

regression=# explain analyze select * from tenk1 a join tenk1 b on a.thousand = b.hundred ;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..3172.38 rows=10 width=488) (actual time=0.449..1678.464 rows=10 loops=1)
   Merge Cond: (b.hundred = a.thousand)
   ->  Index Scan using tenk1_hundred on tenk1 b  (cost=0.00..1702.16 rows=1 width=244) (actual time=0.228..117.637 rows=1 loops=1)
   ->  Materialize  (cost=0.00..1727.20 rows=1 width=244) (actual time=0.154..431.251 rows=11 loops=1)
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.00..1702.20 rows=1 width=244) (actual time=0.115..12.033 rows=1001 loops=1)
 Total runtime: 2057.743 ms
(6 rows)

regression=# explain analyze select * from tenk1 a join tenk1 b on a.thousand = b.hundred and a.four = b.ten;
                                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
 Merge Join  (cost=0.00..3422.38 rows=1 width=488) (actual time=0.657..1314.394 rows=2 loops=1)
   Merge Cond: (b.hundred = a.thousand)
   Join Filter: (a.four = b.ten)
   Rows Removed by Join Filter: 8
   ->  Index Scan using tenk1_hundred on tenk1 b  (cost=0.00..1702.16 rows=1 width=244) (actual time=0.395..110.506 rows=1 loops=1)
   ->  Materialize  (cost=0.00..1727.20 rows=1 width=244) (actual time=0.165..390.867 rows=11 loops=1)
         ->  Index Scan using tenk1_thous_tenthous on tenk1 a  (cost=0.00..1702.20 rows=1 width=244) (actual time=0.121..11.680 rows=1001 loops=1)
 Total runtime: 1386.446 ms
(8 rows)

OK, that's pretty straightforward, what about a left join?

regression=# explain analyze select * from tenk1 a left join tenk1 b on a.thousand = b.hundred ;
                                                                    QUERY PLAN

Re: [HACKERS] Double sorting split patch

2011-09-21 Thread Alexander Korotkov
I've processed the results of the tests with the double-sorting split which I
scheduled for buffering build. I've updated the wiki page with them:
http://wiki.postgresql.org/wiki/Fast_GiST_index_build_GSoC_2011#Testing_results
Raw results of the query speed measurements are in the attachment. The number
of page accesses is presented as a function of table, buffering build, and
split method.
Note that the tests were run with an older version of the fast-build patch
(gist_fast_build-heikki-0.14.1.1.patch was used). Build time with buffering
may therefore be better now.

--
With best regards,
Alexander Korotkov.
 dataset | build method           | split method   | predicate                                | page accesses
---------+------------------------+----------------+------------------------------------------+---------------
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box   |           492
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box   |           125
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box     |           876
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box     |          1453
 uniform | buffering (1/366/on)   | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box   |           394
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box   |           333
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box   |           134
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box     |           570
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box     |           720
 uniform | buffering (1/366/auto) | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box   |           520
 uniform | ordinal                | new linear     | point @ '(0.505,0.505),(0.5,0.5)'::box   |           563
 uniform | ordinal                | new linear     | point @ '(0.501,0.501),(0.5,0.5)'::box   |           159
 uniform | ordinal                | new linear     | point @ '(0.51,0.51),(0.5,0.5)'::box     |           917
 uniform | ordinal                | new linear     | point @ '(0.21,0.21),(0.2,0.2)'::box     |           809
 uniform | ordinal                | new linear     | point @ '(0.103,0.903),(0.1,0.9)'::box   |           330
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box   |            43
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box   |             8
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box     |           128
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box     |           120
 uniform | buffering (1/366/on)   | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box   |            18
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box   |            38
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box   |             7
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box     |           119
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box     |           116
 uniform | buffering (1/366/auto) | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box   |            20
 uniform | ordinal                | double sorting | point @ '(0.505,0.505),(0.5,0.5)'::box   |            37
 uniform | ordinal                | double sorting | point @ '(0.501,0.501),(0.5,0.5)'::box   |             7
 uniform | ordinal                | double sorting | point @ '(0.51,0.51),(0.5,0.5)'::box     |           116
 uniform | ordinal                | double sorting | point @ '(0.21,0.21),(0.2,0.2)'::box     |           123
 uniform | ordinal                | double sorting | point @ '(0.103,0.903),(0.1,0.9)'::box   |            19
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(275.8,6.8),(275.5,6.5)'::box   |            70
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(300.8,6.8),(300.5,6.5)'::box   |            80
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(240,-42.9),(239.9,-43)'::box   |            15
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(259,-44.5),(258.9,-45.6)'::box |            89
 usnoa2  | buffering (1/366/on)   | new linear     | point @ '(309.6,27),(309.5,26.9)'::box   |

[HACKERS] citext operator precedence fix

2011-09-21 Thread Josh Berkus
All,

I just tripped over this:

select 'josh'::varchar(32) = 'Josh'::citext;
 ?column?
----------
 f

While it's clear why it's that way, it's not how people would expect
citext to behave.  Users expect case-insensitive text to be
case-insensitive for *all* comparisons, not just for comparisons with
the same data type.  I would also think that folks migrating from SQL
Server and MySQL would get bitten by this.

I'd like to patch the citext contrib module for 9.2 to fix this by
creating four new = operators to establish the comparison function for
text and varchar.

Before I clean up my ad-hoc fix code for submission, are there strong
objections to this idea?  Or are there other data types I'd need to cover?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] citext operator precedence fix

2011-09-21 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 I'd like to patch the citext contrib module for 9.2 to fix this by
 creating four new = operators to establish the comparison function for
 text and varchar.

I think you'll find that's easier said than done (problem 1 is going to
be ambiguity, and problem 2 is going to be that comparisons involving
these operators won't get indexed).

regards, tom lane



Re: [HACKERS] citext operator precedence fix

2011-09-21 Thread Josh Berkus

 I think you'll find that's easier said than done (problem 1 is going to
 be ambiguity, 

Ambiguity?

 and problem 2 is going to be that comparisons involving
 these operators won't get indexed).

Yeah, that's acceptable, since it's not any worse than the behavior of
the comparisons now.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 19:00 , Jeff Davis wrote:
 While we're at it, any suggestions on the text representation of an
 empty range?

My personal favourite would be '0', since it resembles the symbol used
for empty sets in mathematics, and we already decided to use mathematical
notation for ranges.

If we're concerned that most of our users won't get that, then 'empty'
would be a viable alternative I think.

From a consistency POV it'd make sense to use a bracket-based syntax
also for empty ranges. But the only available options would be '()' and '[]',
which are too easily confused with '(,)' and '[,]' (which we already
decided should represent the full range).

best regards,
Florian Pflug




Re: [HACKERS] [v9.2] make_greater_string() does not return a string in some cases

2011-09-21 Thread Robert Haas
On Tue, Sep 13, 2011 at 10:13 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@oss.ntt.co.jp wrote:
 This is rebased patch of `Allow encoding specific character
 incrementer'(https://commitfest.postgresql.org/action/patch_view?id=602).

 Addition to the patch, increment sanity check program for new
 functions pg_generic_charinc and pg_utf8_increment is attached.

I took a look at this patch ... and the thread ... and the previous
thread with the same subject line:

http://archives.postgresql.org/pgsql-bugs/2010-06/msg00303.php

As I understand it, the issue here is that when we try to generate
suitable < and > quals for a LIKE expression, we need to find a string
which is greater than the prefix we're searching for, and the existing
algorithm sometimes fails.  But there seems to be some dispute over
how likely this is to occur.  Tom argues that the case is so rare that
we shouldn't worry about it:

http://archives.postgresql.org/pgsql-bugs/2010-06/msg00336.php

...while Kyotaro Horiguchi clearly feels otherwise, citing the
statistic that about 100 out of 7000 Japanese characters fail to work
properly:

http://archives.postgresql.org/pgsql-bugs/2011-07/msg00064.php

That statistic seems to justify some action, but what?  Ideas:

1. Adopt the patch as proposed, or something like it.
2. Instead of installing encoding-specific character incrementing
functions, we could try to come up with a more reliable generic
algorithm.  Not sure exactly what, though.
3. Come up with some way to avoid needing to do this in the first place.

One random idea I have is - instead of generating < and > clauses,
could we define a prefix match operator - i.e. a ### b iff substr(a,
1, length(b)) = b?  We'd need to do something about the selectivity,
but I don't see why that would be a problem.

Thoughts?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Range Types - typo + NULL string constructor

2011-09-21 Thread Florian Pflug
On Sep21, 2011, at 19:02 , Jeff Davis wrote:
 On Wed, 2011-09-21 at 13:24 +0200, Florian Pflug wrote:
 I've thought about this some more, and came to realize that the question
 here really is whether
 
  floatrange(0, 'Infinity'::float, '[)')
 
 and
 
  floatrange(0, NULL, '[)')
 
 are the same thing or not.
 
 The unbounded side of a range is never equal to a value in the data
 type's domain, so no, it's not the same.
 
 I think that we pretty much settled on just using an empty string for
 infinity in the other thread, right? So that makes this a non-issue.

Agreed.

best regards,
Florian Pflug




Re: [HACKERS] citext operator precedence fix

2011-09-21 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 I think you'll find that's easier said than done (problem 1 is going to
 be ambiguity, 

 Ambiguity?

Yeah, I'm worried about the possibility of parser failing to resolve
which operator is meant.

 and problem 2 is going to be that comparisons involving
 these operators won't get indexed).

 Yeah, that's acceptable, since it's not any worse than the behavior of
 the comparisons now.

No, I don't think so.  For people for whom the right thing is happening,
you'll risk making it (a) wrong and (b) lots slower.  For people for
whom the wrong thing is happening, maybe you'll fix it so it's
semantically right, but if indexes don't work they still won't be happy.

regards, tom lane



Re: [HACKERS] citext operator precedence fix

2011-09-21 Thread Josh Berkus

 Ambiguity?
 
 Yeah, I'm worried about the possibility of parser failing to resolve
 which operator is meant.

Any idea of how to test for that?  What kind of situations would make
things difficult for the parser?

Also, how is this any different for any optional data type which
overrides = ?

 No, I don't think so.  For people for whom the right thing is happening,
 you'll risk making it (a) wrong and (b) lots slower.  

Well, I'm dubious that current behavior is the right thing for anybody.
 The best I could do to answer that would be an informal community survey.

 For people for
 whom the wrong thing is happening, maybe you'll fix it so it's
 semantically right, but if indexes don't work they still won't be happy.

So I'd need to add operator classes and indexing support functions as
well, then, presumably.  Annoying, but not impossible.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com



Re: [HACKERS] Separating bgwriter and checkpointer

2011-09-21 Thread Greg Smith

On 09/20/2011 09:35 AM, Heikki Linnakangas wrote:
Yeah, I was thinking of write()s, not fsyncs. I agree this might have 
some effect during fsync phase.


Right; that's where the most serious problems seem to pop up now anyway. 
All the testing I did earlier this year suggested Linux at least 
is happy to do a granular fsync, and it can also use things like 
barriers when appropriate to schedule I/O.  The hope here is that the 
background writer work to clean ahead of the strategy point is helpful 
to backends, and that should keep going even during the sync 
phase--which currently doesn't pause for anything else once it's 
started.  The cleaner writes should all queue up into RAM in a lazy way 
rather than block the true I/O, which is being driven by sync calls.


There is some risk here that the cleaner writes happen faster than the 
true rate at which backends really need buffers, since it has a 
predictive component it can be wrong about.  Those could in theory 
result in the write cache filling faster than it would in the current 
environment, such that writes truly block that would have been cached in 
the current code.  If you're that close to the edge though, backends 
should really benefit from the cleaner--that same write done by a client 
would turn into a serious stall.  From that perspective, when things 
have completely filled the write cache, any writes the cleaner can get 
out of the way in advance of when a backend needs it should be the 
biggest win most of the time.


--
Greg Smith   2ndQuadrant USg...@2ndquadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support  www.2ndQuadrant.us



