Re: [HACKERS] how to find out whether a view is updatable

2013-06-12 Thread Dean Rasheed
On 12 June 2013 23:01, Tom Lane  wrote:
> Dean Rasheed  writes:
>> [ pg_relation_is_updatable.patch ]
>
> I've committed this with some modifications as mentioned.  There is
> still room to debate exactly what
> information_schema.columns.is_updatable means --- we can now change that
> without an initdb.
>

Thanks. Those modifications all look pretty neat.

I'm inclined to stick with the current definition of what updatable
means in the information schema. I'm not entirely convinced that other
possible interpretations of the spec are any more valid, and it
certainly doesn't seem worth another initdb or a break with backwards
compatibility by changing it. At least we now have functions that can
give a more intuitive result for updatability.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Remove useless USE_PGXS support in contrib

2013-06-12 Thread Peter Eisentraut
This has served no purpose except to

1. take up space
2. confuse users
3. produce broken external extension modules that take contrib as an example
4. break builds of PostgreSQL when users try to fix 3. by exporting USE_PGXS

There is adequate material in the documentation and elsewhere (PGXN) on
how to write extensions and their makefiles, so this is not needed.
---
pursuant to discussion here:
http://www.postgresql.org/message-id/512ceab8.9010...@gmx.net

 contrib/adminpack/Makefile  |6 --
 contrib/auth_delay/Makefile |6 --
 contrib/auto_explain/Makefile   |6 --
 contrib/btree_gin/Makefile  |6 --
 contrib/btree_gist/Makefile |6 --
 contrib/chkpass/Makefile|6 --
 contrib/citext/Makefile |6 --
 contrib/cube/Makefile   |6 --
 contrib/dblink/Makefile |6 --
 contrib/dict_int/Makefile   |6 --
 contrib/dict_xsyn/Makefile  |6 --
 contrib/dummy_seclabel/Makefile |6 --
 contrib/earthdistance/Makefile  |6 --
 contrib/file_fdw/Makefile   |6 --
 contrib/fuzzystrmatch/Makefile  |6 --
 contrib/hstore/Makefile |6 --
 contrib/intagg/Makefile |6 --
 contrib/intarray/Makefile   |6 --
 contrib/isn/Makefile|6 --
 contrib/lo/Makefile |6 --
 contrib/ltree/Makefile  |6 --
 contrib/oid2name/Makefile   |6 --
 contrib/pageinspect/Makefile|6 --
 contrib/passwordcheck/Makefile  |6 --
 contrib/pg_archivecleanup/Makefile  |6 --
 contrib/pg_buffercache/Makefile |6 --
 contrib/pg_freespacemap/Makefile|6 --
 contrib/pg_standby/Makefile |6 --
 contrib/pg_stat_statements/Makefile |6 --
 contrib/pg_test_fsync/Makefile  |6 --
 contrib/pg_test_timing/Makefile |6 --
 contrib/pg_trgm/Makefile|6 --
 contrib/pg_upgrade/Makefile |6 --
 contrib/pg_upgrade_support/Makefile |8 +---
 contrib/pg_xlogdump/Makefile|6 --
 contrib/pgbench/Makefile|6 --
 contrib/pgcrypto/Makefile   |6 --
 contrib/pgrowlocks/Makefile |6 --
 contrib/pgstattuple/Makefile|6 --
 contrib/postgres_fdw/Makefile   |6 --
 contrib/seg/Makefile|6 --
 contrib/sepgsql/Makefile|6 --
 contrib/spi/Makefile|6 --
 contrib/sslinfo/Makefile|6 --
 contrib/tablefunc/Makefile  |6 --
 contrib/tcn/Makefile|6 --
 contrib/test_parser/Makefile|6 --
 contrib/tsearch2/Makefile   |6 --
 contrib/unaccent/Makefile   |6 --
 contrib/uuid-ossp/Makefile  |6 --
 contrib/vacuumlo/Makefile   |6 --
 contrib/worker_spi/Makefile |6 --
 contrib/xml2/Makefile   |6 --
 53 files changed, 1 insertion(+), 319 deletions(-)

diff --git a/contrib/adminpack/Makefile b/contrib/adminpack/Makefile
index 5cbc8f0..074d443 100644
--- a/contrib/adminpack/Makefile
+++ b/contrib/adminpack/Makefile
@@ -7,13 +7,7 @@ PG_CPPFLAGS = -I$(libpq_srcdir)
 EXTENSION = adminpack
 DATA = adminpack--1.0.sql
 
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
 subdir = contrib/adminpack
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/contrib/auth_delay/Makefile b/contrib/auth_delay/Makefile
index 09d2d54..d1c7918 100644
--- a/contrib/auth_delay/Makefile
+++ b/contrib/auth_delay/Makefile
@@ -2,13 +2,7 @@
 
 MODULES = auth_delay
 
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
 subdir = contrib/auth_delay
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/contrib/auto_explain/Makefile b/contrib/auto_explain/Makefile
index 2d1443f..d343572 100644
--- a/contrib/auto_explain/Makefile
+++ b/contrib/auto_explain/Makefile
@@ -3,13 +3,7 @@
 MODULE_big = auto_explain
 OBJS = auto_explain.o
 
-ifdef USE_PGXS
-PG_CONFIG = pg_config
-PGXS := $(shell $(PG_CONFIG) --pgxs)
-include $(PGXS)
-else
 subdir = contrib/auto_explain
 top_builddir = ../..
 include $(top_builddir)/src/Makefile.global
 include $(top_srcdir)/contrib/contrib-global.mk
-endif
diff --git a/contrib/btree_gin/Makefile b/contrib/btree_gin/Makefile
index 09fd3e6..284b745 100644
--- a/contrib/btree_gin/Makefile
+++ b/contrib/btree_gin/Makefile
@@ -11,13 +11,7 @@ REGRESS = install_btree_gin int2 int4 int8 float4 float8 
money oid \
macaddr inet cidr text varchar char

[HACKERS] [PATCH] Add session_preload_libraries configuration parameter

2013-06-12 Thread Peter Eisentraut
This is like shared_preload_libraries except that it takes effect at
backend start and can be changed without a full postmaster restart.  It
is like local_preload_libraries except that it is still only settable by
a superuser.  This can be a better way to load modules such as
auto_explain.

Since there are now three preload parameters, regroup the documentation
a bit.  Put all parameters into one section, explain common
functionality only once, update the descriptions to reflect current and
future realities.
---
This is a reboot of the ideas discussed in
http://www.postgresql.org/message-id/1358251492.401.5.ca...@vanquo.pezone.net
and http://www.postgresql.org/message-id/515357f4.6000...@gmx.net
during the 2013-01 commit fest.

 doc/src/sgml/auto-explain.sgml|5 +-
 doc/src/sgml/config.sgml  |  281 ++---
 src/backend/tcop/postgres.c   |2 +-
 src/backend/utils/init/miscinit.c |6 +-
 src/backend/utils/misc/guc.c  |   19 ++-
 src/include/miscadmin.h   |3 +-
 src/include/utils/guc_tables.h|1 +
 7 files changed, 193 insertions(+), 124 deletions(-)

diff --git a/doc/src/sgml/auto-explain.sgml b/doc/src/sgml/auto-explain.sgml
index 03b2309..cd0d6d8 100644
--- a/doc/src/sgml/auto-explain.sgml
+++ b/doc/src/sgml/auto-explain.sgml
@@ -24,7 +24,8 @@ auto_explain
 
 
   (You must be superuser to do that.)  More typical usage is to preload
-  it into all sessions by including auto_explain in
+  it into some or all sessions by including auto_explain in
+   or
in
   postgresql.conf.  Then you can track unexpectedly slow queries
   no matter when they happen.  Of course there is a price in overhead for
@@ -185,7 +186,7 @@ Configuration Parameters
 
 
 # postgresql.conf
-shared_preload_libraries = 'auto_explain'
+session_preload_libraries = 'auto_explain'
 
 auto_explain.log_min_duration = '3s'
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c7d84b5..b8e766d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1252,66 +1252,6 @@ Kernel Resource Usage

   
  
-
- 
-  shared_preload_libraries 
(string)
-  
-   shared_preload_libraries configuration 
parameter
-  
-  
-   
-This variable specifies one or more shared libraries
-to be preloaded at server start. For example,
-'$libdir/mylib' would cause
-mylib.so (or on some platforms,
-mylib.sl) to be preloaded from the installation's
-standard library directory.
-All library names are converted to lower case unless double-quoted.
-If more than one library is to be loaded, separate their names
-with commas.  This parameter can only be set at server start.
-   
-
-   
-PostgreSQL procedural language
-libraries can be preloaded in this way, typically by using the
-syntax '$libdir/plXXX' where
-XXX is pgsql, perl,
-tcl, or python.
-   
-
-   
-By preloading a shared library, the library startup time is avoided
-when the library is first used.  However, the time to start each new
-server process might increase slightly, even if that process never
-uses the library.  So this parameter is recommended only for
-libraries that will be used in most sessions.
-   
-
- 
-  
-On Windows hosts, preloading a library at server start will not reduce
-the time required to start each new server process; each server process
-will re-load all preload libraries.  However, 
shared_preload_libraries
- is still useful on Windows hosts because some shared 
libraries may
-need to perform certain operations that only take place at postmaster 
start
-(for example, a shared library may need to reserve lightweight locks
-or shared memory and you can't do that after the postmaster has 
started).
-   
-  
-   
-If a specified library is not found,
-the server will fail to start.
-   
-
-   
-Every  PostgreSQL-supported library has a magic
-block that is checked to guarantee compatibility.
-For this reason, non-PostgreSQL libraries cannot be
-loaded in this way.
-   
-  
- 
-
  
 
 
@@ -5494,6 +5434,171 @@ Locale and Formatting
  
 
 
+
+
+ Shared Library Preloading
+
+ 
+  Several settings are available for preloading shared libraries into the
+  server, in order to load additional functionality or achieve performance
+  benefits.  For example, a setting of
+  '$libdir/mylib' would cause
+  mylib.so (or on some platforms,
+  mylib.sl) to be preloaded from the installation's standard
+  library directory.  The differences between the settings are when they
+  take effect and what privileges are required to change them.
+ 
+
+ 
+  PostgreSQL procedural langu

Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Brendan Jurd
On 13 June 2013 04:26, Merlin Moncure  wrote:
> On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane  wrote:
>> Josh Berkus  writes:
>>> On 06/12/2013 11:01 AM, Tom Lane wrote:
 I'm going to be disappointed if all we can get out of this is
 a cardinality() function, and nothing is done about the empty-array
 semantics.

I would be disappointed too, but on the other hand, CARDINALITY is
required by the spec and anything would be better than nothing.

>> Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
>> his was the majority opinion, and in any case there wasn't much
>> consideration given to compromises falling somewhere between "no
>> changes" and the rather drastic solution Brendan proposed.

I'm all for looking into possible compromises, and will happily take
any improvements to this mess I think I can get past the compatibility
maximalist caucus.

>> regression=# select array_dims('{}'::int[]) is null;
>>  ?column?
>> --
>>  t
>> (1 row)
>>
>> Whatever you think the dimensions of that are, surely they're not
>> unknown.

I don't think anyone has actually tried to defend the behaviour of the
array functions w.r.t. empty arrays.  Even the opponents of the
original proposal agreed that the behaviour was silly, they just
didn't want to fix it, on account of the upgrade burden.

> But, couldn't that be solved by deprecating that function and
> providing a more sensible alternatively named version?

And what would you name that function?  array_dims2?  I can't think of
a name that makes the difference in behaviour apparent.  Can you
imagine the documentation for that?

array_dims - Returns the dimensions of the array, unless it is empty
in which case NULL.
array_proper_dims - Returns the dimensions of the array.
array_ndims - Returns the number of dimension, unless it is empty in
which case NULL.
array_proper_ndims - Returns the number of dimensions.

... and so on for _length, _upper and _lower.

Cheers,
BJ


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of "IF NOT EXISTS" to others "CREATE" statements

2013-06-12 Thread Tom Dunstan
On 13 June 2013 04:30, Peter Eisentraut  wrote:

> I'm wondering where "IF NOT EXISTS" and "OR REPLACE" will meet.
>

CREATE OR REPLACE (or ALTER / UPDATE ?) would definitely be useful for
enums, where it would be nice if we could teach an ORM to generate DDL
based on the current values of the enum in code, and know that after the
operation had completed, the database enum type matched the code enum type.
I don't think a sequence of ALTER TYPE ADD VALUE IF NOT EXISTS quite does
the trick, as it doesn't guarantee that the db enum is in the same order as
the code enum, which may or may not be important. I'd expect a CREATE OR
ALTER for enums to raise an error if any of the elements were out of order.

Currently to get to a known state for enums you have to write manual
migration scripts, and while that tends to be how I roll anyway, often when
starting projects in rails / grails / hibernate etc people rely on db
schemas generated by the framework as it lets them prototype with less
mucking around. It would be nice for those frameworks to be able to
generate enum types in a known state.

Cheers

Tom


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Darren Duncan

On 2013.06.12 10:11 AM, Josh Berkus wrote:

Multidim arrays are why we can't have nice things.


Yeah, I think that was not our best moment.  :-(


Actually, if you think about it, we've never had "arrays" in PostgreSQL
... we have always had matrixes.  If you think about things that way,
most of the current functionality makes sense.


If the type system were changed so that arrays were were always just 
one-dimensional, you can define your matrix simply as a binary relation type 
whose primary key attribute has the type of a fixed-length array of integers, 
where the number of elements in the array is the number of dimensions in the 
matrix, and the array elements themselves defined the coordinates in the matrix. 
 This design confers a number of benefits.  Also, the case of the 
zero-dimension matrix needs no special treatment; the key array has zero 
elements.  Would that not work? -- Darren Duncan




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Stephen Frost
Ants,

* Ants Aasma (ants.aa...@eesti.ee) wrote:
> On Jun 13, 2013 4:18 AM, "Stephen Frost"  wrote:
> > * Ants Aasma (a...@cybertec.at) wrote:
> > > In a cluster setting you take the CSN value on the master, then before
> > > starting execution on a standby you wait until that the standby has
> > > replayed enough WAL to reach the CSN point read from the master and
> > > you know that after that everything that the snapshot can see is also
> > > replayed on the standby.
> >
> > This does make a lot of sense- but to clarify, this would only be for
> > certain isolation levels, right?  Or would we implement this for every
> > snapshot taken in a read-committed transaction?
> 
> I don't see a way how snapshots representing different points in time could
> provide sensible results for parallel queries, so this needs to be used for
> all snapshots. 

To be honest, I had really looked at this out of context and was
thinking of it being used with replication and hot-standbys.  I agree
that you'd have to use this for all snapshots if you're using it for
parallel query execution.

> This is why having the capability to request for a snapshot
> that is fresh enough for a specific client but old enough to not require
> replication waits would be a good feature.

That's an interesting concept.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Ants Aasma
On Jun 13, 2013 4:18 AM, "Stephen Frost"  wrote:
> * Ants Aasma (a...@cybertec.at) wrote:
> > In a cluster setting you take the CSN value on the master, then before
> > starting execution on a standby you wait until that the standby has
> > replayed enough WAL to reach the CSN point read from the master and
> > you know that after that everything that the snapshot can see is also
> > replayed on the standby.
>
> This does make a lot of sense- but to clarify, this would only be for
> certain isolation levels, right?  Or would we implement this for every
> snapshot taken in a read-committed transaction?

I don't see a way how snapshots representing different points in time could
provide sensible results for parallel queries, so this needs to be used for
all snapshots. This is why having the capability to request for a snapshot
that is fresh enough for a specific client but old enough to not require
replication waits would be a good feature.

Regards,
Ants Aasma


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Stephen Frost
* Ants Aasma (a...@cybertec.at) wrote:
> In a cluster setting you take the CSN value on the master, then before
> starting execution on a standby you wait until that the standby has
> replayed enough WAL to reach the CSN point read from the master and
> you know that after that everything that the snapshot can see is also
> replayed on the standby.

This does make a lot of sense- but to clarify, this would only be for
certain isolation levels, right?  Or would we implement this for every
snapshot taken in a read-committed transaction?

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Tatsuo Ishii
> On Thu, Jun 13, 2013 at 3:22 AM, Tatsuo Ishii  wrote:
>>> Parallel query execution doesn't require commits to synchronize all
>>> nodes. Parallel execution needs consistent snapshots across all nodes.
>>> In effect this means that nodes need to agree on commit ordering,
>>> either total order or a partial order that accounts for causality.
>>> Most applications also want the guarantee that once they receive
>>> commit confirmation, next snapshot they take will consider their
>>> transaction as committed.
>>>
>>> Coincidentally getting cluster wide consistent snapshots and delaying
>>> until some specific point in commit ordering is almost trivial to
>>> solve with Commit Sequence Number based snapshot scheme that I
>>> proposed.
>>
>> Can you elaborate more on this? Suppose streaming replication primary
>> commits xid = X at time Y. Later on a standy receives WAL including tx
>> X and commit it at time Y + 3 seconds. How can a parallel query
>> execution (which uses snapshot including X) on the standby be delayed
>> until Y + 3 seconds?
> 
> All commits are tagged with a monotonically increasing CSN number in
> the order that they are committed and snapshots read the latest CSN
> value to take notice of what has been committed. When determining
> visibility for a tuple with xmin xid X, you just look up the CSN value
> that X committed with and compare it with the snapshot CSN.  If the
> value is lower, you know it was committed at point in time the
> snapshot was taken, if it is higher or the transaction has not
> committed you know that the transaction was concurrent with or later
> than the snapshot and consequently not visible. This is the core idea,
> everything else in the proposal deals with the technical detail of how
> looking up a CSN value for a xid works.
> 
> In a cluster setting you take the CSN value on the master, then before
> starting execution on a standby you wait until that the standby has
> replayed enough WAL to reach the CSN point read from the master and
> you know that after that everything that the snapshot can see is also
> replayed on the standby.
> 
> The wait for replication can be optimized if the client takes note of
> the CSN that its last transaction committed with and negotiates a new
> snapshot across the cluster that is the same or larger so you only
> need to wait until the point until your specific transaction has been
> replicated. This allows for the replication time to overlap with
> client think time between receiving commit confirmation and taking a
> new snapshot.
> 
> This scheme can almost work now for streaming replication if you
> replace CSN with WAL LSN of the commit record. The issue prohibiting
> it is the fact that visibility order of commits on the master is
> determined by the order that commiters acquire ProcArrayLock, and that
> can be different from the order of WALInsertLock that determines the
> ordering of LSNs, whereas visibility on the slave instance is
> determined purely by WAL LSN order.

Thanks for detailed explanation. The idea of CSN is quite impressive.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Ants Aasma
On Thu, Jun 13, 2013 at 3:22 AM, Tatsuo Ishii  wrote:
>> Parallel query execution doesn't require commits to synchronize all
>> nodes. Parallel execution needs consistent snapshots across all nodes.
>> In effect this means that nodes need to agree on commit ordering,
>> either total order or a partial order that accounts for causality.
>> Most applications also want the guarantee that once they receive
>> commit confirmation, next snapshot they take will consider their
>> transaction as committed.
>>
>> Coincidentally getting cluster wide consistent snapshots and delaying
>> until some specific point in commit ordering is almost trivial to
>> solve with Commit Sequence Number based snapshot scheme that I
>> proposed.
>
> Can you elaborate more on this? Suppose streaming replication primary
> commits xid = X at time Y. Later on a standy receives WAL including tx
> X and commit it at time Y + 3 seconds. How can a parallel query
> execution (which uses snapshot including X) on the standby be delayed
> until Y + 3 seconds?

All commits are tagged with a monotonically increasing CSN number in
the order that they are committed and snapshots read the latest CSN
value to take notice of what has been committed. When determining
visibility for a tuple with xmin xid X, you just look up the CSN value
that X committed with and compare it with the snapshot CSN.  If the
value is lower, you know it was committed at point in time the
snapshot was taken, if it is higher or the transaction has not
committed you know that the transaction was concurrent with or later
than the snapshot and consequently not visible. This is the core idea,
everything else in the proposal deals with the technical detail of how
looking up a CSN value for a xid works.

In a cluster setting you take the CSN value on the master, then before
starting execution on a standby you wait until that the standby has
replayed enough WAL to reach the CSN point read from the master and
you know that after that everything that the snapshot can see is also
replayed on the standby.

The wait for replication can be optimized if the client takes note of
the CSN that its last transaction committed with and negotiates a new
snapshot across the cluster that is the same or larger so you only
need to wait until the point until your specific transaction has been
replicated. This allows for the replication time to overlap with
client think time between receiving commit confirmation and taking a
new snapshot.

This scheme can almost work now for streaming replication if you
replace CSN with WAL LSN of the commit record. The issue prohibiting
it is the fact that visibility order of commits on the master is
determined by the order that commiters acquire ProcArrayLock, and that
can be different from the order of WALInsertLock that determines the
ordering of LSNs, whereas visibility on the slave instance is
determined purely by WAL LSN order.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Tatsuo Ishii
> On Jun 12, 2013 2:02 AM, "Tatsuo Ishii"  wrote:
>> No, I'm not talking about conflict resolution.
>>
>> From 
>> http://www.cs.cmu.edu/~natassa/courses/15-823/F02/papers/replication.pdf:
>> --
>> Eager or Lazy Replication?
>>  Eager replication:
>>  keep all replicas synchronized by updating all
>>  replicas in a single transaction
>>
>>  Lazy replication:
>>  asynchronously propagate replica updates to
>>  other nodes after replicating transaction commits
>> --
>>
>> Parallel query execution needs to assume that each node synchronized
>> in a commit, otherwise the summary of each query result executed on
>> each node is meaningless.
> 
> As far as I can see the lazy-eager terminology is based on a
> multi-master configuration and doesn't really apply for PostgreSQL
> streaming replication.
> 
> Parallel query execution doesn't require commits to synchronize all
> nodes. Parallel execution needs consistent snapshots across all nodes.
> In effect this means that nodes need to agree on commit ordering,
> either total order or a partial order that accounts for causality.
> Most applications also want the guarantee that once they receive
> commit confirmation, next snapshot they take will consider their
> transaction as committed.
> 
> Coincidentally getting cluster wide consistent snapshots and delaying
> until some specific point in commit ordering is almost trivial to
> solve with Commit Sequence Number based snapshot scheme that I
> proposed.

Can you elaborate more on this? Suppose streaming replication primary
commits xid = X at time Y. Later on a standy receives WAL including tx
X and commit it at time Y + 3 seconds. How can a parallel query
execution (which uses snapshot including X) on the standby be delayed
until Y + 3 seconds?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-06-12 Thread Tom Lane
Dean Rasheed  writes:
> The more I read the spec, the less sense it seems to make, and each
> time I read it, I seem to reach a different conclusion.

> On my latest reading, I've almost convinced myself that "updatable" is
> meant to imply support for all 3 operations (INSERT, UPDATE and
> DELETE), at least in the absence of transient tables. The descriptions
> of all 3 seem to require the table to be updatable.

Still, they do admit the possibility of insertable_into being different
from is_updatable.  So I'm pretty happy with what we've got, at least
on the relation level.  Columns seem a bit more debatable; though I
continue to think that an is_updatable column in a not-is_updatable
table isn't contemplated by the spec.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum, Freeze and Analyze: the big picture

2013-06-12 Thread Kevin Grittner
Jeff Janes  wrote:

> Are there other anecdotes about what settings work well in
> practise, assuming people ever find ones that work well?

Putting WAL on its own RAID on its own battery-backed cached can
help a lot more than I would have thought -- even with read-only
transactions.

http://www.postgresql.org/message-id/4b71358e02250002f...@gw.wicourts.gov

--
Kevin Grittner
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-06-12 Thread Dean Rasheed
On 12 June 2013 18:35, Tom Lane  wrote:
> looking at this patch some more ...
>
> Dean Rasheed  writes:
>> One place where I think we have diverged from the spec, however, is in
>> information_schema.columns.updatable. This should be returning 'YES'
>> if the individual column is updatable, and I see no reason for that
>> the require the relation to support DELETE, which is what we currently
>> do (and always have done).
>
> I'm not convinced about this change.  The spec's notion of updatability
> requires both UPDATE and DELETE to be allowed; that's why they don't
> have a separate is_deletable attribute.  And they don't have any such
> thing as a column whose updatability doesn't require updatability of the
> underlying table.  So I think the previous behavior was correct and
> should be maintained: although Postgres does permit decoupling
> deletability from updatability, only tables/columns for which both
> operations are possible should be marked is_updatable in the
> information_schema.  Otherwise, an application relying on the assumption
> that "is_updatable" means it can DELETE will be broken.
>
> I can see however that varying opinions on this are possible.  Although
> I'd removed the separate pg_column_is_updatable() function from your
> patch with the intent of using pg_relation_is_updatable() directly,
> I'm now thinking about putting back the former, so that this decision
> is taken in C code where we can change it without an initdb.
>

The more I read the spec, the less sense it seems to make, and each
time I read it, I seem to reach a different conclusion.

On my latest reading, I've almost convinced myself that "updatable" is
meant to imply support for all 3 operations (INSERT, UPDATE and
DELETE), at least in the absence of transient tables. The descriptions
of all 3 seem to require the table to be updatable. INSERT requires
the table to be insertable-into, updatable and all its columns to be
updatable, but the requirement for insertable-into is only to rule out
transient tables. So if you don't have transient tables, which aren't
insertable-into, then all 3 operations are possible if and only if the
table is updatable.

That interpretation could be used to simplify the API, but no doubt
when I re-read the spec tomorrow, I'll reach a different conclusion.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] how to find out whether a view is updatable

2013-06-12 Thread Tom Lane
Dean Rasheed  writes:
> [ pg_relation_is_updatable.patch ]

I've committed this with some modifications as mentioned.  There is
still room to debate exactly what
information_schema.columns.is_updatable means --- we can now change that
without an initdb.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum, Freeze and Analyze: the big picture

2013-06-12 Thread Andres Freund
On 2013-06-12 14:43:53 -0700, Jeff Janes wrote:
> > The default suggestion that frequently seems to be made is just to
> > disable autovac cost limitations because of that.

> Is there general agreement that this suggestion is bad?  Setting
> autovacuum_vacuum_cost_delay to zero is basically saying "I dare you to do
> your best to destroy my IO performance."  So it is not surprising that this
> just moves one from the frying pan to the fire, or maybe the reverse.

It sure as heck is better than an anti wraparound shutdown every week
because autovacuum doesn't finish all relations in time. More often than
not a large part of the relations has already been frozen so it won't
dirty all that much.
I think it's actually a quite sensible setting in many situations given
the state of the current controls.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum, Freeze and Analyze: the big picture

2013-06-12 Thread Jeff Janes
On Mon, Jun 3, 2013 at 6:42 AM, Andres Freund wrote:

> On 2013-06-02 11:44:04 -0700, Jeff Janes wrote:
> > Do we know why anti-wraparound uses so many resources in the first place?
> >  The default settings seem to be quite conservative to me, even for a
> > system that has only a single 5400 rpm hdd (and even more so for any real
> > production system that would be used for a many-GB database).
>
> I guess the point is that nobody can actually run a bigger OLTP database
> successfully with the default settings. Usually that will end up with a)
> huge amounts of bloat in the tables autovac doesn't scan first b) forced
> shutdowns because autovac doesn't freeze quickly enough.
>

I think that Greg Smith posted elsewhere that 4MB/sec of dirtying (which is
the default) was about right for some of his very busy systems, which seem
like they had pretty impressive IO subsystems.  I was surprised it was so
low.  Are there other anecdotes about what settings work well in practise,
assuming people ever find ones that work well?

Which raises the question, Is the primary problem that there are no
settings that work well for very those systems, or that there usually are
such sweet-spot settings but mere mortals cannot find them?


>
> The default suggestion that frequently seems to be made is just to
> disable autovac cost limitations because of that.
>

Is there general agreement that this suggestion is bad?  Setting
autovacuum_vacuum_cost_delay to zero is basically saying "I dare you to do
your best to destroy my IO performance."  So it is not surprising that this
just moves one from the frying pan to the fire, or maybe the reverse.  (The
small ring buffer used by vacuums might save your bacon if your fsyncs
actually need to hit disk, as the constant stream of fsync requests to the
WAL will act as a secondary throttle).

How about recommending that if autovacuum is not keeping up, that it be
tried to set it to the default divided by the number of spindles?  That may
be overly aggressive, but infinitely less aggressive than setting it to
zero would be.

Cheers,

Jeff


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Claudio Freire
On Wed, Jun 12, 2013 at 6:03 PM, Joshua D. Drake  wrote:
>
>>  Right now you have to be a rocket
>> scientist no matter what configuration you're running.
>
>
> This is quite a bit overblown. Assuming your needs are simple. Archiving is
> at it is now, a relatively simple process to set up, even without something
> like PITRTools.  Where we run into trouble is when they aren't and that is
> ok because we can't solve every problem. We can only provide tools for
> others to solve their particular issue.
>
> What concerns me is we seem to be trying to make this "easy". It isn't
> supposed to be easy. This is hard stuff. Smart people built it and it takes
> a smart person to run it. When did it become a bad thing to be something
> that smart people need to run?
>
> Yes, we need to make it reliable. We don't need to be the Nanny database.


More than easy, it should be obvious.

Obvious doesn't mean easy, it just means what you have to do to get it
right is clearly in front of you. When you give people the freedom of
an "archive command", you also take away any guidance more restricting
options give. I think the point here is that a default would guide
people in how to make this work reliably, without having to rediscover
it every time. A good, *obvious* (not easy) default. Even "cp blah to
NFS mount" is obvious, while not easy (setting up an NFS through
firewalls is never easy).

So, having archive utilities in place of cp would ease the burden of
administration, because it'd be based on collective knowledge. Some
"pg_cp" (or more likely "pg_archive_wal") could check there's enough
space, and whatever else collective knowledge decided is necessary.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Joshua D. Drake


On 06/12/2013 08:49 AM, Robert Haas wrote:


Sure, remote archiving is great, and I'm glad you've been working on
it.  In general, I think that's a cleaner approach, but there are
still enough people using archive_command that we can't throw them
under the bus.


Correct.




I guess archiving to a nfs mount or so isn't too bad, but archiving and
using a cronjob to get the files off is typically a great way to loose data,
and we really shouldn't encourage that by default, Imo.




We certainly not by default but it is also something that can be easy to 
set up reliably if you know what you are doing.




Well, I think what we're encouraging right now is for people to do it
wrong.  The proliferation of complex tools to manage this process
suggests that it is not easy to manage without a complex tool.


No. It suggests that people have more than one requirement that the 
project WILL NEVER be able to solve.


Granted we have solved some of them, for example pg_basebackup. However, 
pg_basebackup isn't really useful for a large database. Multithreaded 
rsync is much more efficient.




 That's
a problem.  And we regularly have users who discover, under a variety
of circumstances, that they've been doing it wrong.  If there's a
better solution than hard-wiring some smarts about local directories,
I'm all ears - but making the simple case just work would still be
better than doing nothing.


Agreed.



 Right now you have to be a rocket
scientist no matter what configuration you're running.


This is quite a bit overblown. Assuming your needs are simple. Archiving 
is at it is now, a relatively simple process to set up, even without 
something like PITRTools.  Where we run into trouble is when they aren't 
and that is ok because we can't solve every problem. We can only provide 
tools for others to solve their particular issue.


What concerns me is we seem to be trying to make this "easy". It isn't 
supposed to be easy. This is hard stuff. Smart people built it and it 
takes a smart person to run it. When did it become a bad thing to be 
something that smart people need to run?


Yes, we need to make it reliable. We don't need to be the Nanny database.

JD

--
Command Prompt, Inc. - http://www.commandprompt.com/  509-416-6579
PostgreSQL Support, Training, Professional Services and Development
High Availability, Oracle Conversion, Postgres-XC, @cmdpromptinc
For my dreams of your image that blossoms
   a rose in the deeps of my heart. - W.B. Yeats


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch to add support of "IF NOT EXISTS" to others "CREATE" statements

2013-06-12 Thread Fabrízio de Royes Mello
On Wed, Jun 12, 2013 at 4:00 PM, Peter Eisentraut  wrote:
>
>
> I'm wondering where "IF NOT EXISTS" and "OR REPLACE" will meet.
>
> For example, why doesn't your list include CREATE FUNCTION?
>
> I have on my personal todo list to add "OR REPLACE" support to CREATE
> AGGREGATE and CREATE OPERATOR.  They are kind of like functions, after
> all, and CREATE OR REPLACE FUNCTION is clearly widely useful.
>
> I suppose both could be useful, but if we're going to make sweeping
> changes, perhaps that should be clarified.
>

I did not include "CREATE FUNCTION" precisely because I had the same doubts.

IMO the "IF NOT EXISTS" and "OR REPLACE" are differents, and can coexists in
the same statements but not used at the same time:

CREATE [ OF REPLACE | IF NOT EXISTS ] FUNCTION ...

I can use "IF NOT EXISTS" to "CREATE" a {FUNCTION | AGGREGATE | OPERATOR}
without replace (OR REPLACE) its definition to just create missing objects
and don't
raise an exception if already exists.


> Btw., I also want REPLACE BUT DO NOT CREATE.

Can you explain more about it?

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Blog sobre TI: http://fabriziomello.blogspot.com
>> Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] Patch to add support of "IF NOT EXISTS" to others "CREATE" statements

2013-06-12 Thread Peter Eisentraut
On 6/12/13 1:29 PM, Fabrízio de Royes Mello wrote:
> The attached patch add support to "IF NOT EXISTS" to "CREATE" statements
> listed below:
> 
> - CREATE AGGREGATE [ IF NOT EXISTS ] ...
> - CREATE CAST [ IF NOT EXISTS ] ...
> - CREATE COLLATION [ IF NOT EXISTS ] ...
> - CREATE OPERATOR [ IF NOT EXISTS ] ...
> - CREATE TEXT SEARCH {PARSER | DICTIONARY | TEMPLATE | CONFIGURATION} [
> IF NOT EXISTS ] ...
> - CREATE TYPE [ IF NOT EXISTS ] ... [AS [{ENUM | RANGE}] (...)]

I'm wondering where "IF NOT EXISTS" and "OR REPLACE" will meet.

For example, why doesn't your list include CREATE FUNCTION?

I have on my personal todo list to add "OR REPLACE" support to CREATE
AGGREGATE and CREATE OPERATOR.  They are kind of like functions, after
all, and CREATE OR REPLACE FUNCTION is clearly widely useful.

I suppose both could be useful, but if we're going to make sweeping
changes, perhaps that should be clarified.

Btw., I also want REPLACE BUT DO NOT CREATE.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Merlin Moncure
On Wed, Jun 12, 2013 at 1:20 PM, Tom Lane  wrote:
> Josh Berkus  writes:
>> On 06/12/2013 11:01 AM, Tom Lane wrote:
>>> I'm going to be disappointed if all we can get out of this is
>>> a cardinality() function, and nothing is done about the empty-array
>>> semantics.
>
>> Well, we can't change the zero-dim behavior without breaking backwards
>> compatibility.  And enough people piled on to say NO to that, that it
>> went by the wayside.
>
> Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
> his was the majority opinion, and in any case there wasn't much
> consideration given to compromises falling somewhere between "no
> changes" and the rather drastic solution Brendan proposed.  For
> instance, it's really hard to believe that this is a good thing:
>
> regression=# select array_dims('{}'::int[]) is null;
>  ?column?
> --
>  t
> (1 row)
>
> Whatever you think the dimensions of that are, surely they're not
> unknown.

But, couldn't that be solved by deprecating that function and
providing a more sensible alternatively named version?

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Tom Lane
Josh Berkus  writes:
> On 06/12/2013 11:01 AM, Tom Lane wrote:
>> I'm going to be disappointed if all we can get out of this is
>> a cardinality() function, and nothing is done about the empty-array
>> semantics.

> Well, we can't change the zero-dim behavior without breaking backwards
> compatibility.  And enough people piled on to say NO to that, that it
> went by the wayside.

Meh.  Robert was pretty vocal about it, but it wasn't clear to me that
his was the majority opinion, and in any case there wasn't much
consideration given to compromises falling somewhere between "no
changes" and the rather drastic solution Brendan proposed.  For
instance, it's really hard to believe that this is a good thing:

regression=# select array_dims('{}'::int[]) is null;
 ?column? 
--
 t
(1 row)

Whatever you think the dimensions of that are, surely they're not
unknown.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Josh Berkus
On 06/12/2013 11:01 AM, Tom Lane wrote:

> Arguably, the only thing wrong with multidim arrays is that they're not
> what the SQL standard asks for.  However, the original point in this
> thread was that we have some very bizarre corner-case behavior for empty
> arrays.  I'm going to be disappointed if all we can get out of this is
> a cardinality() function, and nothing is done about the empty-array
> semantics.

Well, we can't change the zero-dim behavior without breaking backwards
compatibility.  And enough people piled on to say NO to that, that it
went by the wayside.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Tom Lane
Josh Berkus  writes:
> Actually, if you think about it, we've never had "arrays" in PostgreSQL
> ... we have always had matrixes.  If you think about things that way,
> most of the current functionality makes sense.

Arguably, the only thing wrong with multidim arrays is that they're not
what the SQL standard asks for.  However, the original point in this
thread was that we have some very bizarre corner-case behavior for empty
arrays.  I'm going to be disappointed if all we can get out of this is
a cardinality() function, and nothing is done about the empty-array
semantics.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [9.3] Automatically updatable views vs writable foreign tables

2013-06-12 Thread Tom Lane
looking at this patch some more ...

Dean Rasheed  writes:
> One place where I think we have diverged from the spec, however, is in
> information_schema.columns.updatable. This should be returning 'YES'
> if the individual column is updatable, and I see no reason for that
> the require the relation to support DELETE, which is what we currently
> do (and always have done).

I'm not convinced about this change.  The spec's notion of updatability
requires both UPDATE and DELETE to be allowed; that's why they don't
have a separate is_deletable attribute.  And they don't have any such
thing as a column whose updatability doesn't require updatability of the
underlying table.  So I think the previous behavior was correct and
should be maintained: although Postgres does permit decoupling
deletability from updatability, only tables/columns for which both
operations are possible should be marked is_updatable in the
information_schema.  Otherwise, an application relying on the assumption
that "is_updatable" means it can DELETE will be broken.

I can see however that varying opinions on this are possible.  Although
I'd removed the separate pg_column_is_updatable() function from your
patch with the intent of using pg_relation_is_updatable() directly,
I'm now thinking about putting back the former, so that this decision
is taken in C code where we can change it without an initdb.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Josh Berkus

>> Multidim arrays are why we can't have nice things.
> 
> Yeah, I think that was not our best moment.  :-(

They were one of those hacks which looked really clever at the time, but
proved not to be so.  Unfortunately, they *are* useful, and are being
used; I use MD arrays all the time to push data in and out of PL/R., and
now that we have a JSON type I'm using them to generate arrays-of-arrays
using JSON conversion functions.  I'm sure many others are doing the
same, so there's no killing the feature.

Actually, if you think about it, we've never had "arrays" in PostgreSQL
... we have always had matrixes.  If you think about things that way,
most of the current functionality makes sense.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Robert Haas
On Wed, Jun 12, 2013 at 5:05 AM, Brendan Jurd  wrote:
> On 12 June 2013 18:22, Dean Rasheed  wrote:
>> +1 for having a function to return the total number of elements in an
>> array, because that's something that's currently missing from SQL.
>>
>> However, I think that CARDINALITY() should be that function.
>>
>> I'm not convinced that having CARDINALITY() return the length of the
>> first dimension is more spec-compatible, since our multi-dimensional
>> arrays aren't nested arrays, and it seems unlikely that they ever will
>> be. I'd argue that it's at least equally spec-compatible to have
>> CARDINALITY() return the total number of elements in the array, if you
>> think of a multi-dimensional array as a collection of elements
>> arranged in a regular pattern.
>
> It's true that our multidims aren't nested, but they are the nearest
> thing we have.  If we want to keep the door open for future attempts
> to nudge multidim arrays into closer approximation of nested arrays,
> it would be better to have the nested interpretation of CARDINALITY.

I think there's just about zero chance of something like that ever
happening.  The problem is that our type system just can't support it.
 A function or operator that takes an array needs to declare whether
it's going to return an array or whether it's going to return the base
type.  It can't decide to return one or the other at run-time
depending on the dimensionality of the array.

For this to really work, we'd need the number of dimensions to be
baked into the array type.  The obvious implementation would be to
have N array types per base type rather than 1, each with a different
number of dimensions.  Then a subscripting function which took a
1-dimensional array could return anyelement, and the similarly named
function which took a 2-dimensional array could return a 1-dimensional
array.

I believe the reason it wasn't done this way initially was because of
pg_type bloat; having 6 extra type definitions for every type we
support is unappealing.  We could force them to be explicitly declared
as we do for range types.  Or we could rewrite a whole lotta code to
understand a "type" as something more complex than "an OID from
pg_type", so that we don't need pre-defined entries in pg_type for
array types in the first place.

But none of these things are nudges.  Making any real improvement in
this area is going to take major surgery, not a nudge.

> Multidim arrays are why we can't have nice things.

Yeah, I think that was not our best moment.  :-(

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Robert Haas
On Wed, Jun 12, 2013 at 12:07 PM, Peter Eisentraut  wrote:
> On 6/12/13 10:55 AM, Robert Haas wrote:
>> But it's got to be pretty common to archive to a local
>> path that happens to be a remote mount, or to a local directory whose
>> contents are subsequently copied off by a batch job.  Making that work
>> nicely with near-zero configuration would be a significant advance.
>
> Doesn't that just move the problem to managing NFS or batch jobs?  Do we
> want to encourage that?
>
> I suspect that there are actually only about 5 or 6 common ways to do
> archiving (say, local, NFS, scp, rsync, S3, ...).  There's no reason why
> we can't fully specify and/or script what to do in each of these cases.

Go for it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fallocate / posix_fallocate for new WAL file creation (etc...)

2013-06-12 Thread Robert Haas
On Tue, Jun 11, 2013 at 12:58 PM, Stephen Frost  wrote:
> * Merlin Moncure (mmonc...@gmail.com) wrote:
>> It's understood that posix_fallocate is faster at this -- the question
>> on the table is 'does this matter in context of postgres?'.
>> Personally I think this patch should go in regardless -- the concerns
>> made IMNSHO are specious.
>
> I've not had a chance to look at this patch, but I tend to agree with
> Merlin.

I also think this is a good idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Server side lo-funcs name

2013-06-12 Thread Robert Haas
On Tue, Jun 11, 2013 at 10:11 AM, Tatsuo Ishii  wrote:
>> Did you see my email with proposed alternative text?  You didn't even
>> fix the whitespace error I pointed out.
>
> I don't know why but I haven't received your email. I just found the
> email in the web archive and it looks better than what I proposed. Do
> you want to commit it yourself?

Done.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] transforms

2013-06-12 Thread Peter Eisentraut
On 6/12/13 1:20 AM, Josh Berkus wrote:
> Peter, All:
> 
> Does anyone feel like fixing the LOAD issue with transforms?  I haven't
> seen any activity on the patch.

I plan to send in an updated patch.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Peter Eisentraut
On 6/12/13 10:55 AM, Robert Haas wrote:
> But it's got to be pretty common to archive to a local
> path that happens to be a remote mount, or to a local directory whose
> contents are subsequently copied off by a batch job.  Making that work
> nicely with near-zero configuration would be a significant advance.

Doesn't that just move the problem to managing NFS or batch jobs?  Do we
want to encourage that?

I suspect that there are actually only about 5 or 6 common ways to do
archiving (say, local, NFS, scp, rsync, S3, ...).  There's no reason why
we can't fully specify and/or script what to do in each of these cases.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Robert Haas
On Wed, Jun 12, 2013 at 11:32 AM, Magnus Hagander  wrote:
> Wouldn't that encourage people to do local archiving, which is almost always
> a bad idea?

Maybe, but refusing to improve the UI because people might then use
the feature seems wrong-headed.

> I'd rather improve the experience with pg_receivexlog or another way that
> does remote archiving...

Sure, remote archiving is great, and I'm glad you've been working on
it.  In general, I think that's a cleaner approach, but there are
still enough people using archive_command that we can't throw them
under the bus.

> I guess archiving to a nfs mount or so isn't too bad, but archiving and
> using a cronjob to get the files off is typically a great way to loose data,
> and we really shouldn't encourage that by default, Imo.

Well, I think what we're encouraging right now is for people to do it
wrong.  The proliferation of complex tools to manage this process
suggests that it is not easy to manage without a complex tool.  That's
a problem.  And we regularly have users who discover, under a variety
of circumstances, that they've been doing it wrong.  If there's a
better solution than hard-wiring some smarts about local directories,
I'm all ears - but making the simple case just work would still be
better than doing nothing.  Right now you have to be a rocket
scientist no matter what configuration you're running.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Robert Haas
On Sat, Jun 8, 2013 at 7:20 PM, Jeff Janes  wrote:
> If archiving is on and failure is due to no space, could we just keep trying
> XLogFileInit again for a couple minutes to give archiving a chance to do its
> things?  Doing that while holding onto locks and a critical section would be
> unfortunate, but if the alternative is a PANIC, it might be acceptable.

Blech.  I think that's setting our standards pretty low.  It would
neither be possible to use the system nor to shut it down cleanly; I
think the effect would be to turn an immediate PANIC into a
slightly-delayed PANIC, possibly accompanied by some DBA panic.

It seems to me that there are two general ways of approaching this problem.

1. Discover sooner that we're out of space.  Once we've modified the
buffer and entered the critical section, it's too late to have second
thoughts about completing the operation.  If we could guarantee prior
to modifying the buffers that enough WAL space was present to store
the record we're about to write, then we'd be certain not to fail for
this reason.  In theory, this is simple: keep track of how much
uncommitted WAL space we have.  Increment the value when we create new
WAL segments and decrement it by the size of the WAL record we plan to
write.  In practice, it's not so simple.  We don't know whether we're
going to emit FPIs until after we enter the critical section, so the
size of the record can't be known precisely early enough.  We could
think about estimating the space needed conservatively and truing it
up occasionally.  However, there's a second problem: the
available-WAL-space counter would surely become a contention point.

Here's a sketch of a possible solution.  Suppose we know that an
individual WAL record can't be larger than, uh, 64kB.  I'm not sure
there is a limit on the size of a WAL record, but let's say there is,
or we can install one, at around that size limit.  Before we enter a
critical section that's going to write a WAL record, we verify that
the amount of WAL space remaining is at least 64kB * MaxBackends.  If
it's not, we embark on a series of short sleeps, rechecking after each
one; if we hit some time limit, we ERROR out.  As long as every
backend checks this before every WAL record, we can always be sure
there will be at least 64kB left for us.  With this approach, the
shared variable that stores the amount of WAL space remaining only
needs to be updated under WALInsertLock; the "reservation" step only
involves a read.  That might be cheap enough not to matter.

2. Recover from the fact that we ran out of space by backing out the
changes to shared buffers.  Initially, I thought this might be a
promising approach: if we've modified any shared buffers and discover
that we can't log the changes, just invalidate the buffers!  Of
course, it doesn't work, because the buffer might have have already
been dirty when we locked it.  So we'd actually need a way to reverse
out all the changes we were about to log.  That's probably too
expensive to contemplate, in general; and the code would likely get so
little testing as to invite bugs.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Magnus Hagander
On Jun 12, 2013 4:56 PM, "Robert Haas"  wrote:
>
> On Sat, Jun 8, 2013 at 10:36 AM, MauMau  wrote:
> > Yes, I feel designing reliable archiving, even for the simplest case -
copy
> > WAL to disk, is very difficult.  I know there are following three
problems
> > if you just follow the PostgreSQL manual.  Average users won't notice
them.
> > I guess even professional DBAs migrating from other DBMSs won't, either.
> >
> > 1. If the machine or postgres crashes while archive_command is copying
a WAL
> > file, later archive recovery fails.
> > This is because cp leaves a file of less than 16MB in archive area, and
> > postgres refuses to start when it finds such a small archive WAL file.
> > The solution, which IIRC Tomas san told me here, is to do like "cp %p
> > /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".
> >
> > 2. archive_command dumps core when you run pg_ctl stop -mi.
> > This is because postmaster sends SIGQUIT to all its descendants.  The
core
> > files accumulate in the data directory, which will be backed up with the
> > database.  Of course those core files are garbage.
> > archive_command script needs to catch SIGQUIT and exit.
> >
> > 3. You cannot know the reason of archive_command failure (e.g. archive
area
> > full) if you don't use PostgreSQL's server logging.
> > This is because archive_command failure is not logged in
syslog/eventlog.
> >
> >
> > I hope PostgreSQL will provide a reliable archiving facility that is
ready
> > to use.
>
> +1.  I think we should have a way to set an archive DIRECTORY, rather
> than an archive command.  And if you set it, then PostgreSQL should
> just do all of that stuff correctly, without any help from the user.

Wouldn't that encourage people to do local archiving, which is almost
always a bad idea?

I'd rather improve the experience with pg_receivexlog or another way that
does remote archiving...

> Of course, some users will want to archive to a remote machine via ssh
> or rsync or what-have-you, and those users will need to provide their
> own tools.  But it's got to be pretty common to archive to a local
> path that happens to be a remote mount, or to a local directory whose
> contents are subsequently copied off by a batch job.  Making that work
> nicely with near-zero configuration would be a significant advance.

I guess archiving to a nfs mount or so isn't too bad, but archiving and
using a cronjob to get the files off is typically a great way to loose
data, and we really shouldn't encourage that by default, Imo.

/Magnus


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Tatsuo Ishii
> On Sat, Jun 8, 2013 at 10:36 AM, MauMau  wrote:
>> Yes, I feel designing reliable archiving, even for the simplest case - copy
>> WAL to disk, is very difficult.  I know there are following three problems
>> if you just follow the PostgreSQL manual.  Average users won't notice them.
>> I guess even professional DBAs migrating from other DBMSs won't, either.
>>
>> 1. If the machine or postgres crashes while archive_command is copying a WAL
>> file, later archive recovery fails.
>> This is because cp leaves a file of less than 16MB in archive area, and
>> postgres refuses to start when it finds such a small archive WAL file.
>> The solution, which IIRC Tomas san told me here, is to do like "cp %p
>> /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".
>>
>> 2. archive_command dumps core when you run pg_ctl stop -mi.
>> This is because postmaster sends SIGQUIT to all its descendants.  The core
>> files accumulate in the data directory, which will be backed up with the
>> database.  Of course those core files are garbage.
>> archive_command script needs to catch SIGQUIT and exit.
>>
>> 3. You cannot know the reason of archive_command failure (e.g. archive area
>> full) if you don't use PostgreSQL's server logging.
>> This is because archive_command failure is not logged in syslog/eventlog.
>>
>>
>> I hope PostgreSQL will provide a reliable archiving facility that is ready
>> to use.
> 
> +1.  I think we should have a way to set an archive DIRECTORY, rather
> than an archive command.  And if you set it, then PostgreSQL should
> just do all of that stuff correctly, without any help from the user.
> Of course, some users will want to archive to a remote machine via ssh
> or rsync or what-have-you, and those users will need to provide their
> own tools.  But it's got to be pretty common to archive to a local
> path that happens to be a remote mount, or to a local directory whose
> contents are subsequently copied off by a batch job.  Making that work
> nicely with near-zero configuration would be a significant advance.

And there's another example why we need an archive command:

> I'm just setting up pgpool replication on Amazon AWS.
> I'm sending WAL archives to an S3 bucket, which doesn't appear as a
> directory on the server.

From:
http://www.pgpool.net/pipermail/pgpool-general/2013-June/001851.html
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Tom Lane
Simon Riggs  writes:
> On 12 June 2013 01:35, Tom Lane  wrote:
>> On the whole, I think the effort would be a lot more usefully spent on
>> trying to make the existing NUMERIC support go faster.

> Did you have a specific idea in mind? Or an area of investigation?

As I said further up, we should look at Intel's library and see if we
can learn anything that would help the NUMERIC code.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Claudio Freire
On Wed, Jun 12, 2013 at 11:55 AM, Robert Haas  wrote:
>> I hope PostgreSQL will provide a reliable archiving facility that is ready
>> to use.
>
> +1.  I think we should have a way to set an archive DIRECTORY, rather
> than an archive command.  And if you set it, then PostgreSQL should
> just do all of that stuff correctly, without any help from the user.
> Of course, some users will want to archive to a remote machine via ssh
> or rsync or what-have-you, and those users will need to provide their
> own tools.  But it's got to be pretty common to archive to a local
> path that happens to be a remote mount, or to a local directory whose
> contents are subsequently copied off by a batch job.  Making that work
> nicely with near-zero configuration would be a significant advance.


That, or provide a standard archive command that takes the directory
as argument?

I bet we have tons of those available among us users...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Tom Lane
Craig Ringer  writes:
> On 06/12/2013 02:24 PM, Tom Dunstan wrote:
>> Oh, interesting. Do the ruby/rails folks use that rather than a pure-ruby
>> driver? I guess I'm spoiled - most of my development happens on the JVM,
>> and the JDBC driver doesn't use libpq.

> Yes, they do - including a horde of deeply confused and frustrated Rails
> users struggling to understand why they're getting "no such file or
> directory" or "permission denied" messages about Pg's unix socket,
> because of course they're linked to Apple's libpq which has a different
> default unix socket path, and unless they explicitly specify `host:
> localhost` in their Rails database.yml they get a unix socket connection.

I poked at this a little bit, wondering whether it'd be practical to fix
the problem by configuring third-party postmasters to create a socket
where Apple's libpq is expecting.  However, it looks like (in Lion
anyway) what libpq is expecting is this:

$ /usr/bin/psql -l
psql: could not connect to server: Permission denied
Is the server running locally and accepting
connections on Unix domain socket "/var/pgsql_socket/.s.PGSQL.5432"?

and that directory is configured with no public permissions at all:

$ ls -ld /var/pgsql_socket
drwxr-x---  2 _postgres  _postgres  68 Jun 13  2011 /var/pgsql_socket/

which basically means that no third-party code should ever be expecting
to communicate with a postmaster through there.

This being the case, I wonder if the Ruby PG gem shouldn't be written
to override the default socket location, which it could do with
something like

if (getenv("PGHOST") == NULL)
putenv("PGHOST=/tmp");

without needing to muck with interpretation of connection strings.
Of course, if third-party packagings of PG aren't consistent about
where they think the socket goes, we won't know what to put there...

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hard limit on WAL space used (because PANIC sucks)

2013-06-12 Thread Robert Haas
On Sat, Jun 8, 2013 at 10:36 AM, MauMau  wrote:
> Yes, I feel designing reliable archiving, even for the simplest case - copy
> WAL to disk, is very difficult.  I know there are following three problems
> if you just follow the PostgreSQL manual.  Average users won't notice them.
> I guess even professional DBAs migrating from other DBMSs won't, either.
>
> 1. If the machine or postgres crashes while archive_command is copying a WAL
> file, later archive recovery fails.
> This is because cp leaves a file of less than 16MB in archive area, and
> postgres refuses to start when it finds such a small archive WAL file.
> The solution, which IIRC Tomas san told me here, is to do like "cp %p
> /archive/dir/%f.tmp && mv /archive/dir/%f.tmp /archive/dir/%f".
>
> 2. archive_command dumps core when you run pg_ctl stop -mi.
> This is because postmaster sends SIGQUIT to all its descendants.  The core
> files accumulate in the data directory, which will be backed up with the
> database.  Of course those core files are garbage.
> archive_command script needs to catch SIGQUIT and exit.
>
> 3. You cannot know the reason of archive_command failure (e.g. archive area
> full) if you don't use PostgreSQL's server logging.
> This is because archive_command failure is not logged in syslog/eventlog.
>
>
> I hope PostgreSQL will provide a reliable archiving facility that is ready
> to use.

+1.  I think we should have a way to set an archive DIRECTORY, rather
than an archive command.  And if you set it, then PostgreSQL should
just do all of that stuff correctly, without any help from the user.
Of course, some users will want to archive to a remote machine via ssh
or rsync or what-have-you, and those users will need to provide their
own tools.  But it's got to be pretty common to archive to a local
path that happens to be a remote mount, or to a local directory whose
contents are subsequently copied off by a batch job.  Making that work
nicely with near-zero configuration would be a significant advance.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Simon Riggs
On 12 June 2013 01:35, Tom Lane  wrote:

> On the whole, I think the effort would be a lot more usefully spent on
> trying to make the existing NUMERIC support go faster.

Did you have a specific idea in mind? Or an area of investigation?

--
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Ants Aasma
On Jun 12, 2013 2:02 AM, "Tatsuo Ishii"  wrote:
> No, I'm not talking about conflict resolution.
>
> From http://www.cs.cmu.edu/~natassa/courses/15-823/F02/papers/replication.pdf:
> --
> Eager or Lazy Replication?
>  Eager replication:
>  keep all replicas synchronized by updating all
>  replicas in a single transaction
>
>  Lazy replication:
>  asynchronously propagate replica updates to
>  other nodes after replicating transaction commits
> --
>
> Parallel query execution needs to assume that each node synchronized
> in a commit, otherwise the summary of each query result executed on
> each node is meaningless.

As far as I can see the lazy-eager terminology is based on a
multi-master configuration and doesn't really apply for PostgreSQL
streaming replication.

Parallel query execution doesn't require commits to synchronize all
nodes. Parallel execution needs consistent snapshots across all nodes.
In effect this means that nodes need to agree on commit ordering,
either total order or a partial order that accounts for causality.
Most applications also want the guarantee that once they receive
commit confirmation, next snapshot they take will consider their
transaction as committed.

Coincidentally getting cluster wide consistent snapshots and delaying
until some specific point in commit ordering is almost trivial to
solve with Commit Sequence Number based snapshot scheme that I
proposed.

Regards,
Ants Aasma
-- 
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improvement of checkpoint IO scheduler for stable transaction responses

2013-06-12 Thread Robert Haas
On Mon, Jun 10, 2013 at 3:48 PM, Simon Riggs  wrote:
> On 10 June 2013 11:51, KONDO Mitsumasa  wrote:
>> I create patch which is improvement of checkpoint IO scheduler for stable
>> transaction responses.
>
> Looks like good results, with good measurements. Should be an
> interesting discussion.

+1.

I suspect we want to poke at the algorithms a little here and maybe
see if we can do this without adding new GUCs.  Also, I think this is
probably two separate patches, in the end.  But the direction seems
good to me.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Andres Freund
On 2013-06-12 08:48:39 -0400, Stephen Frost wrote:
> * Magnus Hagander (mag...@hagander.net) wrote:
> > On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund  
> > wrote:
> > > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> > >> The attached patch fixes this problem. It just changes walsender so that 
> > >> it
> > >> waits for all the outstanding WAL records to be replicated to the standby
> > >> before closing the replication connection.
> > >
> > > Imo this is a fix that needs to get backpatched... The code tried to do
> > > this but failed, I don't think it really gives grounds for valid *new*
> > > concerns.
> > 
> > +1 (without having looked at the code itself, it's definitely a
> > behaviour that needs to be fixed)
> 
> Yea, I was also thinking it would be reasonable to backpatch this; it
> really looks like a bug that we're allowing this to happen today.
> 
> So, +1 on a backpatch for me.  I've looked at the patch (it's a
> one-liner, plus some additional comments) but havn't looked through the
> overall code surrounding it.

I've read most of the surrounding code and I think the patch is as
sensible as it can be without reworking the whole walsender main loop
which seems like a job for another day.

I'd personally write
  if (caughtup && !pq_is_send_pending() &&
  sentPtr == MyWalSnd->flush)
as
  if (caughtup && sentPtr == MyWalSnd->flush &&
  !pq_is_send_pending())

Since pq_is_send_pending() basically can only be false if the flush
comparison is true. There's the tiny chance that we were sending a
message out just before which is why we should include the
!pq_is_send_pending() condition at all in that if().

But that's fairly, fairly minor.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund  wrote:
> > On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> >> The attached patch fixes this problem. It just changes walsender so that it
> >> waits for all the outstanding WAL records to be replicated to the standby
> >> before closing the replication connection.
> >
> > Imo this is a fix that needs to get backpatched... The code tried to do
> > this but failed, I don't think it really gives grounds for valid *new*
> > concerns.
> 
> +1 (without having looked at the code itself, it's definitely a
> behaviour that needs to be fixed)

Yea, I was also thinking it would be reasonable to backpatch this; it
really looks like a bug that we're allowing this to happen today.

So, +1 on a backpatch for me.  I've looked at the patch (it's a
one-liner, plus some additional comments) but havn't looked through the
overall code surrounding it.

Thanks,

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Andres Freund
On 2013-06-12 19:47:46 +0800, Craig Ringer wrote:
> On 06/12/2013 05:55 PM, Greg Stark wrote:
> > On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer  
> > wrote:
> >> The main thing I'm wondering is how/if to handle backward compatibility 
> >> with
> >> the existing NUMERIC and its DECIMAL alias
> > If it were 100% functionally equivalent you could just hide the
> > implementation internally. Have a bit that indicates which
> > representation was stored and call the right function depending.
> 
> That's what I was originally wondering about, but as Tom pointed out it
> won't work. We'd still need to handle scale and precision greater than
> that offered by _Decimal128 and wouldn't know in advance how much
> scale/precision they wanted to preserve. So we'd land up upcasting
> everything to NUMERIC whenever we did anything with it anyway, only to
> then convert it back into the appropriate fixed size decimal type for
> storage.

Well, you can limit the "upcasting" to the cases where we would exceed
the precision.

> Pretty pointless, and made doubly so by the fact that if we're
> not using a nice fixed-width type and have to support VARLENA we miss
> out on a whole bunch of performance benefits.

I rather doubt that using a 1byte varlena - which it will be for
reasonably sized Datums - will be a relevant bottleneck here. Maybe if
you only have 'NOT NULL', fixed width columns, but even then...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Magnus Hagander
On Wed, Jun 12, 2013 at 1:48 PM, Andres Freund  wrote:
> Hi,
>
> On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
>> The attached patch fixes this problem. It just changes walsender so that it
>> waits for all the outstanding WAL records to be replicated to the standby
>> before closing the replication connection.
>
> Imo this is a fix that needs to get backpatched... The code tried to do
> this but failed, I don't think it really gives grounds for valid *new*
> concerns.

+1 (without having looked at the code itself, it's definitely a
behaviour that needs to be fixed)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Andres Freund
Hi,

On 2013-06-12 07:53:29 +0900, Fujii Masao wrote:
> The attached patch fixes this problem. It just changes walsender so that it
> waits for all the outstanding WAL records to be replicated to the standby
> before closing the replication connection.

Imo this is a fix that needs to get backpatched... The code tried to do
this but failed, I don't think it really gives grounds for valid *new*
concerns.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Craig Ringer
On 06/12/2013 05:55 PM, Greg Stark wrote:
> On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer  wrote:
>> The main thing I'm wondering is how/if to handle backward compatibility with
>> the existing NUMERIC and its DECIMAL alias
> If it were 100% functionally equivalent you could just hide the
> implementation internally. Have a bit that indicates which
> representation was stored and call the right function depending.

That's what I was originally wondering about, but as Tom pointed out it
won't work. We'd still need to handle scale and precision greater than
that offered by _Decimal128 and wouldn't know in advance how much
scale/precision they wanted to preserve. So we'd land up upcasting
everything to NUMERIC whenever we did anything with it anyway, only to
then convert it back into the appropriate fixed size decimal type for
storage. Pretty pointless, and made doubly so by the fact that if we're
not using a nice fixed-width type and have to support VARLENA we miss
out on a whole bunch of performance benefits.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clean switchover

2013-06-12 Thread Magnus Hagander
On Wed, Jun 12, 2013 at 6:41 AM, Amit Kapila  wrote:
> On Wednesday, June 12, 2013 4:23 AM Fujii Masao wrote:
>> Hi,
>>
>> In streaming replication, when we shutdown the master, walsender tries
>> to send all the outstanding WAL records including the shutdown
>> checkpoint record to the standby, and then to exit. This basically
>> means that all the WAL records are fully synced between two servers
>> after the clean shutdown of the master. So, after promoting the standby
>> to new master, we can restart the stopped master as new standby without
>> the need for a fresh backup from new master.
>>
>> But there is one problem: though walsender tries to send all the
>> outstanding WAL records, it doesn't wait for them to be replicated to
>> the standby. IOW, walsender closes the replication connection as soon
>> as it sends WAL records.
>> Then, before receiving all the WAL records, walreceiver can detect the
>> closure of connection and exit. We cannot guarantee that there is no
>> missing WAL in the standby after clean shutdown of the master. In this
>> case, backup from new master is required when restarting the stopped
>> master as new standby. I have experienced this case several times,
>> especially when enabling WAL archiving.
>>
>> The attached patch fixes this problem. It just changes walsender so
>> that it waits for all the outstanding WAL records to be replicated to
>> the standby before closing the replication connection.
>>
>> You may be concerned the case where the standby gets stuck and the
>> walsender keeps waiting for the reply from that standby. In this case,
>> wal_sender_timeout detects such inactive standby and then walsender
>> ends. So even in that case, the shutdown can end.
>
> Do you think it can impact time to complete shutdown?
> After completing shutdown, user will promote standby to master, so if there
> is delay in shutdown, it can cause delay in switchover.

I'd expect a controlled switchover to happen without dataloss. Yes,
this could make it take a bit longer time, but it guarantees you don't
loose data. ISTM that if you don't care about the potential dataloss,
you can just use a faster shutdown method (e.g. immediate)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] JSON and unicode surrogate pairs

2013-06-12 Thread Andrew Dunstan


On 06/12/2013 12:13 AM, Craig Ringer wrote:

On 06/12/2013 08:42 AM, Andrew Dunstan wrote:

If we work by analogy to Postgres' own handling of Unicode escapes,
we'll raise an error on any Unicode escape beyond ASCII (not on input
for legacy reasons, but on trying to process such datums). I gather that
would meet your objection.

I could live with that if eager validation on input was the default, but
could be disabled by setting (say) compat_lazy_json_validation = on .
I don't like the idea of leaving us saddled with weak validation just
that's what we've got. It's been an ongoing source of pain as UTF-8
support has improved and I'd really like a way to avoid semi-valid JSON
making it into the DB and causing similar problems.



I think it's rather too late in the cycle to be proposing new GUCs. We 
can revisit this for 9.4 perhaps.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Greg Stark
On Wed, Jun 12, 2013 at 12:56 AM, Craig Ringer  wrote:
> The main thing I'm wondering is how/if to handle backward compatibility with
> the existing NUMERIC and its DECIMAL alias

If it were 100% functionally equivalent you could just hide the
implementation internally. Have a bit that indicates which
representation was stored and call the right function depending.


-- 
greg


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Brendan Jurd
On 12 June 2013 18:22, Dean Rasheed  wrote:
> +1 for having a function to return the total number of elements in an
> array, because that's something that's currently missing from SQL.
>
> However, I think that CARDINALITY() should be that function.
>
> I'm not convinced that having CARDINALITY() return the length of the
> first dimension is more spec-compatible, since our multi-dimensional
> arrays aren't nested arrays, and it seems unlikely that they ever will
> be. I'd argue that it's at least equally spec-compatible to have
> CARDINALITY() return the total number of elements in the array, if you
> think of a multi-dimensional array as a collection of elements
> arranged in a regular pattern.

It's true that our multidims aren't nested, but they are the nearest
thing we have.  If we want to keep the door open for future attempts
to nudge multidim arrays into closer approximation of nested arrays,
it would be better to have the nested interpretation of CARDINALITY.
Given what we've just gone through with array_length, it seems that
once we select a behaviour for CARDINALITY, we will be stuck with it
permanently.

The problem with thinking of our multidim arrays as just a weirdly
crumpled arrangement of a single collection, is that we've already
abused the nesting syntax for declaring them.

> Also, the spec describes CARDINALITY() and UNNEST() using the same
> language, and I think it's implicit in a couple of places that
> CARDINALITY() should match the number of rows returned by UNNEST(),
> which we've already implemented as fully unnesting every element.
>
> We're about to add ORDINALITY to UNNEST(), and to me it would be very
> odd to have the resulting maximum ordinality exceed the array's
> cardinality.

Yeah, that makes sense.  Well the good news is that either way,
CARDINALITY will do what people want in the most common case where the
array is one-dimensional.

Multidim arrays are why we can't have nice things.

Cheers,
BJ


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Tom Dunstan
On 12 June 2013 17:30, Dave Page  wrote:

> Messing with the path (or the dynamic load path) can cause all sorts
> of fun and interesting problems for users, as we found in the early
> days with the EDB installers. I realise it doesn't help these users
> (who doubtless don't know it exists) but what we do these days is drop
> a "pg_env.sh" file in the installation directory that the user can
> source to set their PATH and various PG* environment variables when
> they need/want to.
>

Well, I was imagining something like a helpful dialog box saying "would you
like me to fix your path? I'm just going to source
/Applications/Postgres.app/env.sh in your bash_profile" and the user can
click "ok" or "no thanks I'll do it myself". It might lead to even more
confusion, but it's got to be better than the pg gem silently linking
against the wrong libpq and subsequently failing in interesting ways.

Of course, if they've already installed the pg gem then it's too late
anyway, but at least reinstalling it would then work.

Blech. The more I think about it, the more I like the idea of libpq bundled
with the gem.

Cheers

Tom


Re: [HACKERS] [PATCH] Exorcise "zero-dimensional" arrays (Was: Re: Should array_length() Return NULL)

2013-06-12 Thread Dean Rasheed
On 12 June 2013 04:50, David E. Wheeler  wrote:
> On Jun 11, 2013, at 3:09 PM, Brendan Jurd  wrote:
>
>> There have been attempts to add a cardinality function in the past, as
>> it is required by the SQL spec, but these attempts have stalled when
>> trying to decide how it should handle multidim arrays.  Having it
>> return the length of the first dimension is the more spec-compatible
>> way to go, but some folks argued that it should work as
>> ArrayGetNItems, because we don't already have a function for that at
>> the SQL level.  Therefore I propose we add cardinality() per the spec,
>> and another function to expose ArrayGetNItems.
>>
>> And that's about where we got to, when the whole discussion was put on
>> a time-out to make room for the beta.
>>
>> I am withdrawing the original zero-D patch in favour of the proposed
>> new functions.  If you have an opinion about that, please do chime in.
>> Depending on how that goes I may post a patch implementing my new
>> proposal in the next few days.
>
> +1 to this proposal. Modulo function names, perhaps. I don’t much care what 
> they're called, as long as the work as you describe here.
>

+1 for having a function to return the total number of elements in an
array, because that's something that's currently missing from SQL.

However, I think that CARDINALITY() should be that function.

I'm not convinced that having CARDINALITY() return the length of the
first dimension is more spec-compatible, since our multi-dimensional
arrays aren't nested arrays, and it seems unlikely that they ever will
be. I'd argue that it's at least equally spec-compatible to have
CARDINALITY() return the total number of elements in the array, if you
think of a multi-dimensional array as a collection of elements
arranged in a regular pattern.

Also, the spec describes CARDINALITY() and UNNEST() using the same
language, and I think it's implicit in a couple of places that
CARDINALITY() should match the number of rows returned by UNNEST(),
which we've already implemented as fully unnesting every element.

We're about to add ORDINALITY to UNNEST(), and to me it would be very
odd to have the resulting maximum ordinality exceed the array's
cardinality.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Dave Page
On Wed, Jun 12, 2013 at 7:24 AM, Tom Dunstan  wrote:
>
> Another alternative is for the Postgres.app to add its bin dir to the user's
> (or system's) path on first startup. Then the correct pg_config will be
> found (and the correct psql, pgdump etc etc as well). The app could in
> theory even go looking for existing pg gem installed under .rvm or whatever
> and prompt the user to reinstall the gem.

Messing with the path (or the dynamic load path) can cause all sorts
of fun and interesting problems for users, as we found in the early
days with the EDB installers. I realise it doesn't help these users
(who doubtless don't know it exists) but what we do these days is drop
a "pg_env.sh" file in the installation directory that the user can
source to set their PATH and various PG* environment variables when
they need/want to.


--
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Tom Dunstan
On 12 June 2013 16:30, Craig Ringer  wrote:

> None of this is hard if you have  clue what you're doing. Rebuild the Pg
> gem against the right libpq by fixing your PATH so it finds the right
> pg_config, set host=/tmp, or set host=localhost. Any of the three will
> work. Unfortunately most of these users seem to struggle with that, and
> their approach to "it didn't work" appears to be "find another
> tool/tutorial and try that instead".
>

So we need an "official" tutorial? But which distribution would we point
people to? :)


> The postgres.app documentation its self doesn't look quite right when it
> comes to Ruby, actually. For Ruby/Rails it says the user should use "gem
> install pg" but it doesn't tell them to set the PATH first, so they'll get
> whatever random pg_config is on the PATH first, often Apple's elderly Pg
> with its different socket directory path, etc. Sure, they can get around
> that just by setting host: localhost, but it'd be nice to see that improved
> so it tells them how to build their Pg gem against the correct libpq. Or,
> better, has Postgres.app automatically install it for them when they
> install it.
>

Hmm, but where to install it? People using rvm or bundler will have their
gems tucked away in a variety of places.

Cheers

Tom


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Tom Dunstan
On 12 June 2013 16:12, Craig Ringer  wrote:

> Yes, they do - including a horde of deeply confused and frustrated Rails
> users struggling to understand why they're getting "no such file or
> directory" or "permission denied" messages about Pg's unix socket,
> because of course they're linked to Apple's libpq which has a different
> default unix socket path, and unless they explicitly specify `host:
> localhost` in their Rails database.yml they get a unix socket connection.
>

Maybe we could ask the rails people to stick a "host: localhost" into their
postgresql examples? Might help a bit, and most users at that level won't
need the absolutely most up-to-date libpq. Of course, there are probably
hundreds of tutorials all over the net that won't have the fix.


> Another option would be to have to explicitly allow use of Apple's
> PostgreSQL (based on known install paths) though; think
> "--use-system-postgresql".
>

Yeah, or --allow-old-libpq or something like that. How will the gem
installer know if the pg_config that it has found is a system one or not?
Going by version number is probably easier.


> I get the strong impression from what I've been reading that a fairly
> typical Rails user setup is:
>
> * Install homebrew
> * Install PostgreSQL using homebrew but don't start it
> * Build the Pg gem against homebrew postgresql's libpq
> * Download and run postgres.app
> * Run your Pg gem using the libpq from homebrew against the postgres.app
> server
>
> Ugh.
>

Hmm. Seems like all the more reason to steer people away from socket-based
comms.


> Good point... though that also raises more concerns regarding consumers
> of the Pg library. And extensions, for that matter; if extensions are
> out-of-tree you need versioned subdirectories, otherwise you'll have
> conflicts between 9.2 and 9.3 (for example) versions of the same
> extensions.
>

Right. I was picturing something like
~/Library/Postgres.app/9.2/extensions. We shouldn't be breaking extensions
within a major release.


> It's also another issue with libpq. User upgrades Postgres.app and
> suddenly their Ruby gems stop working with some linkage error they
> probably don't understand.
>
> (All this is, IMO, really a lesson in why Apple should introduce a
> non-awful packaging system into OS X).


Well, I'm not holding my breath on their packaging changing anytime soon. :)

I wonder if a better approach might be to actually have the gem bundle its
own copy of libpq. Then there's no question of linkage errors when the
server on the system changes, and if users are using tcp rather than unix
sockets, they should be pretty well insulated from those sorts of issues.
Just specify the right port and you're good to go.

Is libpg buildable without building the whole tree? Is it downloadable
without downloading the whole distribution? Hmm.

Cheers

Tom


Re: [HACKERS] [PATCH] pgbench --throttle (submission 7 - with lag measurement)

2013-06-12 Thread Fabien COELHO


Did you look at the giant latency spikes at the end of the test run I 
submitted the graph for?  I wanted to nail down what was causing those 
before worrying about the startup timing.


If you are still worried: if you run the very same command without 
throttling and measure the same latency, does the same thing happens at 
the end? My guess is that it should be "yes". If it is no, I'll try out 
pgbench-tools.


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding IEEE 754:2008 decimal floating point and hardware support for it

2013-06-12 Thread Thomas Munro
On 12 June 2013 00:56, Craig Ringer  wrote:

> The main thing I'm wondering is how/if to handle backward compatibility
> with the existing NUMERIC and its DECIMAL alias, or whether adding new
> DECIMAL32, DECIMAL64, and DECIMAL128 types would be more appropriate. I'd
> love to just use the SQL standard types name DECIMAL if possible, and the
> standard would allow for it (see below), but backward compat would be a
> challenge, as would coming up with a sensible transparent promotion scheme
> from 32->64->128->numeric and ways to stop undesired promotion.
>

For what it's worth, DB2 9.5 and later call these types DECFLOAT(16) and
DECFLOAT(34), and they are distinct from DECIMAL/NUMERIC.

http://www.ibm.com/developerworks/data/library/techarticle/dm-0801chainani/


Re: [HACKERS] Parallell Optimizer

2013-06-12 Thread Yeb Havinga

On 2013-06-07 19:09, Fred&Dani&Pandora&Aquiles wrote:
I asked a while ago in this group about the possibility to implement a 
parallel planner in a multithread way, and  the replies were that the 
proposed approach couldn't be implemented, because the postgres is not 
thread-safe. With the new feature Background Worker Processes, such 
implementation would be possible? If yes, do you can see possible 
problems in implement this approach, for example, the bgprocess can't 
access some planning core functions like make_join_rel, acess them in 
parallel and so on. I want start a research to work in a parallel 
planner in postgres, I succeeded in in the DBMS H2, but my first 
option still is the postgres, and any help is welcome.


The topic has been researched and experimented with on PostgreSQL 8.3, 
described in a vldb-2008 paper called Parallelizing Query Optimization, 
available on http://www.vldb.org/pvldb/1/1453882.pdf


regards,
Yeb

PS: apologies for redundant posting.



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Configurable location for extension .control files

2013-06-12 Thread Craig Ringer
On 06/12/2013 02:52 PM, Peter Geoghegan wrote:
> On Tue, Jun 11, 2013 at 11:42 PM, Craig Ringer  wrote:
>> Anyway, point being that PostgreSQL from Macports, Homebrew, and/or
>> EnterpriseDB's installer might be present ... and even in use.
> Perhaps you should direct those users towards http://postgresapp.com

Well, at that point they're usually already in a bit of a mess that they
don't know how to get themselves out of. They're often *also* attempting
to use Postgres.app, but struggling with issues with unix socket
directory defaults, etc. The postgres.app docs do appear to offer some
useful basic guidance for users on how to uninstalll whatever they
might've installed.

None of this is hard if you have  clue what you're doing. Rebuild the Pg
gem against the right libpq by fixing your PATH so it finds the right
pg_config, set host=/tmp, or set host=localhost. Any of the three will
work. Unfortunately most of these users seem to struggle with that, and
their approach to "it didn't work" appears to be "find another
tool/tutorial and try that instead".

Sure, they're not my (or your) problem. I'd still like to see usability
in this area improve if it's possible.

The postgres.app documentation its self doesn't look quite right when it
comes to Ruby, actually. For Ruby/Rails it says the user should use "gem
install pg" but it doesn't tell them to set the PATH first, so they'll
get whatever random pg_config is on the PATH first, often Apple's
elderly Pg with its different socket directory path, etc. Sure, they can
get around that just by setting host: localhost, but it'd be nice to see
that improved so it tells them how to build their Pg gem against the
correct libpq. Or, better, has Postgres.app automatically install it for
them when they install it.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services