Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread Martijn van Oosterhout
On Mon, Jun 02, 2014 at 01:29:25PM -0400, Robert Haas wrote:
 I agree, but I think it's important to note that Alex's complaint is
 not unique - the way things work now is a real source of frustration
 for users.  In a previous job, I wrote a schema-upgrade script that
 dropped all of the views in reverse creation order, applied the schema
 updates, and then recreated all the views. This worked, but it was a
 lot of hassle that I would have preferred to avoid, and in a
 higher-volume application, simultaneously grabbing exclusive locks on
 a large number of critical views would have been a non-starter.  In
 the job before that, I did the same thing manually, which was no fun
 at all.  This was actually what prompted me to write one of my first
 patches, committed by Bruce as
 ff1ea2173a92dea975d399a4ca25723f83762e55.

Would it be sufficient to automatically pass the type change through
only if nothing in the view actually references it in a function,
operator, group by, order by, etc?  That is, it only appears in the
SELECT list unadorned?  Or is that too limiting?
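
For instance, with a hypothetical table and pair of views such as

    CREATE TABLE foo (a int);
    CREATE VIEW v_plain  AS SELECT a FROM foo;             -- column appears unadorned
    CREATE VIEW v_sorted AS SELECT a FROM foo ORDER BY a;  -- column referenced in ORDER BY

the idea would be that ALTER TABLE foo ALTER COLUMN a TYPE bigint could be
passed through v_plain automatically, while v_sorted would still be refused
(or require confirmation).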

Have a nice day,
-- 
Martijn van Oosterhout   klep...@svana.org   http://svana.org/kleptog/
 He who writes carelessly confesses thereby at the very outset that he does
 not attach much importance to his own thoughts.
   -- Arthur Schopenhauer




Re: [HACKERS] 9.4 release notes

2014-06-03 Thread Peter Geoghegan
On Sun, May 4, 2014 at 5:46 AM, Bruce Momjian br...@momjian.us wrote:
 Feedback expected and welcomed.

One item currently reads "Improve valgrind error reporting".  I
suggest this be changed to "Add support for Valgrind memcheck memory
error detector". It was possible to use the tool before, but the lack
of cooperation from Postgres made this far less useful.

-- 
Peter Geoghegan




[HACKERS] Memory deallocation after calling cast function

2014-06-03 Thread Soroosh Sardari
Hi

I have a problem with memory deallocation. Look at the following queries:

1- create table test01(a) as select generate_series(1,1)::int8 ;

2- create table test02(a) as select generate_series(1,1) ;

During execution of the first query, memory usage increases rapidly until the
transaction comes to an end and all the memory allocated with palloc is
deallocated.
I wondered why all the memory is deallocated only at the end, so I removed the
cast and executed the query again. Memory usage was not the same: it grew
very slowly.

I need help finding the right point to deallocate the memory.
Any ideas would be appreciated.

Soroosh Sardari


Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-03 Thread Magnus Hagander
On Mon, Jun 2, 2014 at 4:07 PM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-06-02 22:59:55 +0900, Fujii Masao wrote:
  On Thu, May 29, 2014 at 3:02 PM, Peter Geoghegan p...@heroku.com wrote:
   On Wed, May 28, 2014 at 10:49 PM, Fujii Masao masao.fu...@gmail.com
 wrote:
   You're concerned about the scenario using pg_upgrade? I'm not sure
 the detail
   of pg_upgrade. But if it doesn't work properly, we should have gotten
   the trouble
  
   I'm not worried about pg_upgrade, because by design pg_stat_statements
   will discard stats files that originated in earlier versions. However,
   I don't see a need to change pg_stat_statements to serialize its
   statistics to disk in the pg_stat directory before we branch off 9.4.
   As you mentioned, it's harmless.
 
  Yeah, that's an idea. OTOH, there is no *strong* reason to postpone
  the fix to 9.5. So I just feel inclined to apply the fix now...

 +1 for fixing it now.


+1 for fixing it for 9.4 before the next beta, but *not* backpatching it to
9.3 - it *is* a behaviour change after all..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Andres Freund

Hi,

On 2014-06-03 12:43:28 +1000, Haribabu Kommi wrote:
 *** a/src/test/regress/expected/create_function_3.out
 --- b/src/test/regress/expected/create_function_3.out
 ***
 *** 153,389  RESET SESSION AUTHORIZATION;
   SELECT proname, prorettype::regtype, proargtypes::regtype[]
  FROM pg_proc JOIN pg_namespace ON pronamespace = pg_namespace.oid
  WHERE nspname = 'pg_catalog' AND proleakproof ORDER BY proname;
 ! proname | prorettype | proargtypes  

 ! 
 ++-
 !  abstimeeq  | boolean| [0:1]={abstime,abstime}
...
 !  xideq  | boolean| [0:1]={xid,xid}
 ! (228 rows)
   
   --
   -- CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT
 --- 153,391 
   SELECT proname, prorettype::regtype, proargtypes::regtype[]
  FROM pg_proc JOIN pg_namespace ON pronamespace = pg_namespace.oid
  WHERE nspname = 'pg_catalog' AND proleakproof ORDER BY proname;
 !  proname | prorettype | proargtypes 
 
 ! 
 -++-
 !  abstimeeq   | boolean| [0:1]={abstime,abstime}
...
 !  xideq   | boolean| [0:1]={xid,xid}
 ! (230 rows)

I didn't really look at the patch, but it very much looks to me like that
query result could use the \a\t treatment that rules.sql and
sanity_check.sql got. It's hard to see the actual difference
before/after the patch.
I'll patch that now, to reduce the likelihood of changes there causing
conflicts for more people.
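
For reference, the \a\t treatment just means toggling unaligned, tuples-only
output around the query, so each function ends up on its own line in the
expected output; roughly:

    \a\t
    SELECT proname, prorettype::regtype, proargtypes::regtype[]
      FROM pg_proc JOIN pg_namespace ON pronamespace = pg_namespace.oid
     WHERE nspname = 'pg_catalog' AND proleakproof ORDER BY proname;
    \a\t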

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Proposing pg_hibernate

2014-06-03 Thread Robert Haas
On Thu, May 29, 2014 at 12:12 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 IMHO, all of these caveats, would affect a very small fraction of
 use-cases and are eclipsed by the benefits this extension provides in
 normal cases.

 I agree with you that there are only few corner cases where evicting
 shared buffers by this utility would harm, but was wondering if we could
 even save those, say if it would only use available free buffers.  I think
 currently there is no such interface and inventing a new interface for this
 case doesn't seem to reasonable unless we see any other use case of
 such a interface.

It seems like it would be best to try to do this at cluster startup
time, rather than once recovery has reached consistency.  Of course,
that might mean doing it with a single process, which could have its
own share of problems.  But I'm somewhat inclined to think that if
recovery has already run for a significant period of time, the blocks
that recovery has brought into shared_buffers are more likely to be
useful than whatever pg_hibernate would load.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposing pg_hibernate

2014-06-03 Thread Gurjeet Singh
On Tue, Jun 3, 2014 at 7:57 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 29, 2014 at 12:12 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 IMHO, all of these caveats, would affect a very small fraction of
 use-cases and are eclipsed by the benefits this extension provides in
 normal cases.

 I agree with you that there are only few corner cases where evicting
 shared buffers by this utility would harm, but was wondering if we could
 even save those, say if it would only use available free buffers.  I think
 currently there is no such interface and inventing a new interface for this
 case doesn't seem to reasonable unless we see any other use case of
 such a interface.

 It seems like it would be best to try to do this at cluster startup
 time, rather than once recovery has reached consistency.  Of course,
 that might mean doing it with a single process, which could have its
 own share of problems.  But I'm somewhat inclined to think that if
 recovery has already run for a significant period of time, the blocks
 that recovery has brought into shared_buffers are more likely to be
 useful than whatever pg_hibernate would load.

I am not absolutely sure of the order of execution between recovery
process and the BGWorker, but ...

For a sizeable shared_buffers setting, the restoration of the shared
buffers can take several seconds. I have a feeling users wouldn't
like their master database taking up to a few minutes to start accepting
connections. From my tests [1], in the 'App after Hibernator' case,
... this took 70 seconds for reading the ~4 GB database.

[1]: 
http://gurjeet.singh.im/blog/2014/04/30/postgres-hibernator-reduce-planned-database-down-times/

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread Robert Haas
On Mon, Jun 2, 2014 at 10:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 I can see two answers.  Answer #1 is
 that the column type of bar.a changes from int to bigint and the view
 definition is still SELECT a FROM foo.  In that case, showing the user
 the SQL does not help them see and approve semantic changes because
 the SQL is completely unchanged.

 Yeah, we need some way of highlighting the semantic differences, and just
 printing ruleutils.c output doesn't do that.  But if the user is going to
 put in a change to whatever choice the tool makes by default here,
 I would expect that change to consist of adding (or removing) an explicit
 cast in the SQL-text view definition.  We can't make people learn some
 random non-SQL notation for this.

 Perhaps the displayed output of the tool could look something like

 CREATE VIEW bar AS
   SELECT
 a  -- this view output column will now be of type int8 not int4
   FROM foo;

 Or something else; I don't claim to be a good UI designer.  But in the
 end, this is 90% a UI problem, and that means that raw SQL is seriously
 poorly suited to solve it directly.

I guess I don't agree that this is 90% a UI problem.  There's currently no
mechanism whatsoever by means of which a user can change the data type
of a column upon which a view depends.  If we had such a mechanism,
then perhaps someone could build a UI providing the sort of user
feedback you're suggesting to help them use it more safely.  But isn't
the core server support the first thing?
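
To illustrate, with a hypothetical table and view (the exact error text may
vary by version):

    CREATE TABLE foo (a int);
    CREATE VIEW bar AS SELECT a FROM foo;
    ALTER TABLE foo ALTER COLUMN a TYPE bigint;
    ERROR:  cannot alter type of a column used by a view or rule
    DETAIL:  rule _RETURN on view bar depends on column "a"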

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Vik Fearing
This patch implements a timeout for broken clients that idle in transaction.

This is a TODO item.

When the timeout occurs, the backend commits suicide with a FATAL
ereport.  I thought about just aborting the transaction to free the
locks but decided that the client is clearly broken so might as well
free up the connection as well.

The same behavior can be achieved by an external script that monitors
pg_stat_activity but by having it in core we get much finer tuning (it
is session settable) and also traces of it directly in the PostgreSQL
logs so it can be graphed by things like pgbadger.
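
For comparison, the external-script approach amounts to periodically running
something like (illustrative threshold):

    SELECT pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE state = 'idle in transaction'
       AND state_change < now() - interval '10 minutes';

whereas with this patch a session can simply SET idle_in_transaction_timeout
to the desired number of seconds.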

Unfortunately, no notification is sent to the client because there's no
real way to do that right now.

-- 
Vik

*** a/contrib/postgres_fdw/connection.c
--- b/contrib/postgres_fdw/connection.c
***
*** 343,348  configure_remote_session(PGconn *conn)
--- 343,356 
  		do_sql_command(conn, "SET extra_float_digits = 3");
  	else
  		do_sql_command(conn, "SET extra_float_digits = 2");
+ 
+ 	/*
+ 	 * Ensure the remote server doesn't kill us off if we remain idle in a
+ 	 * transaction for too long.
+ 	 * XXX: The version number must be corrected prior to commit!
+ 	 */
+ 	if (remoteversion >= 90400)
+ 		do_sql_command(conn, "SET idle_in_transaction_timeout = 0");
  }
  
  /*
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 5545,5550  COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
--- 5545,5566 
    </listitem>
   </varlistentry>
  
+  <varlistentry id="guc-idle-in-transaction-timeout" xreflabel="idle_in_transaction_timeout">
+   <term><varname>idle_in_transaction_timeout</varname> (<type>integer</type>)
+   <indexterm>
+    <primary><varname>idle_in_transaction_timeout</> configuration parameter</primary>
+   </indexterm>
+   </term>
+   <listitem>
+    <para>
+     Terminate any session that is idle in transaction for longer than the specified
+     number of seconds.  This not only allows any locks to be released, but also allows
+     the connection slot to be reused.  However, aborted idle in transaction sessions
+     are not affected.  A value of zero (the default) turns this off.
+    </para>
+   </listitem>
+  </varlistentry>
+ 
   <varlistentry id="guc-vacuum-freeze-table-age" xreflabel="vacuum_freeze_table_age">
    <term><varname>vacuum_freeze_table_age</varname> (<type>integer</type>)
    <indexterm>
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***
*** 676,682  ERROR:  could not serialize access due to read/write dependencies among transact
   <listitem>
    <para>
     Don't leave connections dangling <quote>idle in transaction</quote>
!    longer than necessary.
    </para>
   </listitem>
   <listitem>
--- 676,684 
   <listitem>
    <para>
     Don't leave connections dangling <quote>idle in transaction</quote>
!    longer than necessary.  The configuration parameter
!    <xref linkend="guc-idle-in-transaction-timeout"> may be used to
!    automatically disconnect lingering sessions.
    </para>
   </listitem>
   <listitem>
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
***
*** 57,62 
--- 57,63 
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
  int			LockTimeout = 0;
+ int			IdleInTransactionTimeout = 0;
  bool		log_lock_waits = false;
  
  /* Pointer to this process's PGPROC and PGXACT structs, if any */
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***
*** 3937,3942  PostgresMain(int argc, char *argv[],
--- 3937,3946 
  			{
  				set_ps_display("idle in transaction", false);
  pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
+ 
+ /* Start the idle-in-transaction timer, in seconds */
+ 	if (IdleInTransactionTimeout > 0)
+ 		enable_timeout_after(IDLE_IN_TRANSACTION_TIMEOUT, 1000*IdleInTransactionTimeout);
  			}
  			else
  			{
***
*** 3970,3976  PostgresMain(int argc, char *argv[],
  		DoingCommandRead = false;
  
  		/*
! 		 * (5) check for any other interesting events that happened while we
  		 * slept.
  		 */
  		if (got_SIGHUP)
--- 3974,3986 
  		DoingCommandRead = false;
  
  		/*
! 		 * (5) turn off the idle-in-transaction timeout
! 		 */
! 		if (IdleInTransactionTimeout > 0)
! 			disable_timeout(IDLE_IN_TRANSACTION_TIMEOUT, false);
! 
! 		/*
! 		 * (6) check for any other interesting events that happened while we
  		 * slept.
  		 */
  		if (got_SIGHUP)
***
*** 3980,3986  PostgresMain(int argc, char *argv[],
  		}
  
  		/*
! 		 * (6) process the command.  But ignore it if we're skipping till
  		 * Sync.
  		 */
  		if (ignore_till_sync && firstchar != EOF)
--- 3990,3996 
  		}
  
  		/*
! 		 * (7) process the command.  But ignore it if we're skipping till
  		 * Sync.
  		 */
  		if (ignore_till_sync && firstchar != EOF)
*** a/src/backend/utils/errcodes.txt
--- b/src/backend/utils/errcodes.txt

Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Abhijit Menon-Sen
At 2014-06-03 15:06:11 +0200, vik.fear...@dalibo.com wrote:

 This patch implements a timeout for broken clients that idle in
 transaction.

I think this is a nice feature, but I suggest that (at the very least)
the GUC should be named idle_transaction_timeout.

 +para
 + Terminate any session that is idle in transaction for longer than 
 the specified
 + number of seconds.  This not only allows any locks to be released, 
 but also allows
 + the connection slot to be reused.  However, aborted idle in 
 transaction sessions
 + are not affected.  A value of zero (the default) turns this off.
 +/para

I suggest:

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

It's not clear to me what "However, aborted idle in transaction sessions
are not affected" means.

The default value of 0 means that such sessions will not be
terminated.

-- Abhijit




Re: [HACKERS] Proposal for CSN based snapshots

2014-06-03 Thread Robert Haas
On Fri, May 30, 2014 at 2:38 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 (forgot to answer this question)

 On 05/30/2014 06:27 PM, Andres Freund wrote:

 On 2014-05-30 17:59:23 +0300, Heikki Linnakangas wrote:

 * I expanded pg_clog to 64-bits per XID, but people suggested keeping
 pg_clog as is, with two bits per commit, and adding a new SLRU for the
 commit LSNs beside it. Probably will need to do something like that to
 avoid
 bloating the clog.


 It also influences how on-disk compatibility is dealt with. So: How are
 you planning to deal with on-disk compatibility?


 Have pg_upgrade read the old clog and write it out in the new format.

That doesn't address Bruce's concern about CLOG disk consumption.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread David G Johnston
On Tuesday, June 3, 2014, Robert Haas wrote:

 On Mon, Jun 2, 2014 at 10:00 PM, Tom Lane wrote:

  I can see two answers.  Answer #1 is
  that the column type of bar.a changes from int to bigint and the view
  definition is still SELECT a FROM foo.  In that case, showing the user
  the SQL does not help them see and approve semantic changes because
  the SQL is completely unchanged.
 
  Yeah, we need some way of highlighting the semantic differences, and
 just
  printing ruleutils.c output doesn't do that.  But if the user is going
 to
  put in a change to whatever choice the tool makes by default here,
  I would expect that change to consist of adding (or removing) an
 explicit
  cast in the SQL-text view definition.  We can't make people learn some
  random non-SQL notation for this.
 
  Perhaps the displayed output of the tool could look something like
 
  CREATE VIEW bar AS
SELECT
  a  -- this view output column will now be of type int8 not
 int4
FROM foo;
 
  Or something else; I don't claim to be a good UI designer.  But in the
  end, this is 90% a UI problem, and that means that raw SQL is seriously
  poorly suited to solve it directly.

 I guess I don't agree that this is 90% a UI problem.  There's currently no
 mechanism whatsoever by means of which a user can change the data type
 of a column upon which a view depends.  If we had such a mechanism,
 then perhaps someone could build a UI providing the sort of user
 feedback you're suggesting to help them use it more safely.  But isn't
 the core server support the first thing?


The current mechanism is DROP VIEWs -> ALTER TABLE -> CREATE VIEWs.

The UI would prompt the user for the desired ALTER TABLE
parameters, calculate the DROP/CREATE commands, then issue all three sets
as a single transaction.
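
For example (hypothetical table and view names), the generated script would
be along the lines of:

    BEGIN;
    DROP VIEW bar;
    ALTER TABLE foo ALTER COLUMN a TYPE bigint;
    CREATE VIEW bar AS SELECT a FROM foo;
    COMMIT;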

Having a more surgical REWRITE RULE command to alter a view without
dropping it may provide for performance improvements but, conceptually, the
current mechanism should be sufficient to allow for this tool to be
developed.

The main thing that core could do to help is to store the text of the
original CREATE VIEW command - though it may be sufficient to reverse
engineer it from the rule.  Having both available would give any tools more
options.

David J.





Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Vik Fearing
On 06/03/2014 03:30 PM, Abhijit Menon-Sen wrote:
 At 2014-06-03 15:06:11 +0200, vik.fear...@dalibo.com wrote:
 This patch implements a timeout for broken clients that idle in
 transaction.
 I think this is a nice feature, but I suggest that (at the very least)
 the GUC should be named idle_transaction_timeout.

I prefer the way I have it, but not enough to put up a fight if other
people like your version better.

 +para
 + Terminate any session that is idle in transaction for longer than 
 the specified
 + number of seconds.  This not only allows any locks to be released, 
 but also allows
 + the connection slot to be reused.  However, aborted idle in 
 transaction sessions
 + are not affected.  A value of zero (the default) turns this off.
 +/para
 I suggest:

 Terminate any session with an open transaction that has been idle
 for longer than the specified duration in seconds. This allows any
 locks to be released and the connection slot to be reused.

 It's not clear to me what "However, aborted idle in transaction sessions
 are not affected" means.

 The default value of 0 means that such sessions will not be
 terminated.

How about this?

Terminate any session with an open transaction that has been idle
for longer than the specified duration in seconds. This allows any
locks to be released and the connection slot to be reused.

Sessions in the state idle in transaction (aborted) occupy a
connection slot but because they hold no locks, they are not
considered by this parameter.

The default value of 0 means that such sessions will not be
terminated.
-- 
Vik




Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Mon, Jun 2, 2014 at 10:00 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Or something else; I don't claim to be a good UI designer.  But in the
 end, this is 90% a UI problem, and that means that raw SQL is seriously
 poorly suited to solve it directly.

 I guess I don't agree that this is 90% a UI problem.  There's currently no
 mechanism whatsoever by means of which a user can change the data type
 of a column upon which a view depends.

Sure there is: I already illustrated it.  You can temporarily set the
view to some dummy definition that doesn't reference the target table,
then do the ALTER COLUMN TYPE, then redefine the view the way you want.
Wrap it up in a transaction, and it's even transparent.
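
As a sketch (hypothetical names; an explicit cast keeps the view's output
column type unchanged, since CREATE OR REPLACE VIEW won't change it):

    BEGIN;
    CREATE OR REPLACE VIEW bar AS SELECT 1::int AS a;  -- dummy, no reference to foo
    ALTER TABLE foo ALTER COLUMN a TYPE bigint;
    CREATE OR REPLACE VIEW bar AS SELECT a::int AS a FROM foo;
    COMMIT;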

Now, what that doesn't do is let you change the output column type(s)
of the view, but I'd argue that entirely independently of this problem
it'd be reasonable for CREATE OR REPLACE VIEW to allow changing a column
type if the view is unreferenced (ie, basically the same conditions under
which a table column type can be changed today).

If you want to argue that this is unnecessarily complex, you can do so,
but claiming that it's not possible is simply false.  I stand by the point
that what we lack is a sane UI for helping in complex cases --- and
nothing done behind-the-scenes in ALTER TABLE is going to qualify as
a sane UI.  The complexity in this approach would be easily hidden in
a support tool, which will have much bigger problems to solve than whether
its eventual command to the backend requires multiple SQL steps.

 If we had such a mechanism,
 then perhaps someone could build a UI providing the sort of user
 feedback you're suggesting to help them use it more safely.  But isn't
 the core server support the first thing?

I'm guessing you did not read
http://www.postgresql.org/message-id/18723.1401734...@sss.pgh.pa.us

regards, tom lane




[HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Fujii Masao
Hi,

I got a bug report about pg_basebackup off-list: it causes an error
when there is a large file (e.g., 4GB) in the database cluster. It's easy
to reproduce this problem.

$ dd if=/dev/zero of=$PGDATA/test bs=1G count=4
$ pg_basebackup -D hoge -c fast
pg_basebackup: invalid tar block header size: 32768

2014-06-03 22:56:50 JST data LOG:  could not send data to client: Broken pipe
2014-06-03 22:56:50 JST data ERROR:  base backup could not send data,
aborting backup
2014-06-03 22:56:50 JST data FATAL:  connection to client lost

The cause of this problem is that pg_basebackup uses an integer to
store the size of the file to receive from the server, and an integer
overflow can happen when the file is very large. I think that
pg_basebackup should be able to handle even such a large file properly
because one can exist in the database cluster; for example,
the server log file under $PGDATA/pg_log can be that large.
The attached patch changes pg_basebackup so that it uses uint64 to store
the file size and doesn't cause an integer overflow.

Thoughts?

Regards,

-- 
Fujii Masao
From d42fb3ebdd144e7302196ba02a8aab0c51094f24 Mon Sep 17 00:00:00 2001
From: MasaoFujii masao.fu...@gmail.com
Date: Tue, 3 Jun 2014 22:29:36 +0900
Subject: [PATCH] Fix pg_basebackup so that it can back up even large file.

So far, pg_basebackup used an integer variable to store the size of
the file to receive from the server. If the file size was too large
to fall within the range of an integer, an integer overflow would
happen and then pg_basebackup failed to back up that large file.

pg_basebackup should be able to handle even such a large file properly
because one can exist in the database cluster; for example, the server
log file under $PGDATA/pg_log can be that large.
This commit changes pg_basebackup so that it uses uint64 to store
the file size and doesn't cause an integer overflow.

Back-patch to 9.1.
---
 src/bin/pg_basebackup/pg_basebackup.c |4 ++--
 1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index b119fc0..959f0c0 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -1150,7 +1150,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
 {
 	char		current_path[MAXPGPATH];
 	char		filename[MAXPGPATH];
-	int			current_len_left;
+	uint64		current_len_left;
 	int			current_padding = 0;
 	bool		basetablespace = PQgetisnull(res, rownum, 0);
 	char	   *copybuf = NULL;
@@ -1216,7 +1216,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, int rownum)
 			}
 			totaldone += 512;
 
-			if (sscanf(copybuf + 124, "%11o", &current_len_left) != 1)
+			if (sscanf(copybuf + 124, "%11lo", &current_len_left) != 1)
 			{
 				fprintf(stderr, _("%s: could not parse file size\n"),
 						progname);
-- 
1.7.1




Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I didn't reall look at the patch, but it very much looks to me like that
 query result could use the \a\t treatment that rules.sql and
 sanity_check.sql got. It's hard to see the actual difference
 before/after the patch.
 I'll patch that now, to reduce the likelihood of changes there causing
 conflicts for more people.

Personally, I would wonder why the regression tests contain such a query
in the first place.  It seems like nothing but a major maintenance PITA.

regards, tom lane




Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Andres Freund
On 2014-06-03 10:24:46 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I didn't reall look at the patch, but it very much looks to me like that
  query result could use the \a\t treatment that rules.sql and
  sanity_check.sql got. It's hard to see the actual difference
  before/after the patch.
  I'll patch that now, to reduce the likelihood of changes there causing
  conflicts for more people.
 
 Personally, I would wonder why the regression tests contain such a query
 in the first place.  It seems like nothing but a major maintenance PITA.

I haven't added it, but it seems appropriate in that specific case. The
number of leakproof functions should be fairly small and every addition
should be carefully reviewed... I am e.g. not sure that it's a good idea
to declare network_smaller/greater as leakproof - but it's hard to catch
 that on the basis of pg_proc.h alone.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Andres Freund
Hi

On 2014-06-03 23:19:37 +0900, Fujii Masao wrote:
 I got the bug report of pg_basebackup off-list that it causes an error
 when there is large file (e.g., 4GB) in the database cluster. It's easy
 to reproduce this problem.
 
 The cause of this problem is that pg_basebackup uses an integer to
 store the size of the file to receive from the server and an integer
 overflow can happen when the file is very large. I think that
 pg_basebackup should be able to handle even such large file properly
 because it can exist in the database cluster, for example,
 the server log file under $PGDATA/pg_log can be such large one.
 Attached patch changes pg_basebackup so that it uses uint64 to store
 the file size and doesn't cause an integer overflow.

Right, makes sense.

 ---
  src/bin/pg_basebackup/pg_basebackup.c |4 ++--
  1 files changed, 2 insertions(+), 2 deletions(-)
 
 diff --git a/src/bin/pg_basebackup/pg_basebackup.c 
 b/src/bin/pg_basebackup/pg_basebackup.c
 index b119fc0..959f0c0 100644
 --- a/src/bin/pg_basebackup/pg_basebackup.c
 +++ b/src/bin/pg_basebackup/pg_basebackup.c
 @@ -1150,7 +1150,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, 
 int rownum)
  {
   charcurrent_path[MAXPGPATH];
   charfilename[MAXPGPATH];
 - int current_len_left;
 + uint64  current_len_left;

Any reason you changed the signedness here instead of just using a
int64? Did you review all current users?

   int current_padding = 0;
   boolbasetablespace = PQgetisnull(res, rownum, 0);
   char   *copybuf = NULL;
 @@ -1216,7 +1216,7 @@ ReceiveAndUnpackTarFile(PGconn *conn, PGresult *res, 
 int rownum)
   }
   totaldone += 512;
  
 -			if (sscanf(copybuf + 124, "%11o", &current_len_left) != 1)
 +			if (sscanf(copybuf + 124, "%11lo", &current_len_left) != 1)

That's probably not going to work on 32bit platforms or windows where
you might need to use "ll" instead of "l" as a prefix. Also notice that
apparently (cf. 9d7ded0f4277f5c0063eca8e871a34e2355a8371) sscanf can't
reliably be used for 64bit input :(. That pretty much sucks...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-03 10:24:46 -0400, Tom Lane wrote:
 Personally, I would wonder why the regression tests contain such a query
 in the first place.  It seems like nothing but a major maintenance PITA.

 I haven't added it, but it seems appropriate in that specific case. The
 number of leakproof functions should be fairly small and every addition
 should be carefully reviewed... I am e.g. not sure that it's a good idea
 to declare network_smaller/greater as leakproof - but it's hard to catch
 that on the basic of pg_proc.h alone.

Meh.  I agree that new leakproof functions should be carefully reviewed,
but I have precisely zero faith that this regression test will contribute
to that.  It hasn't even got a comment saying why changes here should
receive any scrutiny; moreover, it's not in a file where changes would be
likely to excite suspicion.  (Probably it should be in opr_sanity, if
we're going to have such a thing at all.)

regards, tom lane




[HACKERS] strtoll/strtoull emulation

2014-06-03 Thread Andres Freund
Hi,

I recently had the need to use strtoull() in postgres code. Only to
discover that that's not available on some platforms. IIRC windows/msvc
was one of them. Now 9d7ded0f4277f5c0063eca8e871a34e2355a8371 added
another user - guarded by HAVE_STRTOULL. That commit will make things
worse on windows btw...
How about adding emulation for strtoll/strtoull to port/? The BSDs have
easily crib-able functions available...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-03 23:19:37 +0900, Fujii Masao wrote:
 -	if (sscanf(copybuf + 124, "%11o", &current_len_left) != 1)
 +	if (sscanf(copybuf + 124, "%11lo", &current_len_left) != 1)

 That's probably not going to work on 32bit platforms or windows where
 you might need to use ll instead of l as a prefix. Also notice that
 apparently (c.f. 9d7ded0f4277f5c0063eca8e871a34e2355a8371) sscanf can't
 reliably be used for 64bit input :(. That pretty much sucks...

There's a far bigger problem there, which is if we're afraid that
current_len_left might exceed 4GB then what is it exactly that guarantees
it'll fit in an 11-digit field?

regards, tom lane




Re: [HACKERS] Proposal for CSN based snapshots

2014-06-03 Thread Greg Stark
On Tue, Jun 3, 2014 at 2:55 PM, Robert Haas robertmh...@gmail.com wrote:
 That doesn't address Bruce's concern about CLOG disk consumption.

Well we would only need the xid-lsn mapping for transactions since
globalxmin. Anything older we would just need the committed bit. So we
could maintain two structures, one like our current clog going back
until the freeze_max_age and one with 32-bits (or 64 bits?) per xid
but only going back as far as the globalxmin. There are a myriad of
compression techniques we could use on a sequence of mostly similar
mostly increasing numbers in a small range too. But I suspect they
wouldn't really be necessary.

Here's another thought. I don't see how to make this practical but it
would be quite convenient if it could be made so. The first time an
xid is looked up and is determined to be committed then all we'll ever
need in the future is the lsn. If we could replace the xid with the
LSN of the commit record right there in the page then future viewers
would be able to determine if it's visible without looking in the clog
or this new clog xid-lsn mapping. If that was the full LSN it would
never need to be frozen either. The problem with this is that the LSN
is big and actually moves faster than the xid. We could play the same
games with putting the lsn segment in the page header but it's
actually entirely feasible to have a snapshot that extends over
several segments. Way easier than a snapshot that extends over several
xid epochs. (This does make having a CSN like Oracle kind of tempting
after all)

-- 
greg




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Magnus Hagander
On Tue, Jun 3, 2014 at 4:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-03 23:19:37 +0900, Fujii Masao wrote:
  -	if (sscanf(copybuf + 124, "%11o", &current_len_left) != 1)
  +	if (sscanf(copybuf + 124, "%11lo", &current_len_left) != 1)

  That's probably not going to work on 32bit platforms or windows where
  you might need to use ll instead of l as a prefix. Also notice that
  apparently (c.f. 9d7ded0f4277f5c0063eca8e871a34e2355a8371) sscanf can't
  reliably be used for 64bit input :(. That pretty much sucks...

 There's a far bigger problem there, which is if we're afraid that
 current_len_left might exceed 4GB then what is it exactly that guarantees
 it'll fit in an 11-digit field?


Well, we will only write 11 digits in there, that's when we read it. But
print_val() on the server side should probably have an overflow check
there, which it doesn't. It's going to write some strange values in there
if it overflows..

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Andres Freund
On 2014-06-03 10:37:53 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-03 10:24:46 -0400, Tom Lane wrote:
  Personally, I would wonder why the regression tests contain such a query
  in the first place.  It seems like nothing but a major maintenance PITA.
 
  I haven't added it, but it seems appropriate in that specific case. The
  number of leakproof functions should be fairly small and every addition
  should be carefully reviewed... I am e.g. not sure that it's a good idea
  to declare network_smaller/greater as leakproof - but it's hard to catch
   that on the basis of pg_proc.h alone.
 
 Meh.  I agree that new leakproof functions should be carefully reviewed,
 but I have precisely zero faith that this regression test will contribute
 to that.

Well, I personally wouldn't have noticed that the OP's patch marked the
new functions as leakproof without that test. At least not while looking
at the patch. pg_proc.h is just too hard to read.

 It hasn't even got a comment saying why changes here should
 receive any scrutiny; moreover, it's not in a file where changes would be
 likely to excite suspicion.  (Probably it should be in opr_sanity, if
 we're going to have such a thing at all.)

I don't object to moving it there.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] strtoll/strtoull emulation

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 I recently had the need to use strtoull() in postgres code. Only to
 discover that that's not available on some platforms. IIRC windows/msvc
 was one of them. Now 9d7ded0f4277f5c0063eca8e871a34e2355a8371 added
 another user - guarded by HAVE_STRTOULL. That commit will make things
 worse on windows btw...

Worse than what?  AFAICT, the old code would produce complete garbage
on Windows.  The new code at least gives the right answer for rowcounts
up to 4GB.

 How about adding emulation for strtoll/strtoull to port/? The BSDs have
 easily crib-able functions available...

Ugh.  Surely Windows has got *some* equivalent, perhaps named differently?

regards, tom lane




Re: [HACKERS] Proposing pg_hibernate

2014-06-03 Thread Gurjeet Singh
On Tue, Jun 3, 2014 at 8:13 AM, Gurjeet Singh gurj...@singh.im wrote:
 On Tue, Jun 3, 2014 at 7:57 AM, Robert Haas robertmh...@gmail.com wrote:
 On Thu, May 29, 2014 at 12:12 AM, Amit Kapila amit.kapil...@gmail.com 
 wrote:
 IMHO, all of these caveats, would affect a very small fraction of
 use-cases and are eclipsed by the benefits this extension provides in
 normal cases.

 I agree with you that there are only few corner cases where evicting
 shared buffers by this utility would harm, but was wondering if we could
 even save those, say if it would only use available free buffers.  I think
 currently there is no such interface and inventing a new interface for this
 case doesn't seem to reasonable unless we see any other use case of
 such a interface.

 It seems like it would be best to try to do this at cluster startup
 time, rather than once recovery has reached consistency.  Of course,
 that might mean doing it with a single process, which could have its
 own share of problems.  But I'm somewhat inclined to think that if

Currently pg_hibernator uses ReadBufferExtended() API, and AIUI, that
API requires a database connection/shared-memory attachment, and that
a backend process cannot switch between databases after the initial
connection.

 own share of problems.  But I'm somewhat inclined to think that if
 recovery has already run for a significant period of time, the blocks
 that recovery has brought into shared_buffers are more likely to be
 useful than whatever pg_hibernate would load.

The applications that connect to a standby may have a different access
pattern than the applications that are operating on the master
database. So the buffers that are being restored by startup process
may not be relevant to the workload on the standby.

Best regards,
-- 
Gurjeet Singh http://gurjeet.singh.im/

EDB www.EnterpriseDB.com




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 On Tue, Jun 3, 2014 at 4:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 There's a far bigger problem there, which is if we're afraid that
 current_len_left might exceed 4GB then what is it exactly that guarantees
 it'll fit in an 11-digit field?

 Well, we will only write 11 digits in there, that's when we read it. But
 print_val() on the server side should probably have an overflow check
 there, which it doesn't. It's going to write some strange values int here
 if it overflows..

My point is that having backups crash on an overflow doesn't really seem
acceptable.  IMO we need to reconsider the basebackup protocol and make
sure we don't *need* to put values over 4GB into this field.  Where's the
requirement coming from anyway --- surely all files in PGDATA ought to be
1GB max?

regards, tom lane




Re: [HACKERS] strtoll/strtoull emulation

2014-06-03 Thread Andres Freund
On 2014-06-03 10:55:17 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  I recently had the need to use strtoull() in postgres code. Only to
  discover that that's not available on some platforms. IIRC windows/msvc
  was one of them. Now 9d7ded0f4277f5c0063eca8e871a34e2355a8371 added
  another user - guarded by HAVE_STRTOULL. That commit will make things
  worse on windows btw...
 
 Worse than what?  AFAICT, the old code would produce complete garbage
 on Windows.  The new code at least gives the right answer for rowcounts
 up to 4GB.

Hm? Wouldn't it only have produced garbage on mingw and not msvc?

  How about adding emulation for strtoll/strtoull to port/? The BSDs have
  easily crib-able functions available...
 
 Ugh.  Surely Windows has got *some* equivalent, perhaps named differently?

Apparently they've added strtoull()/strtoll() to msvc 2013... And there's
_strtoui64() which seems to have already existed a while back.

But it seems easier to me to add one fallback centrally somewhere that
works on all platforms. I am not sure that msvc is the only platform
missing strtoull() - although I didn't find anything relevant in a quick
search through the buildfarm. So maybe I am worrying over nothing.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Andres Freund
On 2014-06-03 11:04:58 -0400, Tom Lane wrote:
 Magnus Hagander mag...@hagander.net writes:
  On Tue, Jun 3, 2014 at 4:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  There's a far bigger problem there, which is if we're afraid that
  current_len_left might exceed 4GB then what is it exactly that guarantees
  it'll fit in an 11-digit field?
 
  Well, we will only write 11 digits in there, that's when we read it. But
  print_val() on the server side should probably have an overflow check
  there, which it doesn't. It's going to write some strange values int here
  if it overflows..
 
 My point is that having backups crash on an overflow doesn't really seem
 acceptable.  IMO we need to reconsider the basebackup protocol and make
 sure we don't *need* to put values over 4GB into this field.  Where's the
 requirement coming from anyway --- surely all files in PGDATA ought to be
 1GB max?

Fujii's example was logfiles in pg_log. But we allow changing the
segment size via a configure flag, so we should support that or remove
the ability to change the segment size...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] strtoll/strtoull emulation

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-03 10:55:17 -0400, Tom Lane wrote:
 Ugh.  Surely Windows has got *some* equivalent, perhaps named differently?

  Apparently they've added strtoull()/strtoll() to msvc 2013... And there's
 _strtoui64() which seems to have already existed a while back.

 But it seems easier to me to add one fallback centrally somewhere that
 works on all platforms. I am not sure that msvc is the only platform
 missing strtoull() - although I didn't find anything relevant in a quick
 search through the buildfarm. So maybe I am worrying over nothing.

It used to be called strtouq on some really old platforms, but we already
have code to deal with that naming.

I checked my pet dinosaur HPUX box, and it has HAVE_LONG_LONG_INT64 but
not HAVE_STRTOULL.  It's very possibly the last such animal in captivity
though.  I'm not really sure it's worth carrying a port file just to keep
that platform alive.

Another issue is that strtoull() is not necessarily what we want anyway:
what we want is the function corresponding to uint64, which on most
modern platforms is going to be strtoul().

regards, tom lane




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Magnus Hagander
On Tue, Jun 3, 2014 at 5:12 PM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-06-03 11:04:58 -0400, Tom Lane wrote:
  Magnus Hagander mag...@hagander.net writes:
   On Tue, Jun 3, 2014 at 4:40 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   There's a far bigger problem there, which is if we're afraid that
   current_len_left might exceed 4GB then what is it exactly that
 guarantees
   it'll fit in an 11-digit field?
 
   Well, we will only write 11 digits in there, that's when we read it.
 But
   print_val() on the server side should probably have an overflow check
   there, which it doesn't. It's going to write some strange values int
 here
   if it overflows..
 
  My point is that having backups crash on an overflow doesn't really seem
  acceptable.  IMO we need to reconsider the basebackup protocol and make
  sure we don't *need* to put values over 4GB into this field.  Where's the
  requirement coming from anyway --- surely all files in PGDATA ought to be
  1GB max?

 Fujii's example was logfiles in pg_log. But we allow to change the
 segment size via a configure flag, so we should support that or remove
 the ability to change the segment size...


With Fujii's fix, the new limit is 8GB, per the tar standard (the 11-digit
octal size field holds at most 8^11 - 1 = 8,589,934,591 bytes).

To get around that we could use the star extension or whatever it's
actually called (implemented by both gnu and bsd tar) , to maintain
compatibility.

Of course, a quick fix would be to just reject files > 8GB - that may be
what we have to do for something backpatchable.

IIRC we discussed this at the original time and said it would not be needed
because such files shouldn't be there - clearly we forgot to consider
logfiles..


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] strtoll/strtoull emulation

2014-06-03 Thread Andres Freund
On 2014-06-03 11:28:22 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-03 10:55:17 -0400, Tom Lane wrote:
  Ugh.  Surely Windows has got *some* equivalent, perhaps named differently?
 
  Apparently they've added strtoull()/stroll() to msvc 2013... And there's
  _strtoui64() which seems to have already existed a while back.
 
  But it seems easier to me to add one fallback centrally somewhere that
  works on all platforms. I am not sure that msvc is the only platform
  missing strtoull() - although I didn't find anything relevant in a quick
  search through the buildfarm. So maybe I am worrying over nothing.
 
 It used to be called strtouq on some really old platforms, but we already
 have code to deal with that naming.

So I guess we can just add a similar #define for _strto[u]i64 somewhere
in port/. And then fail if strtoull isn't available.

 I checked my pet dinosaur HPUX box, and it has HAVE_LONG_LONG_INT64 but
 not HAVE_STRTOULL.  It's very possibly the last such animal in captivity
 though.  I'm not really sure it's worth carrying a port file just to keep
 that platform alive.

I don't dare to make a judgement on that ;)

 Another issue is that strtoull() is not necessarily what we want anyway:
 what we want is the function corresponding to uint64, which on most
 modern platforms is going to be strtoul().

How about adding pg_strto[u]int(32|64) to deal with that? Maybe without
the pg_ prefix?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-03 11:04:58 -0400, Tom Lane wrote:
 My point is that having backups crash on an overflow doesn't really seem
 acceptable.  IMO we need to reconsider the basebackup protocol and make
 sure we don't *need* to put values over 4GB into this field.  Where's the
 requirement coming from anyway --- surely all files in PGDATA ought to be
 1GB max?

 Fujii's example was logfiles in pg_log. But we allow to change the
 segment size via a configure flag, so we should support that or remove
 the ability to change the segment size...

What we had better do, IMO, is fix things so that we don't have a filesize
limit in the basebackup format.  After a bit of googling, I found out that
recent POSIX specs for tar format include extended headers that among
other things support member files of unlimited size [1].  Rather than
fooling with partial fixes, we should make the basebackup logic use an
extended header when the file size is over INT_MAX.

regards, tom lane

[1] http://pubs.opengroup.org/onlinepubs/9699919799/
see pax under shells & utilities




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Magnus Hagander
On Tue, Jun 3, 2014 at 5:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-03 11:04:58 -0400, Tom Lane wrote:
  My point is that having backups crash on an overflow doesn't really seem
  acceptable.  IMO we need to reconsider the basebackup protocol and make
  sure we don't *need* to put values over 4GB into this field.  Where's
 the
  requirement coming from anyway --- surely all files in PGDATA ought to
 be
  1GB max?

  Fujii's example was logfiles in pg_log. But we allow to change the
  segment size via a configure flag, so we should support that or remove
  the ability to change the segment size...

 What we had better do, IMO, is fix things so that we don't have a filesize
 limit in the basebackup format.  After a bit of googling, I found out that
 recent POSIX specs for tar format include extended headers that among
 other things support member files of unlimited size [1].  Rather than
 fooling with partial fixes, we should make the basebackup logic use an
 extended header when the file size is over INT_MAX.


Yeah, pax seems to be the way to go. It's at least supported by GNU tar -
is it also supported on say BSD, or other popular platforms? (The size
extension in the general ustar format seems to be, so it would be a shame
if this one is less portable)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Andres Freund
On 2014-06-03 11:42:49 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  On 2014-06-03 11:04:58 -0400, Tom Lane wrote:
  My point is that having backups crash on an overflow doesn't really seem
  acceptable.  IMO we need to reconsider the basebackup protocol and make
  sure we don't *need* to put values over 4GB into this field.  Where's the
  requirement coming from anyway --- surely all files in PGDATA ought to be
  1GB max?
 
  Fujii's example was logfiles in pg_log. But we allow to change the
  segment size via a configure flag, so we should support that or remove
  the ability to change the segment size...
 
 What we had better do, IMO, is fix things so that we don't have a filesize
 limit in the basebackup format.

Agreed. I am just saying that we either need to support that case *or*
remove configurations where such large files are generated. But the
former is clearly preferable since other files can cause large files to
exist as well.

 After a bit of googling, I found out that
 recent POSIX specs for tar format include extended headers that among
 other things support member files of unlimited size [1].  Rather than
 fooling with partial fixes, we should make the basebackup logic use an
 extended header when the file size is over INT_MAX.

That sounds neat enough. I guess we'd still add code to error out with
larger files for <= 9.4?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Andres Freund
On 2014-06-03 17:57:52 +0200, Magnus Hagander wrote:
 On Tue, Jun 3, 2014 at 5:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  What we had better do, IMO, is fix things so that we don't have a filesize
  limit in the basebackup format.  After a bit of googling, I found out that
  recent POSIX specs for tar format include extended headers that among
  other things support member files of unlimited size [1].  Rather than
  fooling with partial fixes, we should make the basebackup logic use an
  extended header when the file size is over INT_MAX.

 Yeah, pax seems to be the way to go. It's at least supported by GNU tar -
 is it also supported on say BSD, or other popular platforms? (The size
 extension in the general ustar format seems to be, so it would be a shame
 if this one is less portable)

PG's tar.c already uses the ustar format and the referenced extension is
an extension to ustar as far as I understand it. So at least tarballs
with files < 8GB would still continue to be readable with all currently
working implementations.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Magnus Hagander
On Jun 3, 2014 6:17 PM, Andres Freund and...@2ndquadrant.com wrote:

 On 2014-06-03 17:57:52 +0200, Magnus Hagander wrote:
  On Tue, Jun 3, 2014 at 5:42 PM, Tom Lane t...@sss.pgh.pa.us wrote:
   What we had better do, IMO, is fix things so that we don't have a
filesize
   limit in the basebackup format.  After a bit of googling, I found out
that
   recent POSIX specs for tar format include extended headers that
among
   other things support member files of unlimited size [1].  Rather than
   fooling with partial fixes, we should make the basebackup logic use an
   extended header when the file size is over INT_MAX.

  Yeah, pax seems to be the way to go. It's at least supported by GNU tar
-
  is it also supported on say BSD, or other popular platforms? (The size
  extension in the general ustar format seems to be, so it would be a
shame
  if this one is less portable)

 PG's tar.c already uses the ustar format and the referenced extension is
 an extension to ustar as far as I understand it. So at least tarballs
 with files < 8GB would still continue to be readable with all currently
 working implementations.

Yeah, that is a clear advantage of that method. Didn't read up on pax
format backwards compatibility, does it have some trick to achieve
something similar?

/Magnus


Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Andres Freund
On 2014-06-03 18:23:07 +0200, Magnus Hagander wrote:
 On Jun 3, 2014 6:17 PM, Andres Freund and...@2ndquadrant.com wrote:
  PG's tar.c already uses the ustar format and the referenced extension is
  an extension to ustar as far as I understand it. So at least tarballs
  with files < 8GB would still continue to be readable with all currently
  working implementations.
 
 Yeah, that is a clear advantage of that method. Didn't read up on pax
 format backwards compatibility, does it have some trick to achieve
 something similar?

It just introduces a new file type 'x' that's only used when extended
features are needed. That file type then contains the extended header.
So the normal ustar header is used for small files, and if more is
needed an *additional* extended header is added.
Check:
http://pubs.opengroup.org/onlinepubs/9699919799/utilities/pax.html#tag_20_92_13_01
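
To make that concrete, here is a small standalone sketch (illustration only, not
from any patch) of how a single extended-header record, e.g. an oversized size
value, gets formatted; per POSIX each record is "<len> <keyword>=<value>\n",
where <len> counts the whole record including the length field itself:

#include <stdio.h>
#include <string.h>

static int
format_pax_record(char *buf, size_t bufsz, const char *keyword, const char *value)
{
	/* ' ', '=' and '\n' plus keyword and value */
	int		payload = (int) (strlen(keyword) + strlen(value)) + 3;
	int		len = payload;
	int		digits;

	/* the length field counts itself, so iterate until its width is stable */
	do
	{
		digits = snprintf(NULL, 0, "%d", len);
		len = payload + digits;
	} while (digits != snprintf(NULL, 0, "%d", len));

	return snprintf(buf, bufsz, "%d %s=%s\n", len, keyword, value);
}

/* format_pax_record(buf, sizeof(buf), "size", "9000000000")
 * produces "19 size=9000000000\n" */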

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup failed to back up large file

2014-06-03 Thread Tom Lane
Magnus Hagander mag...@hagander.net writes:
 Yeah, that is a clear advantage of that method. Didn't read up on pax
 format backwards compatibility, does it have some trick to achieve
 something similar?

I didn't read the fine print but it sounded like the extended header
would look like a separate file entry to a non-aware tar implementation,
which would write it out as a file and then get totally confused when
the length specified in the overlength file's entry didn't match the
amount of data following.  So it's a nice solution for some properties
but doesn't fail-soft for file length.  Not clear that there's any way
to achieve that though.

Another thought is we could make pg_basebackup simply skip any files that
exceed RELSEG_SIZE, on the principle that you don't really need/want
enormous log files to get copied anyhow.  We'd still need the pax
extension if the user had configured large RELSEG_SIZE, but having a
compatible tar could be documented as a requirement of doing that.
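
Something along these lines in the file-sending loop would do that; the names
below are only illustrative (a sketch of the idea, not the actual basebackup.c
code):

		if (statbuf.st_size > (pgoff_t) RELSEG_SIZE * BLCKSZ)
		{
			ereport(WARNING,
					(errmsg("skipping file \"%s\" because it exceeds the segment size",
							pathbuf)));
			continue;
		}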

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] SP-GiST bug.

2014-06-03 Thread Tom Lane
Teodor Sigaev teo...@sigaev.ru writes:
 The point I'm making is that the scenario your test case exposes is not
 an infinite loop of picksplits, but an infinite loop of choose calls.

 Thank you, now I see what I missed before. After some brainstorming, it's
 possible to use '\0' only as the end-of-string marker.  The idea is: when we
 split an allTheSame tuple to upper/lower we copy the node label to the upper
 tuple instead of setting it to '\0'. Node labels of the lower tuple will be
 unchanged but should be ignored for reconstructionValue - and they are ignored
 because of the allTheSame flag.  See attached patch.

I don't believe this patch works at all.  In particular, for an allTheSame
node occurring at the root level, omitting the nodeChar from the
reconstructed value is certainly wrong since there was no parent that
could have had a label containing the same character.  More generally, the
patch is assuming that any allTheSame tuple is one that is underneath a
split node; but that's surely wrong (where did such a tuple come from
before the split happened?).

If the root case were the only possible one then we could fix the problem
by adding a test on whether level == 0, but AFAICS, allTheSame tuples can
also get created at lower tree levels.  All you need is a bunch of strings
that are duplicates for more than the maximum prefix length.

I think what we have to do is use a different dummy value for the node
label of a pushed-down allTheSame tuple than we do for end-of-string.
This requires widening the node labels from uint8 to (at least) int16.
However, I think that's essentially free because pass-by-value node
labels are stored as Datums anyway.  In fact, it might even be upward
compatible, at least for cases that aren't broken today.
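
To illustrate (a sketch only, not from any patch): with int16 labels there is
room for a dummy value that cannot collide with a real byte or with the
end-of-string marker, and pass-by-value labels are already carried around as
Datums:

#define SPGT_END_OF_STRING		0		/* '\0' terminator, as today */
#define SPGT_PUSHED_DOWN		(-1)	/* hypothetical dummy label for
										 * pushed-down allTheSame tuples */

	Datum	label = Int16GetDatum(SPGT_PUSHED_DOWN);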

  I rewrote the patch to fix the missed case - allTheSame result of picksplit
  with tooBig set. I believe this patch is still needed because it could make a
  tree more efficient, as was demonstrated for the quad tree.

The question here continues to be how sure we are that the case described
in checkAllTheSame's header comment can't actually occur.  We certainly
thought it could when we originally wrote this stuff.  I think possibly
we were wrong, but I'd like to see a clear proof of that before we discuss
dropping the logic that's meant to cope with the situation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Andres Freund
Hi,

In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
printed. Should we perhaps do the same for 'Execution time'? That'd make
it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
regression tests.

Currently the output for that is:
postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
   QUERY PLAN

 Result (actual rows=1 loops=1)
 Total runtime: 0.035 ms
(2 rows)

Leaving off the total runtime doesn't seem bad to me.

Greetings,

Andres Freund

--
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread David G Johnston
Vik Fearing wrote
 On 06/03/2014 03:30 PM, Abhijit Menon-Sen wrote:
 At 2014-06-03 15:06:11 +0200, vik.fearing@ wrote:
 This patch implements a timeout for broken clients that idle in
 transaction.
 I think this is a nice feature, but I suggest that (at the very least)
 the GUC should be named idle_transaction_timeout.
 
 I prefer the way I have it, but not enough to put up a fight if other
 people like your version better.
 
 +
 para
 + Terminate any session that is idle in transaction for longer
 than the specified
 + number of seconds.  This not only allows any locks to be
 released, but also allows
 + the connection slot to be reused.  However, aborted idle in
 transaction sessions
 + are not affected.  A value of zero (the default) turns this
 off.
 +
 /para
 I suggest:

 Terminate any session with an open transaction that has been idle
 for longer than the specified duration in seconds. This allows any
 locks to be released and the connection slot to be reused.

  It's not clear to me what "However, aborted idle in transaction sessions
  are not affected" means.

 The default value of 0 means that such sessions will not be
 terminated.
 
 How about this?
 
 Terminate any session with an open transaction that has been idle
 for longer than the specified duration in seconds. This allows any
 locks to be released and the connection slot to be reused.
 
 Sessions in the state idle in transaction (aborted) occupy a
 connection slot but because they hold no locks, they are not
 considered by this parameter.
 
 The default value of 0 means that such sessions will not be
 terminated.
 -- 
 Vik

I do not see any reason for an aborted idle in transaction session to be
treated differently.

Given the similarity to statement_timeout using similar wording for both
would be advised.

*Statement Timeout:*
Abort any statement that takes more than the specified number of
milliseconds, starting from the time the command arrives at the server from
the client. If log_min_error_statement is set to ERROR or lower, the
statement that timed out will also be logged. A value of zero (the default)
turns this off.

Setting statement_timeout in postgresql.conf is not recommended because it
would affect all sessions.

*Idle Transaction Timeout:*

 Disconnect any session that has been idle in transaction (including
 aborted) for more than the specified number of milliseconds, starting from
 {however such is determined}.  
 
 A value of zero (the default) turns this off.
 
 Typical usage would be to set this to a small positive number in
 postgresql.conf and require sessions that expect long-running, idle,
 transactions to set it back to zero or some reasonable higher value.

While seconds is the better unit of measure I don't think that justifies
making this different from statement_timeout.  In either case users can
specify their own units.


Since we are killing the entire session, not just the transaction, a better
label would be:

idle_transaction_session_timeout

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/idle-in-transaction-timeout-tp5805859p5805904.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
 printed. Should we perhaps do the same for 'Execution time'? That'd make
 it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
 regression tests.

 Currently the output for that is:
 postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
QUERY PLAN
 
  Result (actual rows=1 loops=1)
  Total runtime: 0.035 ms
 (2 rows)

 Leaving off the total runtime doesn't seem bad to me.

It seems a little weird to call it a cost ... but maybe that
ship has sailed given how we're treating the planning-time item.

I'm unconvinced that this'd add much to our regression testing capability,
though.  The standard thing is to do an EXPLAIN to check the plan shape
and then run the query to see if it gets the right answer.  Checking row
counts is pretty well subsumed by the latter, and is certainly not an
adequate substitute for it.

So on the whole, -1 ... this is an unintuitive and
non-backwards-compatible change that doesn't look like it buys much.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Andres Freund
On 2014-06-03 15:08:15 -0400, Tom Lane wrote:
 Andres Freund and...@2ndquadrant.com writes:
  In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
  printed. Should we perhaps do the same for 'Execution time'? That'd make
  it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
  regression tests.
 
  Currently the output for that is:
  postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
 QUERY PLAN
  
   Result (actual rows=1 loops=1)
   Total runtime: 0.035 ms
  (2 rows)
 
  Leaving off the total runtime doesn't seem bad to me.
 
 It seems a little weird to call it a cost ... but maybe that
 ship has sailed given how we're treating the planning-time item.

It's not what I'd have choosen when starting afresh, but as you say...

 I'm unconvinced that this'd add much to our regression testing capability,
 though.  The standard thing is to do an EXPLAIN to check the plan shape
 and then run the query to see if it gets the right answer.  Checking row
 counts is pretty well subsumed by the latter, and is certainly not an
 adequate substitute for it.

The specific case I wanted it for was to test that a CREATE INDEX in a
specific situation actually has indexed a recently dead row. That can be
made visible via bitmap index scans... Generally index vs heap cases
aren't that easy to check with just the toplevel result.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Robert Haas
On Tue, Jun 3, 2014 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
 printed. Should we perhaps do the same for 'Execution time'? That'd make
 it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
 regression tests.

 Currently the output for that is:
 postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
QUERY PLAN
 
  Result (actual rows=1 loops=1)
  Total runtime: 0.035 ms
 (2 rows)

 Leaving off the total runtime doesn't seem bad to me.

 It seems a little weird to call it a cost ... but maybe that
 ship has sailed given how we're treating the planning-time item.

Maybe we could make it be controlled by TIMING.  Seems like it fits
well-enough there.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Keith Fiske
On Tue, Jun 3, 2014 at 10:48 AM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-06-03 10:37:53 -0400, Tom Lane wrote:
  Andres Freund and...@2ndquadrant.com writes:
   On 2014-06-03 10:24:46 -0400, Tom Lane wrote:
   Personally, I would wonder why the regression tests contain such a
 query
   in the first place.  It seems like nothing but a major maintenance
 PITA.
 
   I haven't added it, but it seems appropriate in that specific case. The
   number of leakproof functions should be fairly small and every addition
   should be carefully reviewed... I am e.g. not sure that it's a good
 idea
   to declare network_smaller/greater as leakproof - but it's hard to
 catch
    that on the basis of pg_proc.h alone.
 
  Meh.  I agree that new leakproof functions should be carefully reviewed,
  but I have precisely zero faith that this regression test will contribute
  to that.

 Well, I personally wouldn't have noticed that the OP's patch marked the
 new functions as leakproof without that test. At least not while looking
 at the patch. pg_proc.h is just too hard to read.

  It hasn't even got a comment saying why changes here should
  receive any scrutiny; moreover, it's not in a file where changes would be
  likely to excite suspicion.  (Probably it should be in opr_sanity, if
  we're going to have such a thing at all.)

 I don't object to moving it there.

 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Andres's changes on June 3rd to
https://github.com/postgres/postgres/commits/master/src/test/regress/expected/create_function_3.out
are causing patch v2 to fail for that regression test file.

postgres $ patch -p1 -i ../inet_agg_v2.patch
patching file src/backend/utils/adt/network.c
patching file src/include/catalog/pg_aggregate.h
patching file src/include/catalog/pg_proc.h
patching file src/include/utils/builtins.h
patching file src/test/regress/expected/create_function_3.out
Hunk #1 FAILED at 153.
1 out of 1 hunk FAILED -- saving rejects to file
src/test/regress/expected/create_function_3.out.rej
patching file src/test/regress/expected/inet.out
patching file src/test/regress/sql/inet.sql

Otherwise it applies without any issues to the latest HEAD. I built and
started a new instance, and I was able to test at least the basic min/max
functionality

keith=# create table test_inet (id serial, ipaddress inet);
CREATE TABLE
Time: 25.546 ms
keith=# insert into test_inet (ipaddress) values ('192.168.1.1');
INSERT 0 1
Time: 3.143 ms
keith=# insert into test_inet (ipaddress) values ('192.168.1.2');
INSERT 0 1
Time: 2.932 ms
keith=# insert into test_inet (ipaddress) values ('127.0.0.1');
INSERT 0 1
Time: 1.786 ms
keith=# select min(ipaddress) from test_inet;
min
---
 127.0.0.1
(1 row)

Time: 3.371 ms
keith=# select max(ipaddress) from test_inet;
 max
-
 192.168.1.2
(1 row)

Time: 1.104 ms

--
Keith Fiske
Database Administrator
OmniTI Computer Consulting, Inc.
http://www.keithf4.com


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 3, 2014 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 It seems a little weird to call it a cost ... but maybe that
 ship has sailed given how we're treating the planning-time item.

 Maybe we could make it be controlled by TIMING.  Seems like it fits
 well-enough there.

Yeah, I thought about that too; but that sacrifices capability in the name
of terminological consistency.  The point of TIMING OFF is to not pay the
very high overhead of per-node timing calls ... but that doesn't mean you
don't want the overall runtime.  And it might not be convenient to get it
via client-side measurement.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Andres Freund
On June 3, 2014 9:40:27 PM CEST, Robert Haas robertmh...@gmail.com wrote:
On Tue, Jun 3, 2014 at 3:08 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Andres Freund and...@2ndquadrant.com writes:
 In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
 printed. Should we perhaps do the same for 'Execution time'? That'd
make
 it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
 regression tests.

 Currently the output for that is:
 postgres=# EXPLAIN (ANALYZE, TIMING OFF, COSTS OFF) SELECT 1;
QUERY PLAN
 
  Result (actual rows=1 loops=1)
  Total runtime: 0.035 ms
 (2 rows)

 Leaving off the total runtime doesn't seem bad to me.

 It seems a little weird to call it a cost ... but maybe that
 ship has sailed given how we're treating the planning-time item.

Maybe we could make it be controlled by TIMING.  Seems like it fits
well-enough there.

Don't think that fits well - TIMING imo is about reducing the timing overhead. 
But the server side total time is still interesting. I only thought about 
tacking it onto COST because that already is pretty much tailored for 
regression test usage. C.F. disabling the planning time.

Andres

-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Memory deallocation after calling cast function

2014-06-03 Thread Tom Lane
Soroosh Sardari soroosh.sard...@gmail.com writes:
 I have problem with memory deallocation. look at the following queries

 1- create table test01(a) as select generate_series(1,1)::int8 ;

Do it as, eg,

create table test01(a) as select g::int8 from generate_series(1,1) g;

SRFs in the SELECT targetlist tend to leak memory; this is not easily
fixable, and nobody is likely to try hard considering the feature's on
the edge of deprecation anyhow.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Alvaro Herrera
Andres Freund wrote:
 On June 3, 2014 9:40:27 PM CEST, Robert Haas robertmh...@gmail.com wrote:

 Maybe we could make it be controlled by TIMING.  Seems like it fits
 well-enough there.
 
 Don't think that fits well - TIMING imo is about reducing the timing
 overhead. But the server side total time is still interesting. I only
 thought about tacking it onto COST because that already is pretty much
 tailored for regression test usage. C.F. disabling the planning time.

Pah.  So what we need is a new mode, REGRESSTEST ON or something.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Robert Haas
On Tue, Jun 3, 2014 at 4:02 PM, Alvaro Herrera alvhe...@2ndquadrant.com wrote:
 Andres Freund wrote:
 On June 3, 2014 9:40:27 PM CEST, Robert Haas robertmh...@gmail.com wrote:

 Maybe we could make it be controlled by TIMING.  Seems like it fits
 well-enough there.

 Don't think that fits well - TIMING imo is about reducing the timing
 overhead. But the server side total time is still interesting. I only
 thought about tacking it onto COST because that already is pretty much
 tailored for regression test usage. C.F. disabling the planning time.

 Pah.  So what we need is a new mode, REGRESSTEST ON or something.

Well, we could invent that.  But I personally think piggybacking on
COSTS makes more sense.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 3, 2014 at 4:02 PM, Alvaro Herrera alvhe...@2ndquadrant.com 
 wrote:
 Pah.  So what we need is a new mode, REGRESSTEST ON or something.

 Well, we could invent that.  But I personally think piggybacking on
 COSTS makes more sense.

I've been eagerly waiting for 8.4 to die so I could stop worrying
about how far back I can back-patch regression test cases using
explain (costs off).  It'd be really annoying to have to wait
another five years to get a consistent new spelling of how to do
that.  So yeah, let's stick to using COSTS OFF in the tests.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread Robert Haas
On Tue, Jun 3, 2014 at 10:14 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 If we had such a mechanism,
 then perhaps someone could build a UI providing the sort of user
 feedback you're suggesting to help them use it more safely.  But isn't
 the core server support the first thing?

 I'm guessing you did not read
 http://www.postgresql.org/message-id/18723.1401734...@sss.pgh.pa.us

Argh, sorry, I saw that go by and it went past my eyes but obviously I
didn't really absorb it.  I guess we could do it that way.  But it
seems like quite a hassle to me; I think we're going to continue to
get complaints here until this is Easy.  And if it can't be made Easy,
then we're going to continue to get complaints forever.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] avoiding tuple copying in btree index builds

2014-06-03 Thread Robert Haas
On Sun, Jun 1, 2014 at 3:26 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Tue, May 6, 2014 at 12:04 AM, Robert Haas robertmh...@gmail.com wrote:
 On Mon, May 5, 2014 at 2:13 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-05-05 13:52:39 -0400, Robert Haas wrote:
  Today, I discovered that when building a btree index, the btree code
  uses index_form_tuple() to create an index tuple from the heap tuple,
  calls tuplesort_putindextuple() to copy that tuple into the sort's
  memory context, and then frees the original one it built.  This seemed
  inefficient, so I wrote a patch to eliminate the tuple copying.  It
  works by adding a function tuplesort_putindextuplevalues(), which
  builds the tuple in the sort's memory context and thus avoids the need
  for a separate copy.  I'm not sure if that's the best approach, but
   the optimization seems worthwhile.
 
  Hm. It looks like we could quite easily just get rid of
  tuplesort_putindextuple(). The hash usage doesn't look hard to convert.

 I glanced at that, but it wasn't obvious to me how to convert the hash
 usage.  If you have an idea, I'm all ears.

  I also think it's possible to have a similar optimization for the hash index
  in case it has to spool the tuple for sorting.

  In function hashbuildCallback(), when buildstate->spool is true, we
  can avoid forming the index tuple. To check for nulls before calling
  _h_spool(), we can traverse the isnull array.

Hmm, that might work.  Arguably it's less efficient, but on the other
hand if it avoids forming the tuple sometimes it might be MORE
efficient.  And anyway the difference might not be enough to matter.
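
A rough sketch of what that could look like in hashbuildCallback() (hash.c);
the spool-side helper named here is hypothetical, imagined by analogy with the
btree patch's tuplesort_putindextuplevalues(), and the rest is paraphrased from
memory rather than copied from the tree:

static void
hashbuildCallback(Relation index, HeapTuple htup, Datum *values,
				  bool *isnull, bool tupleIsAlive, void *state)
{
	HashBuildState *buildstate = (HashBuildState *) state;

	/* hash indexes have a single key column and do not store NULLs */
	if (isnull[0])
		return;

	if (buildstate->spool)
		_h_spool_values(buildstate->spool, values, isnull);	/* hypothetical */
	else
	{
		IndexTuple	itup = index_form_tuple(RelationGetDescr(index),
											values, isnull);

		itup->t_tid = htup->t_self;
		_hash_doinsert(index, itup);
		pfree(itup);
	}

	buildstate->indtuples += 1;
}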

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json casts

2014-06-03 Thread Peter Eisentraut
On 5/28/14, 6:48 PM, Andrew Dunstan wrote:
 
 On 05/27/2014 07:25 PM, Andrew Dunstan wrote:

 On 05/27/2014 07:17 PM, Tom Lane wrote:
 Stephen Frost sfr...@snowman.net writes:
 * Andrew Dunstan (and...@dunslane.net) wrote:
 Given that this would be a hard coded behaviour change, is it too
 late to do this for 9.4?
 No, for my 2c.
 If we do it by adding casts then it'd require an initdb, so I'd vote
 against that for 9.4.  If we just change behavior in json.c then that
 objection doesn't apply, so I wouldn't complain.




 I wasn't proposing to add a cast, just to allow users to add one if
 they wanted. But I'm quite happy to go with special-casing timestamps,
 and leave the larger question for another time.


 
 
 Here's a draft patch. I'm still checking to see if there are other
 places that need to be fixed, but I think this has the main one.

This was solved back in the day for the xml type, which has essentially
the same requirement, by adding an ISO-8601-compatible output option to
EncodeDateTime().  See map_sql_value_to_xml_value() in xml.c.  You ought
to be able to reuse that.  Seems easier than routing through to_char().
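
For reference, a sketch of that approach for the timestamptz case, modeled on
map_sql_value_to_xml_value(); the EncodeDateTime() signature and the dispatch
label below are from memory and would need checking against the tree (val and
result stand for the Datum and output StringInfo of the enclosing routine):

		case JSONTYPE_TIMESTAMPTZ:		/* assumed dispatch label */
			{
				TimestampTz timestamp = DatumGetTimestampTz(val);
				struct pg_tm tm;
				fsec_t		fsec;
				int			tz;
				const char *tzn = NULL;
				char		buf[MAXDATELEN + 1];

				if (timestamp2tm(timestamp, &tz, &tm, &fsec, &tzn, NULL) != 0)
					ereport(ERROR,
							(errcode(ERRCODE_DATETIME_VALUE_OUT_OF_RANGE),
							 errmsg("timestamp out of range")));

				EncodeDateTime(&tm, fsec, true, tz, tzn, USE_XSD_DATES, buf);

				/* ISO 8601 output never needs JSON escaping, just quote it */
				appendStringInfo(result, "\"%s\"", buf);
			}
			break;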




-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re-create dependent views on ALTER TABLE ALTER COLUMN ... TYPE?

2014-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 On Tue, Jun 3, 2014 at 10:14 AM, Tom Lane t...@sss.pgh.pa.us wrote:
 I'm guessing you did not read
 http://www.postgresql.org/message-id/18723.1401734...@sss.pgh.pa.us

 Argh, sorry, I saw that go by and it went past my eyes but obviously I
 didn't really absorb it.  I guess we could do it that way.  But it
 seems like quite a hassle to me; I think we're going to continue to
 get complaints here until this is Easy.  And if it can't be made Easy,
 then we're going to continue to get complaints forever.

Well, my vision of it is that it *is* easy, if you're using the tool
(or, perhaps, one of several tools), and you have a case that doesn't
really require careful semantic review.  But trying to build this sort
of thing into the backend is the wrong approach: it's going to lead
to unpleasant compromises and/or surprises.  And we'd still have to
build that tool someday.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json casts

2014-06-03 Thread Tom Lane
Peter Eisentraut pete...@gmx.net writes:
 This was solved back in the day for the xml type, which has essentially
 the same requirement, by adding an ISO-8601-compatible output option to
 EncodeDateTime().  See map_sql_value_to_xml_value() in xml.c.  You ought
 to be able to reuse that.  Seems easier than routing through to_char().

I was wondering if we didn't have another way to do that.  to_char() is
squirrely enough that I really hate having any other core functionality
depending on it.  +1 for changing this.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Josh Berkus
Vik,

 When the timeout occurs, the backend commits suicide with a FATAL
 ereport.  I thought about just aborting the transaction to free the
 locks but decided that the client is clearly broken so might as well
 free up the connection as well.

Out of curiosity, how much harder would it have been just to abort the
transaction?  I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

Argument in favor of breaking the connection: most of the time, IITs are
caused by poor error-handling or garbage-collection code on the client
side, which has already abandoned the application session but hadn't let
go of the database handle.  Since the application is never going to
reuse the handle, terminating it is the right thing to do.

Argument in favor of just aborting the transaction: client connection
management software may not be able to deal cleanly with the broken
connection and may halt operation completely, especially if the client
finds out the connection is gone when they try to start their next
transaction on that connection.

My $0.018: terminating the connection is the preferred behavior.

 The same behavior can be achieved by an external script that monitors
 pg_stat_activity but by having it in core we get much finer tuning (it
 is session settable) and also traces of it directly in the PostgreSQL
 logs so it can be graphed by things like pgbadger.
 
 Unfortunately, no notification is sent to the client because there's no
 real way to do that right now.

Well, if you abort the connection, presumably the client finds out when
they try to send a query ...

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Could not finish anti-wraparound VACUUM when stop limit is reached

2014-06-03 Thread Jeff Janes
On Sun, May 25, 2014 at 8:40 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 While debugging the B-tree bug that Jeff Janes reported (
 http://www.postgresql.org/message-id/CAMkU=1y=VwF07Ay+Cpqk_
 7fpihrctmssv9y99sbghitkxpb...@mail.gmail.com), a new issue came up:

 If you reach the xidStopLimit, and try to run VACUUM, it fails with error:

 jjanes=# vacuum;
 ERROR:  database is not accepting commands to avoid wraparound data loss
 in database jjanes
 HINT:  Stop the postmaster and vacuum that database in single-user mode.
 You might also need to commit or roll back old prepared transactions.

 The backtrace looks like this:

 #0  errstart (elevel=20, filename=0x9590a0 varsup.c, lineno=120,
 funcname=0x9593f0 __func__.10455 GetNewTransactionId, domain=0x0)
 at elog.c:249
 #1  0x004f7c14 in GetNewTransactionId (isSubXact=0 '\000') at
 varsup.c:115
 #2  0x004f86db in AssignTransactionId (s=0xd62900
 TopTransactionStateData)
 at xact.c:510
 #3  0x004f84a4 in GetCurrentTransactionId () at xact.c:382
 #4  0x0062dc1c in vac_truncate_clog (frozenXID=1493663893,
 minMulti=1)
 at vacuum.c:909
 #5  0x0062dc06 in vac_update_datfrozenxid () at vacuum.c:888
 #6  0x0062cdf6 in vacuum (vacstmt=0x29e05e0, relid=0, do_toast=1
 '\001',
 bstrategy=0x2a5cc38, for_wraparound=0 '\000', isTopLevel=1 '\001') at
 vacuum.c:294
 #7  0x007a3c55 in standard_ProcessUtility (parsetree=0x29e05e0,
 queryString=0x29dfbf8 vacuum ;, context=PROCESS_UTILITY_TOPLEVEL,
 params=0x0,
 dest=0x29e0988, completionTag=0x7fff9411a490 ) at utility.c:645

 So, vac_truncate_clog() tries to get a new transaction ID, which fails
 because we've already reached the stop-limit. vac_truncate_clog() doesn't
 really need a new XID to be assigned, though, it only uses it to compare
 against datfrozenxid to see if wrap-around has already happened, so it
 could use ReadNewTransactionId() instead.



Like the attached patch, or does it need to be more complicated than that?
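
(For the archives: it is essentially a one-liner in vac_truncate_clog(); the
exact line below is reconstructed from memory, not copied from the attachment.)

-	TransactionId myXID = GetCurrentTransactionId();
+	TransactionId myXID = ReadNewTransactionId();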





 Jeff's database seems to have wrapped around already, because after fixing
 the above, I get this:

 jjanes=# vacuum;
 WARNING:  some databases have not been vacuumed in over 2 billion
 transactions
 DETAIL:  You might have already suffered transaction-wraparound data loss.
 VACUUM


 We do not truncate clog when wraparound has already happened, so we never
 get past that point. Jeff advanced XID counter aggressively with some
 custom C code, so hitting the actual wrap-around is a case of don't do
 that. Still, the case is quite peculiar: pg_controldata says that nextXid
 is 4/1593661139. The oldest datfrozenxid is equal to that, 1593661139. So
 ISTM he managed to not just wrap around, but execute 2 billion more
 transactions after the wraparound and reach datfrozenxid again. I'm not
 sure how that happened.



I applied the attached patch to current git head (0ad1a816320a2b539a5) and
use it to start up a copy of the indicated data directory in ordinary (not
single user) mode after deleting postmaster.opts.  Autovacuum starts,
eventually completes, and then the database becomes usable again.

Perhaps while investigating the matter you did something that pushed it over
the edge.  Could you try it again from a fresh untarring of the data
directory and see if you can reproduce my results?

Thanks,

Jeff


ReadNew.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Tom Lane
Josh Berkus j...@agliodbs.com writes:
 Out of curiosity, how much harder would it have been just to abort the
 transaction?  I think breaking the connection is probably the right
 behavior, but before folks start arguing it out, I wanted to know if
 aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss.  (I think, anyway.)

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever.  Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Alvaro Herrera
Josh Berkus wrote:

 Argument in favor of breaking the connection: most of the time, IITs are
 caused by poor error-handling or garbage-collection code on the client
 side, which has already abandoned the application session but hadn't let
 go of the database handle.  Since the application is never going to
 reuse the handle, terminating it is the right thing to do.

In some frameworks where garbage-collection is not involved with things
like this, what happens is that the connection object is reused later.
If you just drop the connection, the right error message will never
reach the application, but if you abort the xact, the next BEGIN issued
by the framework will receive the "connection aborted due to
idle-in-transaction session" message, which I assume is what this patch
would have sent.

Therefore I think if we want this at all it should abort the xact, not
drop the connection.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] json casts

2014-06-03 Thread Andrew Dunstan


On 06/03/2014 04:45 PM, Tom Lane wrote:

Peter Eisentraut pete...@gmx.net writes:

This was solved back in the day for the xml type, which has essentially
the same requirement, by adding an ISO-8601-compatible output option to
EncodeDateTime().  See map_sql_value_to_xml_value() in xml.c.  You ought
to be able to reuse that.  Seems easier than routing through to_char().

I was wondering if we didn't have another way to do that.  to_char() is
squirrely enough that I really hate having any other core functionality
depending on it.  +1 for changing this.




It's a bit of a pity neither of you spoke up in the 6 days since I 
published the draft patch. And honestly, some of the other code invoked 
by the XML code looks a bit squirrely too. But, OK, I will look at it. I 
guess I can assume that the output won't contain anything that needs 
escaping, so I can just add the leading and trailing quote marks without 
needing to call escape_json().


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Andrew Dunstan


On 06/03/2014 05:53 PM, Tom Lane wrote:

Josh Berkus j...@agliodbs.com writes:

Out of curiosity, how much harder would it have been just to abort the
transaction?  I think breaking the connection is probably the right
behavior, but before folks start arguing it out, I wanted to know if
aborting the transaction is even a reasonable thing to do.

FWIW, I think aborting the transaction is probably better, especially
if the patch is designed to do nothing to already-aborted transactions.
If the client is still there, it will see the abort as a failure in its
next query, which is less likely to confuse it completely than a
connection loss.  (I think, anyway.)

The argument that we might want to close the connection to free up
connection slots doesn't seem to me to hold water as long as we allow
a client that *isn't* inside a transaction to sit on an idle connection
forever.  Perhaps there is room for a second timeout that limits how
long you can sit idle independently of being in a transaction, but that
isn't this patch.




Yes, I had the same thought.

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Could not finish anti-wraparound VACUUM when stop limit is reached

2014-06-03 Thread Jeff Janes
On Sun, May 25, 2014 at 8:53 AM, Andres Freund and...@2ndquadrant.com
wrote:

 On 2014-05-25 11:40:09 -0400, Heikki Linnakangas wrote:
  So, vac_truncate_clog() tries to get a new transaction ID, which fails
  because we've already reached the stop-limit. vac_truncate_clog() doesn't
  really need a new XID to be assigned, though, it only uses it to compare
  against datfrozenxid to see if wrap-around has already happened, so it
 could
  use ReadNewTransactionId() instead.

 Right. But IIRC we need one in the vicinity anyway to write new
 pg_database et al rows?


pg_database and pg_class are updated with heap_inplace_update in these
cases.

The page gets a new LSN, but the tuples do not get a new transaction ID.

Cheers,

Jeff


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Josh Berkus
On 06/03/2014 02:53 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 Out of curiosity, how much harder would it have been just to abort the
 transaction?  I think breaking the connection is probably the right
 behavior, but before folks start arguing it out, I wanted to know if
 aborting the transaction is even a reasonable thing to do.
 
 FWIW, I think aborting the transaction is probably better, especially
 if the patch is designed to do nothing to already-aborted transactions.
 If the client is still there, it will see the abort as a failure in its
 next query, which is less likely to confuse it completely than a
 connection loss.  (I think, anyway.)

Personally, I think we'll get about equal amounts of confusion either way.

 The argument that we might want to close the connection to free up
 connection slots doesn't seem to me to hold water as long as we allow
 a client that *isn't* inside a transaction to sit on an idle connection
 forever.  Perhaps there is room for a second timeout that limits how
 long you can sit idle independently of being in a transaction, but that
 isn't this patch.

Like I said, I'm marginally in favor of terminating the connection, but
I'd be completely satisfied with a timeout which aborted the transaction
instead.  Assuming that change doesn't derail this patch and keep it
from getting into 9.5, of course.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Allowing join removals for more join types

2014-06-03 Thread Noah Misch
On Wed, May 28, 2014 at 08:39:32PM +1200, David Rowley wrote:
 The attached patch allows join removals for both sub queries with left
 joins and also semi joins where a foreign key can prove the existence of
 the record.

When a snapshot can see modifications that queued referential integrity
triggers for some FK constraint, that constraint is not guaranteed to hold
within the snapshot until those triggers have fired.  For example, a query
running within a VOLATILE function f() in a statement like UPDATE t SET c =
f(c) may read data that contradicts FK constraints involving table t.
Deferred UNIQUE constraints, which we also do not yet use for deductions in
the planner, have the same problem; see commit 0f39d50.  This project will
need a design accounting for that hazard.

As a point of procedure, I recommend separating the semijoin support into its
own patch.  Your patch is already not small; delaying non-essential parts will
make the essential parts more accessible to reviewers.

Thanks,
nm

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Vik Fearing
On 06/04/2014 01:38 AM, Josh Berkus wrote:
 On 06/03/2014 02:53 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 Out of curiosity, how much harder would it have been just to abort the
 transaction?  I think breaking the connection is probably the right
 behavior, but before folks start arguing it out, I wanted to know if
 aborting the transaction is even a reasonable thing to do.
 FWIW, I think aborting the transaction is probably better, especially
 if the patch is designed to do nothing to already-aborted transactions.
 If the client is still there, it will see the abort as a failure in its
 next query, which is less likely to confuse it completely than a
 connection loss.  (I think, anyway.)
 Personally, I think we'll get about equal amounts of confusion either way.

 The argument that we might want to close the connection to free up
 connection slots doesn't seem to me to hold water as long as we allow
 a client that *isn't* inside a transaction to sit on an idle connection
 forever.  Perhaps there is room for a second timeout that limits how
 long you can sit idle independently of being in a transaction, but that
 isn't this patch.
 Like I said, I'm marginally in favor of terminating the connection, but
 I'd be completely satisfied with a timeout which aborted the transaction
 instead.  Assuming that change doesn't derail this patch and keep it
 from getting into 9.5, of course.

The change is as simple as changing the ereport from FATAL to ERROR.

Attached is a new patch doing it that way.

-- 
Vik

*** a/contrib/postgres_fdw/connection.c
--- b/contrib/postgres_fdw/connection.c
***
*** 343,348  configure_remote_session(PGconn *conn)
--- 343,356 
  		do_sql_command(conn, "SET extra_float_digits = 3");
  	else
  		do_sql_command(conn, "SET extra_float_digits = 2");
+ 
+ 	/*
+ 	 * Ensure the remote server doesn't abort our transaction if we keep it idle
+ 	 * for too long.
+ 	 * XXX: The version number must be corrected prior to commit!
+ 	 */
+ 	if (remoteversion >= 90400)
+ 		do_sql_command(conn, "SET idle_in_transaction_timeout = 0");
  }
  
  /*
*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 5545,5550  COPY postgres_log FROM '/full/path/to/logfile.csv' WITH csv;
--- 5545,5568 
/listitem
   /varlistentry
  
+  varlistentry id=guc-idle-in-transaction-timeout xreflabel=idle_in_transaction_timeout
+   termvarnameidle_in_transaction_timeout/varname (typeinteger/type)
+   indexterm
+primaryvarnameidle_in_transaction_timeout/ configuration parameter/primary
+   /indexterm
+   /term
+   listitem
+para
+Abort any transaction that has been idle for longer than the specified
+duration in seconds. This allows any locks the transaction may have taken
+to be released.
+/para
+para
+A value of zero (the default) turns this off.
+/para
+   /listitem
+  /varlistentry
+ 
   varlistentry id=guc-vacuum-freeze-table-age xreflabel=vacuum_freeze_table_age
termvarnamevacuum_freeze_table_age/varname (typeinteger/type)
indexterm
*** a/doc/src/sgml/mvcc.sgml
--- b/doc/src/sgml/mvcc.sgml
***
*** 676,682  ERROR:  could not serialize access due to read/write dependencies among transact
   listitem
para
 Don't leave connections dangling quoteidle in transaction/quote
!longer than necessary.
/para
   /listitem
   listitem
--- 676,684 
   listitem
para
 Don't leave connections dangling quoteidle in transaction/quote
!longer than necessary.  The configuration parameter
!xref linkend=guc-idle-in-transaction-timeout may be used to
!automatically abort any such transactions.
/para
   /listitem
   listitem
*** a/src/backend/storage/lmgr/proc.c
--- b/src/backend/storage/lmgr/proc.c
***
*** 57,62 
--- 57,63 
  int			DeadlockTimeout = 1000;
  int			StatementTimeout = 0;
  int			LockTimeout = 0;
+ int			IdleInTransactionTimeout = 0;
  bool		log_lock_waits = false;
  
  /* Pointer to this process's PGPROC and PGXACT structs, if any */
*** a/src/backend/tcop/postgres.c
--- b/src/backend/tcop/postgres.c
***
*** 3947,3952  PostgresMain(int argc, char *argv[],
--- 3947,3956 
  			{
  set_ps_display(idle in transaction, false);
  pgstat_report_activity(STATE_IDLEINTRANSACTION, NULL);
+ 
+ /* Start the idle-in-transaction timer, in seconds */
+ if (IdleInTransactionTimeout > 0)
+ 	enable_timeout_after(IDLE_IN_TRANSACTION_TIMEOUT, 1000*IdleInTransactionTimeout);
  			}
  			else
  			{
***
*** 3980,3986  PostgresMain(int argc, char *argv[],
  		DoingCommandRead = false;
  
  		/*
! 		 * (5) check for any other interesting events that happened while we
  		 * slept.
  		 */
  		if (got_SIGHUP)
--- 3984,3996 
  		DoingCommandRead = false;
  
  		/*
! 	

Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread David G Johnston
On Tue, Jun 3, 2014 at 7:40 PM, Josh Berkus [via PostgreSQL] 
ml-node+s1045698n5805933...@n5.nabble.com wrote:

 On 06/03/2014 02:53 PM, Tom Lane wrote:

  Josh Berkus [hidden email]
 http://user/SendEmail.jtp?type=nodenode=5805933i=0 writes:
  Out of curiosity, how much harder would it have been just to abort the
  transaction?  I think breaking the connection is probably the right
  behavior, but before folks start arguing it out, I wanted to know if
  aborting the transaction is even a reasonable thing to do.
 
  FWIW, I think aborting the transaction is probably better, especially
  if the patch is designed to do nothing to already-aborted transactions.
  If the client is still there, it will see the abort as a failure in its
  next query, which is less likely to confuse it completely than a
  connection loss.  (I think, anyway.)

 Personally, I think we'll get about equal amounts of confusion either way.

  The argument that we might want to close the connection to free up
  connection slots doesn't seem to me to hold water as long as we allow
  a client that *isn't* inside a transaction to sit on an idle connection
  forever.  Perhaps there is room for a second timeout that limits how
  long you can sit idle independently of being in a transaction, but that
  isn't this patch.

 Like I said, I'm marginally in favor of terminating the connection, but
 I'd be completely satisfied with a timeout which aborted the transaction
 instead.  Assuming that change doesn't derail this patch and keep it
 from getting into 9.5, of course.


Default to dropping the connection but give the users & administrators the
capability to decide for themselves?

I still haven't heard an argument for why this wouldn't apply to aborted
idle-in-transactions.  I get the focus is on releasing locks but if the
transaction fails but still hangs around forever it is just as broken as
one that doesn't fail and hangs around forever.  Even if you limit the end
result to only aborting the transaction the end-user will likely want to
distinguish between their transaction failing and their failed transaction
remaining idle too long - if only to avoid the situation where they make
the transaction no longer fail but still hit the timeout.

Whether a connection is a resource the DBA wants to restore (at the expense
of better server-client communication) is a parental decision we shouldn't
force on them given how direct (feelings about GUCs aside).  The decision
to force the release of locks - the primary purpose of the patch - is made
by simply using the setting in the first place.

David J.




--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/idle-in-transaction-timeout-tp5805859p5805936.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.

Re: [HACKERS] Hide 'Execution time' in EXPLAIN (COSTS OFF)

2014-06-03 Thread Noah Misch
On Tue, Jun 03, 2014 at 08:05:48PM +0200, Andres Freund wrote:
 In 9.4. COSTS OFF for EXPLAIN prevents 'Planning time' to be
 printed. Should we perhaps do the same for 'Execution time'? That'd make
 it possible to use EXPLAIN (ANALYZE, COSTS OFF, TIMING OFF) in
 regression tests.

I have wanted and would use such an option.  I don't have a definite opinion
about how to spell the option.

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] BUG #9652: inet types don't support min/max

2014-06-03 Thread Haribabu Kommi
On Wed, Jun 4, 2014 at 5:46 AM, Keith Fiske ke...@omniti.com wrote:

 Andres's changes on June 3rd to
 https://github.com/postgres/postgres/commits/master/src/test/regress/expected/create_function_3.out
 are causing patch v2 to fail for that regression test file.

 postgres $ patch -p1 -i ../inet_agg_v2.patch
 patching file src/backend/utils/adt/network.c
 patching file src/include/catalog/pg_aggregate.h
 patching file src/include/catalog/pg_proc.h
 patching file src/include/utils/builtins.h
 patching file src/test/regress/expected/create_function_3.out
 Hunk #1 FAILED at 153.
 1 out of 1 hunk FAILED -- saving rejects to file
 src/test/regress/expected/create_function_3.out.rej
 patching file src/test/regress/expected/inet.out
 patching file src/test/regress/sql/inet.sql

 Otherwise it applies without any issues to the latest HEAD. I built and
 started a new instance, and I was able to test at least the basic min/max
 functionality

 keith=# create table test_inet (id serial, ipaddress inet);
 CREATE TABLE
 Time: 25.546 ms
 keith=# insert into test_inet (ipaddress) values ('192.168.1.1');
 INSERT 0 1
 Time: 3.143 ms
 keith=# insert into test_inet (ipaddress) values ('192.168.1.2');
 INSERT 0 1
 Time: 2.932 ms
 keith=# insert into test_inet (ipaddress) values ('127.0.0.1');
 INSERT 0 1
 Time: 1.786 ms
 keith=# select min(ipaddress) from test_inet;
 min
 ---
  127.0.0.1
 (1 row)

 Time: 3.371 ms
 keith=# select max(ipaddress) from test_inet;
  max
 -
  192.168.1.2
 (1 row)

 Time: 1.104 ms

Thanks for the test. Patch is re-based to the latest head.

Regards,
Hari Babu
Fujitsu Australia


inet_agg_v3.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Vik Fearing
On 06/04/2014 02:01 AM, David G Johnston wrote:
  Default to dropping the connection but give the users & administrators
 the capability to decide for themselves?

Meh.

 I still haven't heard an argument for why this wouldn't apply to
  aborted idle-in-transactions.  I get the focus is on releasing locks
 but if the transaction fails but still hangs around forever it is just
 as broken as one that doesn't fail and hangs around forever. 

My main concern was with locks and blocking VACUUM.  Aborted
transactions don't do either of those things.  The correct solution is
to terminate aborted transactions, too, or not terminate anything and
abort the idle ones.

 Even if you limit the end result to only aborting the transaction the
 end-user will likely want to distinguish between their transaction
 failing and their failed transaction remaining idle too long - if only
 to avoid the situation where they make the transaction no longer fail
 but still hit the timeout.

But hitting the timeout *is* failing.

With the new patch, the first query will say that the transaction was
aborted due to timeout.  Subsequent queries will do as they've always done.

-- 
Vik



Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Robert Haas
On Tue, Jun 3, 2014 at 5:53 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Josh Berkus j...@agliodbs.com writes:
 Out of curiosity, how much harder would it have been just to abort the
 transaction?  I think breaking the connection is probably the right
 behavior, but before folks start arguing it out, I wanted to know if
 aborting the transaction is even a reasonable thing to do.

 FWIW, I think aborting the transaction is probably better, especially
 if the patch is designed to do nothing to already-aborted transactions.
 If the client is still there, it will see the abort as a failure in its
 next query, which is less likely to confuse it completely than a
 connection loss.  (I think, anyway.)

I thought the reason why this hasn't been implemented before now is
that sending an ErrorResponse to the client will result in a loss of
protocol sync.  Sure, when the client sends the next query, they'll
then read the ErrorResponse.  So far so good.  The problem is that
they *won't* read whatever we send back as a response to their query,
because they think they already have, when in reality they've only
read the ErrorResponse sent much earlier.  This, at least as I've
understood it, is the reason why recovery conflicts kill off idle
sessions entirely instead of just aborting the transaction.  Andres
tried to fix that problem a few years ago without much luck; the most
relevant post to this particular issue seems to be:

http://www.postgresql.org/message-id/23631.1292521...@sss.pgh.pa.us

Assuming that the state of play hasn't changed in some way I'm unaware
of since 2010, I think the best argument for FATAL here is that it's
what we can implement.  I happen to think it's better anyway, because
the cases I've seen where this would actually be useful involve
badly-written applications that are not under the same administrative
control as the database and supposedly have built-in connection
poolers, except sometimes they seem to forget about some of the
connections they themselves opened.  The DBAs can't make the app
developers fix the app; they just have to try to survive.  Aborting
the transaction is a step in the right direction but since the guy at
the other end of the connection is actually or in effect dead, that
just leaves you with an eternally idle connection.  Now we can say "use
TCP keepalives for that", but that only helps if the connection has
actually been severed; if the guy at the other end is still
technically there but is too brain-damaged to actually try to *use*
the connection for anything, it doesn't help.  Also, as I recently
discovered, there are still a few machines out there that don't
actually support TCP keepalives on a per-connection basis; you can
only configure it system-wide, and that's not always granular enough.
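
(For reference, the per-connection knobs alluded to above do exist as GUCs
on platforms that support them; a minimal sketch with illustrative values,
settable per session as shown or cluster-wide in postgresql.conf:)

SET tcp_keepalives_idle = 60;      -- seconds of inactivity before the first probe
SET tcp_keepalives_interval = 10;  -- seconds between probes
SET tcp_keepalives_count = 3;      -- unanswered probes before the connection is considered dead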

Anyway, if somebody really wants to go to the trouble of finding a way
to make this work without killing off the connection, I think that
would be cool and useful and whatever technology we develop there
could doubtless could be applied to other situations.  But I have a
nervous feeling that might be a hard enough problem to sink the whole
patch, which would be a shame, since the cases I've actually
encountered would be better off with FATAL anyway.

Just my $0.017 to go with Josh's $0.018.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Perl coding error in msvc build system?

2014-06-03 Thread Peter Eisentraut

I'm not sure whether the following coding actually detects any errors:

Solution.pm:

open(P, "cl /? 2>&1 |") || die "cl command not found";

VSObjectFactory.pm:

open(P, "nmake /? 2>&1 |")
  || croak
"Unable to determine Visual Studio version: The nmake command wasn't found.";

The problem is that because of the shell special characters, the command
is run through the shell, but the shell ignores the exit status of
anything but the last command in the pipe.  And the perlopentut man page
says "In such a case, the open call will only indicate failure if Perl
can't even run the shell."  I can confirm that on a *nix system.  It
might be different on Windows, but it doesn't seem so.

The whole thing could probably be written in a less complicated way
using backticks, like

--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -72,16 +72,13 @@ sub DeterminePlatform
 
# Examine CL help output to determine if we are in 32 or 64-bit mode.
 	$self->{platform} = 'Win32';
-	open(P, "cl /? 2>&1 |") || die "cl command not found";
-	while (<P>)
+	my $output = `cl /? 2>&1`;
+	$? >> 8 == 0 or die "cl command not found";
+	if ($output =~ /^\/favor:.+AMD64/)
 	{
-		if (/^\/favor:.+AMD64/)
-		{
-			$self->{platform} = 'x64';
-			last;
-		}
+		$self->{platform} = 'x64';
+		last;
 	}
-	close(P);
 	print "Detected hardware platform: $self->{platform}\n";
 }







Re: [HACKERS] pg_stat directory and pg_stat_statements

2014-06-03 Thread Fujii Masao
On Tue, Jun 3, 2014 at 6:38 PM, Magnus Hagander mag...@hagander.net wrote:
 On Mon, Jun 2, 2014 at 4:07 PM, Andres Freund and...@2ndquadrant.com
 wrote:

 On 2014-06-02 22:59:55 +0900, Fujii Masao wrote:
  On Thu, May 29, 2014 at 3:02 PM, Peter Geoghegan p...@heroku.com wrote:
   On Wed, May 28, 2014 at 10:49 PM, Fujii Masao masao.fu...@gmail.com
   wrote:
   You're concerned about the scenario using pg_upgrade? I'm not sure
   the detail
   of pg_upgrade. But if it doesn't work properly, we should have gotten
   the trouble
  
   I'm not worried about pg_upgrade, because by design pg_stat_statements
   will discard stats files that originated in earlier versions. However,
   I don't see a need to change pg_stat_statements to serialize its
   statistics to disk in the pg_stat directory before we branch off 9.4.
   As you mentioned, it's harmless.
 
  Yeah, that's an idea. OTOH, there is no *strong* reason to postpone
  the fix to 9.5. So I just feel inclined to apply the fix now...

 +1 for fixing it now.


 +1 for fixing it for 9.4 before the next beta, but *not* backpatching it to
 9.3 - it *is* a behaviour change after all..

Yep, I just applied the patch only to HEAD.

Regards,

-- 
Fujii Masao




Re: [HACKERS] idle_in_transaction_timeout

2014-06-03 Thread Tom Lane
Robert Haas robertmh...@gmail.com writes:
 I thought the reason why this hasn't been implemented before now is
 that sending an ErrorResponse to the client will result in a loss of
 protocol sync.

Hmm ... you are right that this isn't as simple as an ereport(ERROR),
but I'm not sure it's impossible.  We could for instance put the backend
into skip-till-Sync state so that it effectively ignored the next command
message.  Causing that to happen might be impracticably messy, though.

I'm not sure whether cancel-transaction behavior is enough better than
cancel-session to warrant extra effort here.

regards, tom lane




Re: [HACKERS] recovery testing for beta

2014-06-03 Thread Amit Kapila
On Mon, Jun 2, 2014 at 9:44 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-02 09:03:25 -0700, Jeff Janes wrote:
  On Fri, May 30, 2014 at 8:09 PM, Amit Kapila amit.kapil...@gmail.com
   I think this is useful information and can be even included in core
   code.

 I'd like to include something, but I think those are a bit long...

There could be multiple options:
Option-1:
Delta encoded tuple/Compressed tuple - if the tuple is prefix and/or suffix
encoded, and don't mention anything otherwise.

Option-2:
Prefix delta encoded tuple/Suffix delta encoded tuple/Delta encoded
tuple - depending on whether the tuple contains prefix, suffix, or both
types of encoding.

  Non-HOT updates can also be compressed, if they happen to land in the
same
  page as the old version, so I copied that code into the non-HOT update
  section as well.

 Right.

I shall include this in the updated patch.

  GNU make does not realize that pg_xlogdump depends
  on src/backend/access/rmgrdesc/heapdesc.c.  (I don't know how or why it has
  that dependency, but changes did not take effect with a simple make
  install.)  Is that a known issue?  Is there some way to fix it?

 Hm. I can't reproduce this here. A simple 'touch heapdesc.c' triggers a
 rebuild of pg_xlogdump for me. Could you include the make output?

On Windows, there is a separate copy of the *desc.c files for pg_xlogdump,
so unless I regenerate the files (perl mkvcbuild.pl), changes made
in src/backend/access/rmgrdesc/*desc.c don't take effect.

I think it is done as per the below code in Mkvcbuild.pm:
foreach my $xf (glob('src/backend/access/rmgrdesc/*desc.c'))
{
	my $bf = basename $xf;
	copy($xf, "contrib/pg_xlogdump/$bf");
	$pg_xlogdump->AddFile("contrib\\pg_xlogdump\\$bf");
}
copy(
	'src/backend/access/transam/xlogreader.c',
	'contrib/pg_xlogdump/xlogreader.c');

Note - I think it would have been better to discuss the specifics of
pg_xlogdump in a separate thread; however, as the discussion
started here, I am also replying on this thread.  I shall post an
update with the conclusion of this in a new thread if a patch is required.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Proposing pg_hibernate

2014-06-03 Thread Amit Kapila
On Tue, Jun 3, 2014 at 5:43 PM, Gurjeet Singh gurj...@singh.im wrote:
 On Tue, Jun 3, 2014 at 7:57 AM, Robert Haas robertmh...@gmail.com wrote:
  It seems like it would be best to try to do this at cluster startup
  time, rather than once recovery has reached consistency.  Of course,
  that might mean doing it with a single process, which could have its
  own share of problems.  But I'm somewhat inclined to think that if
  recovery has already run for a significant period of time, the blocks
  that recovery has brought into shared_buffers are more likely to be
  useful than whatever pg_hibernate would load.

 I am not absolutely sure of the order of execution between recovery
 process and the BGWorker, but ...

 For sizeable shared_buffers size, the restoration of the shared
 buffers can take several seconds.

In case of recovery, the shared buffers saved by this utility are
from the previous shutdown, which don't seem to be of more use
than the buffers loaded by recovery.

 I have a feeling the users wouldn't
 like their master database take up to a few minutes to start accepting
 connections.

I think this is a fair point, and to address it we can have an option to
decide when to load the buffers, with the default being to load them before
recovery.

 Currently pg_hibernator uses ReadBufferExtended() API, and AIUI, that
 API requires a database connection/shared-memory attachment, and that
 a backend process cannot switch between databases after the initial
 connection.

If recovery can load the buffers to apply WAL, why can't it be done with
pg_hibernator?  Can't we use ReadBufferWithoutRelcache() to achieve
the purpose of pg_hibernator?

One other point:
 Note that the BuffersSaver process exists at all times, even when this
 parameter is set to `false`. This is to allow the DBA to enable/disable
the
 extension without having to restart the server. The BufferSaver process
 checks this parameter during server startup and right before shutdown, and
 honors this parameter's value at that time.

Why can't it be done when the user registers the extension, by using the
dynamic background worker facility RegisterDynamicBackgroundWorker?

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com