Re: [HACKERS] postgresql.auto.conf read from wrong directory

2014-06-13 Thread Amit Kapila
On Thu, Jun 12, 2014 at 7:26 PM, Fujii Masao masao.fu...@gmail.com wrote:

 On Tue, May 27, 2014 at 2:05 PM, Amit Kapila amit.kapil...@gmail.com
wrote:
  On Sun, May 11, 2014 at 11:23 PM, Tom Lane t...@sss.pgh.pa.us wrote:
  I think it's clearly *necessary* to forbid setting data_directory in
  postgresql.auto.conf.  The file is defined to be found in the data
  directory, so any such setting is circular logic by definition;
  no good can come of not rejecting it.
 
  We already have a GUC flag bit about disallowing certain variables
  in the config file (though I don't remember if it's enforced or
  just advisory).  It seems to me that we'd better invent one for
  disallowing in ALTER SYSTEM, as well.
 
  I introduced a new flag bit (GUC_DISALLOW_IN_AUTO_FILE) to
  disallow parameters by Alter System and disallowed data_directory to
  be set by Alter System.

 We should document what types of parameters are not allowed to be set by
 ALTER SYSTEM SET?

Agreed, I had mentioned in Notes section of document.  Apart from that
I had disallowed parameters that are excluded from postgresql.conf by
initdb (Developer options) and they are recommended in user manual
to be not used in production.

 data_directory was displayed when I typed TAB just after ALTER SYSTEM
SET.
 Probably tab-completion for ALTER SYSTEM SET needs to be changed.

This information is not stored in pg_settings.  One way is to specify
manually all the parameters which are disallowed but it seems the query
will become clumsy, another could be to have another column in pg_settings.
Do you think of any other way?


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


prohibit_data_dir_by_alter_system-v2.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] view reloptions

2014-06-13 Thread Jaime Casanova
On Wed, Jun 11, 2014 at 2:46 PM, Alvaro Herrera
alvhe...@2ndquadrant.com wrote:
 I just noticed by chance that view relations are using StdRdOptions to
 allocate their reloptions.  I can't find any reason for this, other than
 someone failed to realize that they should instead have a struct defined
 of its own, just like (say) GIN indexes do.  Views using StdRdOptions is
 quite pointless, given that it's used for stuff like fillfactor and
 autovacuum, neither of which apply to views.

 9.2 added security_barrier to StdRdOptions, and 9.4 is now adding
 check_option_offset, which is a string reloption with some funny rules.

[...]
 2) it would mean that security_barrier would change for external code
 that expects StdRdOptions rather than, say, ViewOptions
 3) I don't have time to do the legwork before CF1 anyway

 If we don't change it now, I'm afraid we'll be stuck with using
 StdRdOptions for views for all eternity.

 (It's a pity I didn't become aware of this earlier in 9.4 cycle when I
 added the multixact freeze reloptions ... I could have promoted a patch
 back then.)


Attached is a patch moving the reloptions of views into its own structure.
i also created a view_reloptions() function in order to not use
heap_reloptions() for views, but maybe that was too much?

i haven't seen the check_option_offset thingy yet, but i hope to take
a look at that tomorrow.

-- 
Jaime Casanova www.2ndQuadrant.com
Professional PostgreSQL: Soporte 24x7 y capacitación
Phone: +593 4 5107566 Cell: +593 987171157
From 87ed78a4f276484a37917c417286a16082030f13 Mon Sep 17 00:00:00 2001
From: Jaime Casanova ja...@2ndquadrant.com
Date: Fri, 13 Jun 2014 01:10:24 -0500
Subject: [PATCH] Move reloptions from views into its own structure.
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Per gripe from Álvaro Herrera.
---
 src/backend/access/common/reloptions.c |   42 ++-
 src/backend/commands/tablecmds.c   |9 +-
 src/include/access/reloptions.h|1 +
 src/include/utils/rel.h|   43 
 src/tools/pgindent/typedefs.list   |1 +
 5 files changed, 71 insertions(+), 25 deletions(-)

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 522b671..c7ad6f9 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -834,10 +834,12 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc, Oid amoptions)
 	{
 		case RELKIND_RELATION:
 		case RELKIND_TOASTVALUE:
-		case RELKIND_VIEW:
 		case RELKIND_MATVIEW:
 			options = heap_reloptions(classForm-relkind, datum, false);
 			break;
+		case RELKIND_VIEW:
+			options = view_reloptions(datum, false);
+			break;
 		case RELKIND_INDEX:
 			options = index_reloptions(amoptions, datum, false);
 			break;
@@ -1200,10 +1202,6 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, vacuum_scale_factor)},
 		{autovacuum_analyze_scale_factor, RELOPT_TYPE_REAL,
 		offsetof(StdRdOptions, autovacuum) +offsetof(AutoVacOpts, analyze_scale_factor)},
-		{security_barrier, RELOPT_TYPE_BOOL,
-		offsetof(StdRdOptions, security_barrier)},
-		{check_option, RELOPT_TYPE_STRING,
-		offsetof(StdRdOptions, check_option_offset)},
 		{user_catalog_table, RELOPT_TYPE_BOOL,
 		offsetof(StdRdOptions, user_catalog_table)}
 	};
@@ -1225,6 +1223,38 @@ default_reloptions(Datum reloptions, bool validate, relopt_kind kind)
 }
 
 /*
+ * Option parser for views
+ */
+bytea *
+view_reloptions(Datum reloptions, bool validate)
+{
+	relopt_value *options;
+	ViewOptions *vopts;
+	int			numoptions;
+	static const relopt_parse_elt tab[] = {
+		{security_barrier, RELOPT_TYPE_BOOL,
+		offsetof(ViewOptions, security_barrier)},
+		{check_option, RELOPT_TYPE_STRING,
+		offsetof(ViewOptions, check_option_offset)}
+	};
+
+	options = parseRelOptions(reloptions, validate, RELOPT_KIND_VIEW, numoptions);
+
+	/* if none set, we're done */
+	if (numoptions == 0)
+		return NULL;
+
+	vopts = allocateReloptStruct(sizeof(ViewOptions), options, numoptions);
+
+	fillRelOptions((void *) vopts, sizeof(ViewOptions), options, numoptions,
+   validate, tab, lengthof(tab));
+
+	pfree(options);
+
+	return (bytea *) vopts;
+}
+
+/*
  * Parse options for heaps, views and toast tables.
  */
 bytea *
@@ -1248,8 +1278,6 @@ heap_reloptions(char relkind, Datum reloptions, bool validate)
 		case RELKIND_RELATION:
 		case RELKIND_MATVIEW:
 			return default_reloptions(reloptions, validate, RELOPT_KIND_HEAP);
-		case RELKIND_VIEW:
-			return default_reloptions(reloptions, validate, RELOPT_KIND_VIEW);
 		default:
 			/* other relkinds are not supported */
 			return NULL;
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 341262b..fd27cdb 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c

[HACKERS] loading .so file at run time

2014-06-13 Thread Rajmohan C
I am working with Postgresql 9.3.4 source using eclipse IDE in ubuntu
14.04. I have a library file abc.so to be loaded at run time when
postgresql server starts. When I run the server with

-D /home/rajmohan/projects/TPCH_database

 as argument in run configuration, I could see LOG message abc.so library
loaded and I am able to call library methods at run time. But this requires
a client to connect to server and pass cmds. But I want a stand-alone
postgres server such that I can give commands directly from eclipse command
prompt itself. For that when I set the argument in run configuration as

--single -D /home/rajmohan/projects/TPCH_database tpch,

the library loaded log message doesnot appears and library does not load.
How to load the library on stand-alone server and work?


Re: [HACKERS] make check For Extensions

2014-06-13 Thread Fabien COELHO


That does not mean that it starts a new cluster on a port. It means it 
will test it against an existing cluster after you have installed into 
that cluster.


Yes, that is what I was saying.

It invokes psql which is expected to work directly. Note that there 
is no temporary installation, it is tested against the installed and 
running postgres. Maybe having the ability to create a temporary 
installation, as you suggest, would be a nice extension.


Yes, that’s what I would like, so I could test *before* installing.


I would suggest to add that to https://wiki.postgresql.org/wiki/Todo.

I may look into it when I have time, over the summer. The key point is 
that there is no need for a temporary installation, but only of a 
temporary cluster, and to trick this cluster into loading the uninstalled 
extension, maybe by playing with dynamic_library_path in the temporary 
cluster.


--
Fabien.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] API change advice: Passing plan invalidation info from the rewriter into the planner?

2014-06-13 Thread Dean Rasheed
On 13 June 2014 01:13, Stephen Frost sfr...@snowman.net wrote:
 Greg, all,

 I will reply to the emails in detail when I get a chance but am out of town
 at a funeral, so it'll likely be delayed. I did want to echo my agreement
 for the most part with Greg and in particular...

 On Thursday, June 12, 2014, Gregory Smith gregsmithpg...@gmail.com wrote:

 On 6/11/14, 10:26 AM, Robert Haas wrote:

 Now, as soon as we introduce the concept that selecting from a table
 might not really mean read from the table but read from the table after
 applying this owner-specified qual, we're opening up a whole new set of
 attack surfaces. Every pg_dump is an opportunity to hack somebody else's
 account, or at least audit their activity.


 I'm in full agreement we should clearly communicate the issues around
 pg_dump in particular, because they can't necessarily be eliminated
 altogether without some major work that's going to take a while to finish.
 And if the work-around is some sort of GUC for killing RLS altogether,
 that's ugly but not unacceptable to me as a short-term fix.


 A GUC which is enable / disable / error-instead may work quiet well, with
 error-instead for pg_dump default if people really want it (there would have
 to be a way to disable that though, imv).

 Note that enable is default in general, disable would be for superuser only
 (or on start-up) to disable everything, and error-instead anyone could use
 but it would error instead of implementing RLS when querying an RLS-enabled
 table.

 This approach was suggested by an existing user testing out this RLS
 approach, to be fair, but it looks pretty sane to me as a way to address
 some of these concerns. Certainly open to other ideas and thoughts though.


Yeah, I was thinking something like this could work, but I would go
further. Suppose you had separate GRANTable privileges for direct
access to individual tables, bypassing RLS, e.g.

  GRANT DIRECT SELECT|INSERT|UPDATE|DELETE ON table_name TO role_name

Combined with the GUC (direct_table_access, say) to request direct
access to all tables. Then with direct_table_access = true/required, a
SELECT from a table would error if the user hadn't been granted the
DIRECT SELECT privilege on all the tables referenced in the query.
Tools like pg_dump would require direct_table_access, but there might
be other levels of access that didn't error out.

I think if I were using RLS, I would definitely want/expect this level
of fine-grained control over permissions on a per-table basis, rather
than the superuser/non-superuser level of control, or having
RLS-exempt users.

Actually, given the fact that the majority of users won't be using
RLS, I would be tempted to invert the above logic and have the new
privilege be for LIMITED access (via RLS quals). So a user granted the
normal SELECT privilege would be able to bypass RLS, but a user only
granted LIMITED SELECT wouldn't.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] loading .so file at run time

2014-06-13 Thread Michael Paquier
On Fri, Jun 13, 2014 at 3:29 PM, Rajmohan C csrajmo...@gmail.com wrote:
 I am working with Postgresql 9.3.4 source using eclipse IDE in ubuntu 14.04.
 I have a library file abc.so to be loaded at run time when postgresql server
 starts. When I run the server with

 -D /home/rajmohan/projects/TPCH_database

  as argument in run configuration, I could see LOG message abc.so library
 loaded and I am able to call library methods at run time. But this requires
 a client to connect to server and pass cmds. But I want a stand-alone
 postgres server such that I can give commands directly from eclipse command
 prompt itself. For that when I set the argument in run configuration as

 --single -D /home/rajmohan/projects/TPCH_database tpch,

 the library loaded log message doesnot appears and library does not load.
 How to load the library on stand-alone server and work?
Is it a custom extension that you are developing? Here you go:
http://www.postgresql.org/docs/devel/static/runtime-config-client.html#RUNTIME-CONFIG-CLIENT-PRELOAD
For an extension that needs to be loaded use shared_preload_libraries.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Quan Zongliang

Hi all,

Please find the attachment.

By my friend asking, for convenience,
support to define multi variables in single PL/pgSQL line.

Like this:

CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
DECLARE
local_a, local_b, local_c text := 'a1';
BEGIN
return local_a || local_b || local_c;
end;
$$ LANGUAGE plpgsql;


Regards,
Quan Zongliang


---
此电子邮件没有病毒和恶意软件,因为 avast! 防病毒保护处于活动状态。
http://www.avast.com
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index e3a992c..2737a3a 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -167,6 +167,7 @@ static	List			*read_raise_options(void);
 
 %type declhdr decl_sect
 %type varname decl_varname
+%type list	decl_varnames
 %type boolean	decl_const decl_notnull exit_type
 %type expr	decl_defval decl_cursor_query
 %type dtype	decl_datatype
@@ -471,9 +472,10 @@ decl_stmt		: decl_statement
 	}
 ;
 
-decl_statement	: decl_varname decl_const decl_datatype decl_collate decl_notnull decl_defval
+decl_statement	: decl_varnames decl_const decl_datatype decl_collate decl_notnull decl_defval
 	{
 		PLpgSQL_variable	*var;
+		ListCell *lc;
 
 		/*
 		 * If a collation is supplied, insert it into the
@@ -492,38 +494,44 @@ decl_statement	: decl_varname decl_const decl_datatype decl_collate decl_notnull
 			$3-collation = $4;
 		}
 
-		var = plpgsql_build_variable($1.name, $1.lineno,
-	 $3, true);
-		if ($2)
-		{
-			if (var-dtype == PLPGSQL_DTYPE_VAR)
-((PLpgSQL_var *) var)-isconst = $2;
-			else
-ereport(ERROR,
-		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		 errmsg(row or record variable cannot be CONSTANT),
-		 parser_errposition(@2)));
-		}
-		if ($5)
-		{
-			if (var-dtype == PLPGSQL_DTYPE_VAR)
-((PLpgSQL_var *) var)-notnull = $5;
-			else
-ereport(ERROR,
-		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		 errmsg(row or record variable cannot be NOT NULL),
-		 parser_errposition(@4)));
 
-		}
-		if ($6 != NULL)
+		foreach(lc, $1)
 		{
-			if (var-dtype == PLPGSQL_DTYPE_VAR)
-((PLpgSQL_var *) var)-default_val = $6;
-			else
-ereport(ERROR,
-		(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
-		 errmsg(default value for row or record variable is not supported),
-		 parser_errposition(@5)));
+			YYSTYPE *v = (YYSTYPE *) lfirst(lc);
+
+			var = plpgsql_build_variable(v-varname.name, v-varname.lineno,
+	 	$3, true);
+			if ($2)
+			{
+if (var-dtype == PLPGSQL_DTYPE_VAR)
+	((PLpgSQL_var *) var)-isconst = $2;
+else
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg(row or record variable cannot be CONSTANT),
+			 parser_errposition(@2)));
+			}
+			if ($5)
+			{
+if (var-dtype == PLPGSQL_DTYPE_VAR)
+	((PLpgSQL_var *) var)-notnull = $5;
+else
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg(row or record variable cannot be NOT NULL),
+			 parser_errposition(@4)));
+
+			}
+			if ($6 != NULL)
+			{
+if (var-dtype == PLPGSQL_DTYPE_VAR)
+	((PLpgSQL_var *) var)-default_val = $6;
+else
+	ereport(ERROR,
+			(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+			 errmsg(default value for row or record variable is not supported),
+			 parser_errposition(@5)));
+			}
 		}
 	}
 | decl_varname K_ALIAS K_FOR decl_aliasitem ';'
@@ -773,6 +781,22 @@ decl_varname	: T_WORD
 	}
 ;
 
+decl_varnames : decl_varname
+	{
+		YYSTYPE *v = palloc(sizeof(YYSTYPE));
+		v-varname.name = pstrdup($1.name);
+		v-varname.lineno = $1.lineno;
+		$$ = list_make1(v);
+	}
+| decl_varnames ',' decl_varname
+	{
+		YYSTYPE *v = palloc(sizeof(YYSTYPE));
+		v-varname.name = pstrdup($3.name);
+		v-varname.lineno = $3.lineno;
+		$$ = lappend($1, v);
+	}
+;
+
 decl_const		:
 	{ $$ = false; }
 | K_CONSTANT

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_xlogdump --stats

2014-06-13 Thread Abhijit Menon-Sen
At 2014-06-10 18:04:13 +0900, furu...@pm.nttdata.co.jp wrote:

 The function works fine. It is a good to the learning of PostgreSQL.

Thanks for having a look.

 pg_xlogdump.c: In function ‘XLogDumpStatsRow’:
 pg_xlogdump.c:851: warning: format ‘%20llu’ expects type ‘long long unsigned 
 int’, but argument 3 has type ‘uint64’

I've changed this to use %zu at Álvaro's suggestion. I'll post an
updated patch after I've finished some (unrelated) refactoring.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Amit Kapila
On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
  Function pg_create_logical_replication_slot() is trying to
  save slot twice once during CreateInitDecodingContext() and
  then in ReplicationSlotPersist(), isn't it better if we can make
  it do it just once?

 Doesn't work here. In the first save it's not yet marked as persistent -
 but we still need to safely reserve the xmin.

Okay, but if it crashes before saving the persistency to permanent
file and there remains a .tmp for this replication slot which it created
during save of this persistency information, then also xmin will get
lost, because during startup it will not consider such a slot.

 It's also not something
 that should happen very frequently, so I'm not worried about the
 overhead.

Right, just looking from the purpose of need for same.

  8. Is there a chance of inconsistency, if pg_restxlog resets the
  xlog and after restart, one of the slots contains xlog position
  older than what is resetted by pg_resetxlog?

 Yes. There's lots of ways to screw over your database by using
 pg_resetxlog.

Thats right, trying to think if there could be any thing which
won't even allow the server to get started due to replication slots
after pg_resetxlog.  As per my current understanding, I don't think
there is any such problem.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Pavel Stehule
Hello

+ it is natural in almost all languages including ADA
- it increases a distance between PL/pgSQL and PL/SQL


I am don't think, so this feature is necessary, but I am not against it.

Regards

Pavel



2014-06-13 9:20 GMT+02:00 Quan Zongliang quanzongli...@gmail.com:

 Hi all,

 Please find the attachment.

 By my friend asking, for convenience,
 support to define multi variables in single PL/pgSQL line.

 Like this:

 CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
 DECLARE
 local_a, local_b, local_c text := 'a1';
 BEGIN
 return local_a || local_b || local_c;
 end;
 $$ LANGUAGE plpgsql;


 Regards,
 Quan Zongliang


 ---
 此电子邮件没有病毒和恶意软件,因为 avast! 防病毒保护处于活动状态。
 http://www.avast.com


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Michael Paquier
On Fri, Jun 13, 2014 at 4:20 PM, Quan Zongliang quanzongli...@gmail.com wrote:
 By my friend asking, for convenience,
 support to define multi variables in single PL/pgSQL line.

 Like this:

 CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
 DECLARE
 local_a, local_b, local_c text := 'a1';
 BEGIN
 return local_a || local_b || local_c;
 end;
 $$ LANGUAGE plpgsql;
I don't recall that this is possible. Have a look at the docs as well:
http://www.postgresql.org/docs/current/static/plpgsql-declarations.html
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Use unique index for longer pathkeys.

2014-06-13 Thread Kyotaro HORIGUCHI
Hello, This is the continuation from the last CF.

This patch intends to make PG to use index for longer pathkeys
than index columns when,

 - The index is a unique index.
 - All index columns are NOT NULL.
 - The index column list is a subset of query_pathkeys.

The use cases for this patch are,

 - (Useless?) DISTINCT on the table with PK constraints.

   This alone is useless but this situation could take place when
   processing UNION (without ALL), which silently adds DISTINCT *
   to element queries.  Especially effective with ORDER BY and
   LIMIT clauses.  But this needs another patch for UNION
   optimization. This is the main motive for this patch.

 - Unfortunately, other use cases are somewhat boresome, mainly
   effective for essentially unnecessary ORDER BY or DISTINCT. To
   streatch a point, naively or mechanically composed queires
   could be saved by this patch..



Duing last CF, I proposed to match long pathkeys against index
columns during creating index paths. This worked fine but also it
is difficult to make sure that all side-effects are
eliminated. Finally Tom Lane suggested to truncate pathkeys while
generation of the pathkeys itself. So this patch comes.

This patch consists of mainly three parts.

 1. Mark index as 'Uniquely orders the table' if the conditions
are satisfied. This is the same from the previous patch.

 2. Collect the common primary pathkeys, which is pathkeys that
can perfectly sort all involved
RTE's. (collect_common_primary_pathkeys())

 3. Trim the query pathkeys using the common primary pathkeys.
(trim_query_pathkeys())

These steps take place between set_base_rel_sizes() and
set_base_rel_pathlists() in make_one_rel(). The reason for the
position is that the step 2 above needs all inheritence tables to
be extended in PlannerInfo and set_base_rel_sizes (currently)
does that. Then the trimmed pathkeys are used in
set_base_rel_pathlists so trim_query_pathkeys should be before
it. (This needs to be written as a comment?)

Finally, the new patch has grown up to twice in size.. Although
it seems a bit large, I think that side-effects are clearly
avoided.


This message is attatched by two patches.

 1. pathkey_and_uniqueindx_typ2_v1.patch : This new patch.

 2. pathkey_and_uniqueindx_v10_20130411.patch : The last version
of the previous approach.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index fdaa964..47b8056 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -50,6 +50,7 @@ int			geqo_threshold;
 join_search_hook_type join_search_hook = NULL;
 
 
+static List *collect_common_primary_pathkeys(PlannerInfo *root);
 static void set_base_rel_sizes(PlannerInfo *root);
 static void set_base_rel_pathlists(PlannerInfo *root);
 static void set_rel_size(PlannerInfo *root, RelOptInfo *rel,
@@ -58,6 +59,7 @@ static void set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
  Index rti, RangeTblEntry *rte);
 static void set_plain_rel_size(PlannerInfo *root, RelOptInfo *rel,
    RangeTblEntry *rte);
+static void trim_query_pathkeys(PlannerInfo * root);
 static void set_plain_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
 	   RangeTblEntry *rte);
 static void set_foreign_size(PlannerInfo *root, RelOptInfo *rel,
@@ -102,6 +104,214 @@ static void recurse_push_qual(Node *setOp, Query *topquery,
   RangeTblEntry *rte, Index rti, Node *qual);
 static void remove_unused_subquery_outputs(Query *subquery, RelOptInfo *rel);
 
+/*
+ * collect_common_primary_pathkeys: Collects common unique and non-null index
+ * pathkeys from all base relations in current root.
+ */
+
+static List *
+collect_common_primary_pathkeys(PlannerInfo *root)
+{
+	List *result = NIL;
+	Index rti;
+	int   nindex = -1;
+	List **pathkeys_array0 = NULL;
+	List **pathkeys_array = NULL;
+	List **pathkeys_array2 = NULL;
+	bool multipass = (root-simple_rel_array_size  2);
+
+	for (rti = 1 ; rti  root-simple_rel_array_size  nindex ; rti++)
+	{
+		RelOptInfo *rel = root-simple_rel_array[rti];
+		List **pktmp;
+
+		if (rel == NULL)
+			continue;
+
+		Assert (rel-relid == rti);
+
+		/*
+		 * Scan all base relations except parent relations.
+		 */
+		if (root-simple_rte_array[rti]-inh ||
+			(rel-reloptkind != RELOPT_BASEREL 
+			 rel-reloptkind != RELOPT_OTHER_MEMBER_REL))
+			continue;
+
+		/* pathkeys_array is NULL for the first iteration */
+		if (pathkeys_array == NULL)
+		{
+			ListCell *lc;
+			int i = 0;
+
+			/* Skip rigging for AND operation if not needed. */
+			if (multipass)
+			{
+nindex = 0;
+
+foreach (lc, rel-indexlist)
+{
+	IndexOptInfo *index = (IndexOptInfo *) lfirst(lc);
+
+	if (index-full_ordered  index-indpred == NIL)
+		nindex++;
+}
+
+if (nindex == 0)
+	return NULL;
+
+pathkeys_array0 = palloc0(sizeof(List*) * nindex * 2);
+pathkeys_array  = pathkeys_array0;
+	

Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Ian Barwick
Hi

On 14/06/13 16:20, Quan Zongliang wrote:
 Hi all,
 
 Please find the attachment.
 
 By my friend asking, for convenience,
 support to define multi variables in single PL/pgSQL line.
 
 Like this:
 
 CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
 DECLARE
 local_a, local_b, local_c text := 'a1';
 BEGIN
 return local_a || local_b || local_c;
 end;
 $$ LANGUAGE plpgsql;

Please submit this patch to the current commitfest:

  https://commitfest.postgresql.org/action/commitfest_view?id=22


Regards

Ian Barwick


-- 
 Ian Barwick   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Pavel Stehule
2014-06-13 9:41 GMT+02:00 Michael Paquier michael.paqu...@gmail.com:

 On Fri, Jun 13, 2014 at 4:20 PM, Quan Zongliang quanzongli...@gmail.com
 wrote:
  By my friend asking, for convenience,
  support to define multi variables in single PL/pgSQL line.
 
  Like this:
 
  CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
  DECLARE
  local_a, local_b, local_c text := 'a1';
  BEGIN
  return local_a || local_b || local_c;
  end;
  $$ LANGUAGE plpgsql;
 I don't recall that this is possible. Have a look at the docs as well:
 http://www.postgresql.org/docs/current/static/plpgsql-declarations.html
 --


It will be possible with Quan' patch :)

Pavel


  Michael


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] WAL replay bugs

2014-06-13 Thread Heikki Linnakangas

On 06/13/2014 10:14 AM, Michael Paquier wrote:

On Mon, Jun 2, 2014 at 9:55 PM, Michael Paquier
michael.paqu...@gmail.com wrote:

On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
Perhaps there are parts of what is proposed here that could be made
more generalized, like the masking functions. So do not hesitate if
you have any opinion on the matter.

OK, attached is the result of this hacking:

Buffer capture facility: check WAL replay consistency

It is a tool aimed to be used by developers and buildfarm machines
that can be used to check for consistency at page level when replaying
WAL files among several nodes of a cluster (generally master and
standby node).

This facility is made of two parts:
- A server part, where all the changes happening at page level are
captured and inserted in a file called buffer_captures located at the
root of PGDATA. Each buffer entry is masked to make the comparison
across node consistent (flags like hint bits for example) and then
each buffer is captured is with the following format as a single line
of the output file:
LSN: %08X/%08X page: PAGE_IN_HEXA
Hexadecimal format makes it easier to detect differences between
pages, and format is chosen to facilitate comparison between buffer
entries.
- A client part, located in contrib/buffer_capture_cmp, that can be
used to compare buffer captures between nodes.


Oh, you moved the masking code from the client tool to the backend. Why? 
When debugging, it's useful to have the genuine, non-masked page image 
available.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL replay bugs

2014-06-13 Thread Michael Paquier
On Fri, Jun 13, 2014 at 4:48 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 06/13/2014 10:14 AM, Michael Paquier wrote:

 On Mon, Jun 2, 2014 at 9:55 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:

 On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 Perhaps there are parts of what is proposed here that could be made
 more generalized, like the masking functions. So do not hesitate if
 you have any opinion on the matter.

 OK, attached is the result of this hacking:

 Buffer capture facility: check WAL replay consistency

 It is a tool aimed to be used by developers and buildfarm machines
 that can be used to check for consistency at page level when replaying
 WAL files among several nodes of a cluster (generally master and
 standby node).

 This facility is made of two parts:
 - A server part, where all the changes happening at page level are
 captured and inserted in a file called buffer_captures located at the
 root of PGDATA. Each buffer entry is masked to make the comparison
 across node consistent (flags like hint bits for example) and then
 each buffer is captured is with the following format as a single line
 of the output file:
 LSN: %08X/%08X page: PAGE_IN_HEXA
 Hexadecimal format makes it easier to detect differences between
 pages, and format is chosen to facilitate comparison between buffer
 entries.
 - A client part, located in contrib/buffer_capture_cmp, that can be
 used to compare buffer captures between nodes.


 Oh, you moved the masking code from the client tool to the backend. Why?
 When debugging, it's useful to have the genuine, non-masked page image
 available.
My thought is to share the CPU effort of masking between backends...
That's not a big deal to move them back to the client tool though.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WAL replay bugs

2014-06-13 Thread Michael Paquier
On Fri, Jun 13, 2014 at 4:50 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Fri, Jun 13, 2014 at 4:48 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 On 06/13/2014 10:14 AM, Michael Paquier wrote:

 On Mon, Jun 2, 2014 at 9:55 PM, Michael Paquier
 michael.paqu...@gmail.com wrote:

 On Wed, Apr 23, 2014 at 9:43 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 Perhaps there are parts of what is proposed here that could be made
 more generalized, like the masking functions. So do not hesitate if
 you have any opinion on the matter.

 OK, attached is the result of this hacking:

 Buffer capture facility: check WAL replay consistency

 It is a tool aimed to be used by developers and buildfarm machines
 that can be used to check for consistency at page level when replaying
 WAL files among several nodes of a cluster (generally master and
 standby node).

 This facility is made of two parts:
 - A server part, where all the changes happening at page level are
 captured and inserted in a file called buffer_captures located at the
 root of PGDATA. Each buffer entry is masked to make the comparison
 across node consistent (flags like hint bits for example) and then
 each buffer is captured is with the following format as a single line
 of the output file:
 LSN: %08X/%08X page: PAGE_IN_HEXA
 Hexadecimal format makes it easier to detect differences between
 pages, and format is chosen to facilitate comparison between buffer
 entries.
 - A client part, located in contrib/buffer_capture_cmp, that can be
 used to compare buffer captures between nodes.


 Oh, you moved the masking code from the client tool to the backend. Why?
 When debugging, it's useful to have the genuine, non-masked page image
 available.
 My thought is to share the CPU effort of masking between backends...
And that having a set of API to do page masking on the server side
would be useful for extensions as well. Now that I recall this was one
of the first things that came to my mind when looking at this
facility, thinking that it would be useful to have them in a separate
file, with a dedicated header.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-13 Thread Kyotaro HORIGUCHI
Hello, this is a patch that add the function to clear backup
location information to pg_resetxlog.

As per the discussion held before, this function cannot be back
patched to the older versions than 9.4. And it also slipped over
9.4 so proposed in this CF.

This simplly erases the backup location information in pg_control
written during recovery. This is seen as the output of
pg_controldata.

 | $ pg_controldata 
 | pg_control version number:942
 | Catalog version number:   201406121
 | ...
!| Backup start location:0/0
!| Backup end location:  0/0
!| End-of-backup record required:no
 | ...

Under some condition, this values sticks having valid values even
though the WAL record which indicates the end of backup won't
come. This option could be used to forcibly finish the blocked
recovery procedure.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index 915a1ed..5b80cfa 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -86,6 +86,7 @@ main(int argc, char *argv[])
 	int			c;
 	bool		force = false;
 	bool		noupdate = false;
+	bool		resetbackuplocs = false;
 	MultiXactId set_oldestmxid = 0;
 	char	   *endptr;
 	char	   *endptr2;
@@ -111,7 +112,7 @@ main(int argc, char *argv[])
 	}
 
 
-	while ((c = getopt(argc, argv, fl:m:no:O:x:e:)) != -1)
+	while ((c = getopt(argc, argv, fl:m:no:O:x:e:b)) != -1)
 	{
 		switch (c)
 		{
@@ -123,6 +124,10 @@ main(int argc, char *argv[])
 noupdate = true;
 break;
 
+			case 'b':
+resetbackuplocs = true;
+break;
+
 			case 'e':
 set_xid_epoch = strtoul(optarg, endptr, 0);
 if (endptr == optarg || *endptr != '\0')
@@ -351,6 +356,13 @@ main(int argc, char *argv[])
 		ControlFile.checkPointCopy.PrevTimeLineID = minXlogTli;
 	}
 
+	if (resetbackuplocs)
+	{
+		ControlFile.backupStartPoint = InvalidXLogRecPtr;
+		ControlFile.backupEndPoint = InvalidXLogRecPtr;
+		ControlFile.backupEndRequired = false;
+	}
+
 	if (minXlogSegNo  newXlogSegNo)
 		newXlogSegNo = minXlogSegNo;
 
@@ -1088,6 +1100,7 @@ usage(void)
 	printf(_(  -O OFFSETset next multitransaction offset\n));
 	printf(_(  -V, --versionoutput version information, then exit\n));
 	printf(_(  -x XID   set next transaction ID\n));
+	printf(_(  -b   reset backup start/end locations\n));
 	printf(_(  -?, --help   show this help, then exit\n));
 	printf(_(\nReport bugs to pgsql-b...@postgresql.org.\n));
 }

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Quan Zongliang

On 06/13/2014 03:42 PM, Ian Barwick wrote:

Hi

On 14/06/13 16:20, Quan Zongliang wrote:

Hi all,

Please find the attachment.

By my friend asking, for convenience,
support to define multi variables in single PL/pgSQL line.

Like this:

CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
DECLARE
local_a, local_b, local_c text := 'a1';
BEGIN
return local_a || local_b || local_c;
end;
$$ LANGUAGE plpgsql;


Please submit this patch to the current commitfest:

   https://commitfest.postgresql.org/action/commitfest_view?id=22


Regards

Ian Barwick



submitted
https://commitfest.postgresql.org/action/patch_view?id=1475


---
此电子邮件没有病毒和恶意软件,因为 avast! 防病毒保护处于活动状态。
http://www.avast.com



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Andres Freund
On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
 On Thu, Jun 12, 2014 at 12:45 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-12 08:55:59 +0530, Amit Kapila wrote:
   Function pg_create_logical_replication_slot() is trying to
   save slot twice once during CreateInitDecodingContext() and
   then in ReplicationSlotPersist(), isn't it better if we can make
   it do it just once?
 
  Doesn't work here. In the first save it's not yet marked as persistent -
  but we still need to safely reserve the xmin.
 
 Okay, but if it crashes before saving the persistency to permanent
 file and there remains a .tmp for this replication slot which it created
 during save of this persistency information, then also xmin will get
 lost, because during startup it will not consider such a slot.

I can't follow here. If it crashes before it's marked persistent it'll
get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
cleaned up at restart.
It's fine if the entire slot is lost if the server crashes too early -
we haven't yet returned sucess...

  Yes. There's lots of ways to screw over your database by using
  pg_resetxlog.
 
 Thats right, trying to think if there could be any thing which
 won't even allow the server to get started due to replication slots
 after pg_resetxlog.  As per my current understanding, I don't think
 there is any such problem.

I'm not aware of any.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Michael Paquier
On Fri, Jun 13, 2014 at 4:43 PM, Pavel Stehule pavel.steh...@gmail.com wrote:



 2014-06-13 9:41 GMT+02:00 Michael Paquier michael.paqu...@gmail.com:

 On Fri, Jun 13, 2014 at 4:20 PM, Quan Zongliang quanzongli...@gmail.com
 wrote:
  By my friend asking, for convenience,
  support to define multi variables in single PL/pgSQL line.
 
  Like this:
 
  CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
  DECLARE
  local_a, local_b, local_c text := 'a1';
  BEGIN
  return local_a || local_b || local_c;
  end;
  $$ LANGUAGE plpgsql;
 I don't recall that this is possible. Have a look at the docs as well:
 http://www.postgresql.org/docs/current/static/plpgsql-declarations.html
 --


 It will be possible with Quan' patch :)
Sorry I misread his email.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] introduce XLogLockBlockRangeForCleanup()

2014-06-13 Thread Abhijit Menon-Sen
nbtxlog.c:btree_xlog_vacuum() contains the following comment:

 * XXX we don't actually need to read the block, we just need to
 * confirm it is unpinned. If we had a special call into the
 * buffer manager we could optimise this so that if the block is
 * not in shared_buffers we confirm it as unpinned.

The attached patch introduces an XLogLockBlockRangeForCleanup() function
that addresses this, as well as a special call into the buffer manager
that makes it possible to lock the buffers without reading them.

The patch is by Simon Riggs, with some minor edits and testing by me.

I'm adding the patch to the CommitFest, and will follow up with some
performance numbers soon.

-- Abhijit
diff --git a/src/backend/access/nbtree/nbtxlog.c b/src/backend/access/nbtree/nbtxlog.c
index 5f9fc49..dc90c02 100644
--- a/src/backend/access/nbtree/nbtxlog.c
+++ b/src/backend/access/nbtree/nbtxlog.c
@@ -501,33 +501,9 @@ btree_xlog_vacuum(XLogRecPtr lsn, XLogRecord *record)
 	 * here during crash recovery.
 	 */
 	if (HotStandbyActiveInReplay())
-	{
-		BlockNumber blkno;
-
-		for (blkno = xlrec-lastBlockVacuumed + 1; blkno  xlrec-block; blkno++)
-		{
-			/*
-			 * We use RBM_NORMAL_NO_LOG mode because it's not an error
-			 * condition to see all-zero pages.  The original btvacuumpage
-			 * scan would have skipped over all-zero pages, noting them in FSM
-			 * but not bothering to initialize them just yet; so we mustn't
-			 * throw an error here.  (We could skip acquiring the cleanup lock
-			 * if PageIsNew, but it's probably not worth the cycles to test.)
-			 *
-			 * XXX we don't actually need to read the block, we just need to
-			 * confirm it is unpinned. If we had a special call into the
-			 * buffer manager we could optimise this so that if the block is
-			 * not in shared_buffers we confirm it as unpinned.
-			 */
-			buffer = XLogReadBufferExtended(xlrec-node, MAIN_FORKNUM, blkno,
-			RBM_NORMAL_NO_LOG);
-			if (BufferIsValid(buffer))
-			{
-LockBufferForCleanup(buffer);
-UnlockReleaseBuffer(buffer);
-			}
-		}
-	}
+		XLogLockBlockRangeForCleanup(xlrec-node, MAIN_FORKNUM,
+	 xlrec-lastBlockVacuumed + 1,
+	 xlrec-block);
 
 	/*
 	 * If we have a full-page image, restore it (using a cleanup lock) and
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index b7829ff..302551f 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -287,10 +287,6 @@ XLogReadBuffer(RelFileNode rnode, BlockNumber blkno, bool init)
  *
  * In RBM_ZERO and RBM_ZERO_ON_ERROR modes, if the page doesn't exist, the
  * relation is extended with all-zeroes pages up to the given block number.
- *
- * In RBM_NORMAL_NO_LOG mode, we return InvalidBuffer if the page doesn't
- * exist, and we don't check for all-zeroes.  Thus, no log entry is made
- * to imply that the page should be dropped or truncated later.
  */
 Buffer
 XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
@@ -331,8 +327,6 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 			log_invalid_page(rnode, forknum, blkno, false);
 			return InvalidBuffer;
 		}
-		if (mode == RBM_NORMAL_NO_LOG)
-			return InvalidBuffer;
 		/* OK to extend the file */
 		/* we do this in recovery only - no rel-extension lock needed */
 		Assert(InRecovery);
@@ -375,6 +369,73 @@ XLogReadBufferExtended(RelFileNode rnode, ForkNumber forknum,
 	return buffer;
 }
 
+/*
+ * XLogBlockRangeForCleanup is used in Hot Standby mode to emulate the
+ * locking effects imposed by VACUUM for nbtrees.
+ */
+void
+XLogLockBlockRangeForCleanup(RelFileNode rnode, ForkNumber forkNum,
+			 BlockNumber startBlkno,
+			 BlockNumber uptoBlkno)
+{
+	BlockNumber blkno;
+	BlockNumber lastblock;
+	BlockNumber	endingBlkno;
+	Buffer		buffer;
+	BufferAccessStrategy bstrategy;
+	SMgrRelation smgr;
+
+	Assert(startBlkno != P_NEW);
+	Assert(uptoBlkno != P_NEW);
+
+	/* Open the relation at smgr level */
+	smgr = smgropen(rnode, InvalidBackendId);
+
+	/*
+	 * Create the target file if it doesn't already exist.  This lets us cope
+	 * if the replay sequence contains writes to a relation that is later
+	 * deleted.  (The original coding of this routine would instead suppress
+	 * the writes, but that seems like it risks losing valuable data if the
+	 * filesystem loses an inode during a crash.  Better to write the data
+	 * until we are actually told to delete the file.)
+	 */
+	smgrcreate(smgr, forkNum, true);
+
+	lastblock = smgrnblocks(smgr, forkNum);
+
+	endingBlkno = uptoBlkno;
+	if (lastblock  endingBlkno)
+		endingBlkno = lastblock;
+
+	/*
+	 * We need to use a BufferAccessStrategy because we do not want to
+	 * increment the buffer's usage_count as we read them. However,
+	 * there is no ring buffer associated with this strategy, since we
+	 * never actually read or write the contents, just lock them.
+	 */
+	bstrategy = GetAccessStrategy(BAS_DISCARD);
+
+	for (blkno 

Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Amit Kapila
On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund and...@2ndquadrant.com
wrote:
 On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
  Okay, but if it crashes before saving the persistency to permanent
  file and there remains a .tmp for this replication slot which it created
  during save of this persistency information, then also xmin will get
  lost, because during startup it will not consider such a slot.

 I can't follow here. If it crashes before it's marked persistent it'll
 get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
 cleaned up at restart.

Yes that's fine, the point I wanted to say here is that the flush
of slot in CreateInitDecodingContext() wouldn't save xmin in all
cases which is I think what it want to achieve.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Few observations in replication slots related code

2014-06-13 Thread Andres Freund
On 2014-06-13 14:21:33 +0530, Amit Kapila wrote:
 On Fri, Jun 13, 2014 at 1:45 PM, Andres Freund and...@2ndquadrant.com
 wrote:
  On 2014-06-13 13:01:59 +0530, Amit Kapila wrote:
   Okay, but if it crashes before saving the persistency to permanent
   file and there remains a .tmp for this replication slot which it created
   during save of this persistency information, then also xmin will get
   lost, because during startup it will not consider such a slot.
 
  I can't follow here. If it crashes before it's marked persistent it'll
  get deleted at startup (c.f. RestoreSlotFromDisk). And .tmp slots are
  cleaned up at restart.
 
 Yes that's fine, the point I wanted to say here is that the flush
 of slot in CreateInitDecodingContext() wouldn't save xmin in all
 cases which is I think what it want to achieve.

It'll savely stored until a xact abort (will be deleted during cleanup),
PANIC crash (deleted during startup) or until the slot is marked as
persistent. Which is what we need.

Maybe this helps:
/*
 * For logical decoding, it's extremely important that we never remove 
any
 * data that's still needed for decoding purposes, even after a crash;
 * otherwise, decoding will produce wrong answers.  Ordinary streaming
 * replication also needs to prevent old row versions from being removed
 * too soon, but the worst consequence we might encounter there is
 * unwanted query cancellations on the standby.  Thus, for logical
 * decoding, this value represents the latest xmin that has actually 
been
 * written to disk, whereas for streaming replication, it's just the 
same
 * as the persistent value (data.xmin).
 */
TransactionId effective_xmin;
TransactionId effective_catalog_xmin;


Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Quantify small changes to predicate evaluation

2014-06-13 Thread Dennis Butterstein
Hello everybody,

I am currently trying to squeeze some performance out of the current
predicate evaluation implementation. In fact I was able to get some positive
results.
I have some more points on my developement plan, but I am facing the problem
that theses changes will only help very little when talking about runtimes.

Therefore I am not able to quantify these changes at the moment as my
overall query runtimes do not seem to be stable across instance restarts.

Consider the following example:
Start postgres, run to warm up buffers
measure query wall clock time: 24.141s
Stop Postgres:

Start postgres, run to warm up buffers
measure query wall clock time: 25.845
Stop Postgres

I expect my current changes to be resposible for about 0.2-0.3s for this
query but because of the huge time differences I am not able to quantify my
changes.

Maybe somebody can tell me about a better approach to quantify my changes or
give me some input on how to get more stable postgres time measurements.

Thank you very much.

Kind regards

Dennis

If needed please find some configuration parameters below
max_connections = 2
superuser_reserved_connections = 1  
ssl = off
shared_buffers = 20GB
max_prepared_transactions = 0
work_mem = 500MB
fsync = off
archive_mode = off
hot_standby = off
effective_cache_size = 20GB
logging_collector = off
debug_print_parse = off
debug_print_rewritten = off
debug_print_plan = off
debug_pretty_print = off
log_checkpoints = off
log_connections = off
log_disconnections = off
log_duration = off
log_hostname = off
track_activities = off
track_counts = off
track_io_timing = off
track_functions = none
log_parser_stats = off
log_planner_stats = off
log_executor_stats = off
log_statement_stats = off
autovacuum = off



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Quantify-small-changes-to-predicate-evaluation-tp5807190.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Quantify small changes to predicate evaluation

2014-06-13 Thread Marti Raudsepp
On Fri, Jun 13, 2014 at 12:46 PM, Dennis Butterstein soullinu...@web.de wrote:
 I expect my current changes to be resposible for about 0.2-0.3s for this
 query but because of the huge time differences I am not able to quantify my
 changes.

 Maybe somebody can tell me about a better approach to quantify my changes or
 give me some input on how to get more stable postgres time measurements.

There can be other reasons, but for read-only benchmarks, much of this
variability seems to come from the operating system's power management
and scheduler.

I had some luck in reducing variance on Linux with some tricks.
Disable CPU frequency scaling:
for i in /sys/devices/system/cpu/cpu*/cpufreq/scaling_governor; do
echo performance  $i; done

Disable the turbo boost feature if your CPU supports it:
echo 1  /sys/devices/system/cpu/intel_pstate/no_turbo

Always launch PostgreSQL and pgbench on a fixed CPU core, using taskset -c3

And exclude all other processes from this core (locking them to cores 0, 1, 2):
ps -A -o pid h |xargs -n1 taskset -cp -a 0-2 /dev/null

Transparent hugepage support may also interfere:
echo never  /sys/kernel/mm/transparent_hugepage/defrag
echo never  /sys/kernel/mm/transparent_hugepage/enabled

I'm sure there are more tricks, but this should get you pretty far.

Regards,
Marti


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_resetxlog to clear backup start/end locations.

2014-06-13 Thread Fujii Masao
On Fri, Jun 13, 2014 at 5:08 PM, Kyotaro HORIGUCHI
horiguchi.kyot...@lab.ntt.co.jp wrote:
 Hello, this is a patch that add the function to clear backup
 location information to pg_resetxlog.

 As per the discussion held before, this function cannot be back
 patched to the older versions than 9.4. And it also slipped over
 9.4 so proposed in this CF.

 This simplly erases the backup location information in pg_control
 written during recovery. This is seen as the output of
 pg_controldata.

  | $ pg_controldata
  | pg_control version number:942
  | Catalog version number:   201406121
  | ...
 !| Backup start location:0/0
 !| Backup end location:  0/0
 !| End-of-backup record required:no
  | ...

 Under some condition, this values sticks having valid values even
 though the WAL record which indicates the end of backup won't
 come. This option could be used to forcibly finish the blocked
 recovery procedure.

The document about pg_resetxlog needs to be updated.

I think that pg_resetxlog should reset backup locations by default
since they are useless (rather harmful) after pg_resetxlog. Thought?

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] rm_desc signature

2014-06-13 Thread Heikki Linnakangas
As part of the WAL-format changing patch I've been working on, I changed 
the signature of the rm_desc function from:


void (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
void (*rm_desc) (StringInfo buf, XLogRecord *record);

The WAL-format patch needed that because it added more functions/macros 
for working with XLogRecords, also used by rm_desc routines, but it 
seems like a sensible change anyway. IMHO it was always a bit strange 
that rm_desc was passed the info field and record payload separately.


So I propose to do that change as a separate commit. Per attached. This 
has no functional changes, it's just refactoring.


Any objections?

- Heikki
commit aa645133fa92dcf4c89749f5121bf4c6631d298c
Author: Heikki Linnakangas heikki.linnakan...@iki.fi
Date:   Fri Jun 13 14:20:39 2014 +0300

Change the signature of rm_desc so that it's passed XLogRecord.

diff --git a/contrib/pg_xlogdump/pg_xlogdump.c b/contrib/pg_xlogdump/pg_xlogdump.c
index 824b8c3..c555786 100644
--- a/contrib/pg_xlogdump/pg_xlogdump.c
+++ b/contrib/pg_xlogdump/pg_xlogdump.c
@@ -351,7 +351,7 @@ XLogDumpDisplayRecord(XLogDumpConfig *config, XLogRecPtr ReadRecPtr, XLogRecord
 		   !!(XLR_BKP_BLOCK(3)  record-xl_info));
 
 	/* the desc routine will printf the description directly to stdout */
-	desc-rm_desc(NULL, record-xl_info, XLogRecGetData(record));
+	desc-rm_desc(NULL, record);
 
 	putchar('\n');
 
diff --git a/contrib/pg_xlogdump/rmgrdesc.h b/contrib/pg_xlogdump/rmgrdesc.h
index edf8257..d964118 100644
--- a/contrib/pg_xlogdump/rmgrdesc.h
+++ b/contrib/pg_xlogdump/rmgrdesc.h
@@ -13,7 +13,7 @@
 typedef struct RmgrDescData
 {
 	const char *rm_name;
-	void		(*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
+	void		(*rm_desc) (StringInfo buf, XLogRecord *record);
 } RmgrDescData;
 
 extern const RmgrDescData RmgrDescTable[];
diff --git a/src/backend/access/rmgrdesc/clogdesc.c b/src/backend/access/rmgrdesc/clogdesc.c
index 25a8522..e82baa8 100644
--- a/src/backend/access/rmgrdesc/clogdesc.c
+++ b/src/backend/access/rmgrdesc/clogdesc.c
@@ -18,9 +18,10 @@
 
 
 void
-clog_desc(StringInfo buf, uint8 xl_info, char *rec)
+clog_desc(StringInfo buf, XLogRecord *record)
 {
-	uint8		info = xl_info  ~XLR_INFO_MASK;
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = record-xl_info  ~XLR_INFO_MASK;
 
 	if (info == CLOG_ZEROPAGE)
 	{
diff --git a/src/backend/access/rmgrdesc/dbasedesc.c b/src/backend/access/rmgrdesc/dbasedesc.c
index 1ea02e7..0230716 100644
--- a/src/backend/access/rmgrdesc/dbasedesc.c
+++ b/src/backend/access/rmgrdesc/dbasedesc.c
@@ -19,9 +19,10 @@
 
 
 void
-dbase_desc(StringInfo buf, uint8 xl_info, char *rec)
+dbase_desc(StringInfo buf, XLogRecord *record)
 {
-	uint8		info = xl_info  ~XLR_INFO_MASK;
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = record-xl_info  ~XLR_INFO_MASK;
 
 	if (info == XLOG_DBASE_CREATE)
 	{
diff --git a/src/backend/access/rmgrdesc/gindesc.c b/src/backend/access/rmgrdesc/gindesc.c
index cd1edff..12d68d7 100644
--- a/src/backend/access/rmgrdesc/gindesc.c
+++ b/src/backend/access/rmgrdesc/gindesc.c
@@ -77,9 +77,10 @@ desc_recompress_leaf(StringInfo buf, ginxlogRecompressDataLeaf *insertData)
 }
 
 void
-gin_desc(StringInfo buf, uint8 xl_info, char *rec)
+gin_desc(StringInfo buf, XLogRecord *record)
 {
-	uint8		info = xl_info  ~XLR_INFO_MASK;
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = record-xl_info  ~XLR_INFO_MASK;
 
 	switch (info)
 	{
@@ -121,7 +122,7 @@ gin_desc(StringInfo buf, uint8 xl_info, char *rec)
 	ginxlogRecompressDataLeaf *insertData =
 	(ginxlogRecompressDataLeaf *) payload;
 
-	if (xl_info  XLR_BKP_BLOCK(0))
+	if (record-xl_info  XLR_BKP_BLOCK(0))
 		appendStringInfo(buf,  (full page image));
 	else
 		desc_recompress_leaf(buf, insertData);
@@ -159,7 +160,7 @@ gin_desc(StringInfo buf, uint8 xl_info, char *rec)
 
 appendStringInfoString(buf, Vacuum data leaf page, );
 desc_node(buf, xlrec-node, xlrec-blkno);
-if (xl_info  XLR_BKP_BLOCK(0))
+if (record-xl_info  XLR_BKP_BLOCK(0))
 	appendStringInfo(buf,  (full page image));
 else
 	desc_recompress_leaf(buf, xlrec-data);
diff --git a/src/backend/access/rmgrdesc/gistdesc.c b/src/backend/access/rmgrdesc/gistdesc.c
index b7f8766..cdac496 100644
--- a/src/backend/access/rmgrdesc/gistdesc.c
+++ b/src/backend/access/rmgrdesc/gistdesc.c
@@ -42,9 +42,10 @@ out_gistxlogPageSplit(StringInfo buf, gistxlogPageSplit *xlrec)
 }
 
 void
-gist_desc(StringInfo buf, uint8 xl_info, char *rec)
+gist_desc(StringInfo buf, XLogRecord *record)
 {
-	uint8		info = xl_info  ~XLR_INFO_MASK;
+	char	   *rec = XLogRecGetData(record);
+	uint8		info = record-xl_info  ~XLR_INFO_MASK;
 
 	switch (info)
 	{
diff --git a/src/backend/access/rmgrdesc/hashdesc.c b/src/backend/access/rmgrdesc/hashdesc.c
index 23be856..86a0376 100644
--- a/src/backend/access/rmgrdesc/hashdesc.c
+++ b/src/backend/access/rmgrdesc/hashdesc.c
@@ -17,6 +17,6 @@
 #include access/hash.h
 
 void

Re: [HACKERS] rm_desc signature

2014-06-13 Thread Andres Freund
On 2014-06-13 14:37:33 +0300, Heikki Linnakangas wrote:
 As part of the WAL-format changing patch I've been working on, I changed the
 signature of the rm_desc function from:
 
 void (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
 void (*rm_desc) (StringInfo buf, XLogRecord *record);
 
 The WAL-format patch needed that because it added more functions/macros for
 working with XLogRecords, also used by rm_desc routines, but it seems like a
 sensible change anyway. IMHO it was always a bit strange that rm_desc was
 passed the info field and record payload separately.

+1. I've found this annoying in the past.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rm_desc signature

2014-06-13 Thread Fujii Masao
On Fri, Jun 13, 2014 at 8:39 PM, Andres Freund and...@2ndquadrant.com wrote:
 On 2014-06-13 14:37:33 +0300, Heikki Linnakangas wrote:
 As part of the WAL-format changing patch I've been working on, I changed the
 signature of the rm_desc function from:

 void (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
 void (*rm_desc) (StringInfo buf, XLogRecord *record);

 The WAL-format patch needed that because it added more functions/macros for
 working with XLogRecords, also used by rm_desc routines, but it seems like a
 sensible change anyway. IMHO it was always a bit strange that rm_desc was
 passed the info field and record payload separately.

+1, too.

-/* #define WAL_DEBUG */
+#define WAL_DEBUG

ISTM you just forgot to exclude this change from the patch.

Regards,

-- 
Fujii Masao


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] rm_desc signature

2014-06-13 Thread Abhijit Menon-Sen
At 2014-06-13 13:39:58 +0200, and...@2ndquadrant.com wrote:

  void (*rm_desc) (StringInfo buf, uint8 xl_info, char *rec);
  void (*rm_desc) (StringInfo buf, XLogRecord *record);
  
  […]
 
 +1. I've found this annoying in the past.

I like it too. I was just moving some code from pg_xlogdump into another
(new) rm_desc-like callback, and passing in the XLogRecord makes much
more sense to me.

-- Abhijit


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Tom Lane
Quan Zongliang quanzongli...@gmail.com writes:
 CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
 DECLARE
 local_a, local_b, local_c text := 'a1';
 BEGIN
 return local_a || local_b || local_c;
 end;
 $$ LANGUAGE plpgsql;

This does not seem like a terribly good idea from here.  The main problem
with the syntax is that it's very unclear whether the initializer (if any)
applies to all the variables or just one.  C users will probably think
the latter but your example seems to suggest that you think the former.
I doubt that this adds so much usefulness that it's worth adding confusion
too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit of logout

2014-06-13 Thread Fujii Masao
On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Hi,

 Some users enable log_disconnections in postgresql.conf to audit all logouts.
 But since log_disconnections is defined with PGC_BACKEND, it can be changed
 at connection start. This means that any client (even nonsuperuser) can freely
 disable log_disconnections not to log his or her logout even when the
 system admin
 enables it in postgresql.conf. Isn't this problematic for audit?

That's harmful for audit purpose. I think that we should make
log_disconnections PGC_SUSET rather than PGC_BACKEND in order
to forbid non-superusers from changing its setting. Attached
patch does this.

Also defining log_disconnections with PGC_BACKEND itself seems strange.
Since it's used only at connection termination, there seems to be
no need to fix its setting value at connection startup. No? OTOH,
for example, log_connections and post_auth_delay are defined with
PGC_BACKEND and their settings can be changed only at connection startup.
This seems intuitive because they are used only at connection
startup and it's useless to change their settings after that. But
the situation of log_disconnections seems different from them.
Am I missing something?

One concern is; the patch may break the existing application if it
relies on the current behavior of log_disconnections. But I'm
wondering if such applications really exist.

Thought?

Regards,

-- 
Fujii Masao
From d3e6db1516a8cbb557e38a56b26c34ed7e51d9e1 Mon Sep 17 00:00:00 2001
From: MasaoFujii masao.fu...@gmail.com
Date: Fri, 13 Jun 2014 22:09:39 +0900
Subject: [PATCH] Make log_disconnections PGC_SUSET rather than PGC_BACKEND.

So far even non-superusers could disable log_disconnections in order to
prevent their session logout from being logged because the parameter was
defined with PGC_BACKEND. This was harmful in the systems which need to
audit all session logouts by using log_disconnections. For this problem,
this commit changes the GUC context of log_disconnections to PGC_SUSET
and forbids non-superuser from changing its setting.
---
 doc/src/sgml/config.sgml |2 +-
 src/backend/utils/misc/guc.c |2 +-
 2 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 697cf99..184d864 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -4234,7 +4234,7 @@ local0.*/var/log/postgresql
 varnamelog_connections/varname but at session termination,
 and includes the duration of the session.  This is off by
 default.
-This parameter cannot be changed after session start.
+Only superusers can change this setting.
/para
   /listitem
  /varlistentry
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 1d094f0..7c84c9f 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -922,7 +922,7 @@ static struct config_bool ConfigureNamesBool[] =
 		NULL, NULL, NULL
 	},
 	{
-		{log_disconnections, PGC_BACKEND, LOGGING_WHAT,
+		{log_disconnections, PGC_SUSET, LOGGING_WHAT,
 			gettext_noop(Logs end of a session, including duration.),
 			NULL
 		},
-- 
1.7.1


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Pavel Stehule
We can disallow custom initialization when when variables are declared as
list.

Quan' example is 100% valid in SQL/PSM and what I read about ADA then in
ADA too.

Regards

Pavel


2014-06-13 16:04 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 Quan Zongliang quanzongli...@gmail.com writes:
  CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
  DECLARE
  local_a, local_b, local_c text := 'a1';
  BEGIN
  return local_a || local_b || local_c;
  end;
  $$ LANGUAGE plpgsql;

 This does not seem like a terribly good idea from here.  The main problem
 with the syntax is that it's very unclear whether the initializer (if any)
 applies to all the variables or just one.  C users will probably think
 the latter but your example seems to suggest that you think the former.
 I doubt that this adds so much usefulness that it's worth adding confusion
 too.

 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Andres Freund
On 2014-06-13 16:12:36 +0200, Pavel Stehule wrote:
 Quan' example is 100% valid in SQL/PSM and what I read about ADA then in
 ADA too.

So what? plpgsql is neither language and this doesn't seem to be the way
to make them actually closer (which I doubt would be a good idea in the
first place).

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Audit of logout

2014-06-13 Thread Tom Lane
Fujii Masao masao.fu...@gmail.com writes:
 On Thu, Jun 12, 2014 at 8:51 PM, Fujii Masao masao.fu...@gmail.com wrote:
 Some users enable log_disconnections in postgresql.conf to audit all logouts.
 But since log_disconnections is defined with PGC_BACKEND, it can be changed
 at connection start. This means that any client (even nonsuperuser) can 
 freely
 disable log_disconnections not to log his or her logout even when the
 system admin
 enables it in postgresql.conf. Isn't this problematic for audit?

 That's harmful for audit purpose. I think that we should make
 log_disconnections PGC_SUSET rather than PGC_BACKEND in order
 to forbid non-superusers from changing its setting. Attached
 patch does this.

TBH, this complaint seems entirely artificial.  If somebody needs to audit
disconnections, and they see a connection record with no disconnection
record but the session's no longer there, they certainly know that a
disconnection occurred.  And if there wasn't a server crash to explain it,
they go slap the wrist of whoever violated corporate policy by turning off
log_disconnections.  Why do we need system-level enforcement of this?

Moreover, your proposed fix breaks the entire point of the PGC_BACKEND
setting, which was to try to prevent disconnections from being logged
or not logged when the corresponding connection was not logged or logged
respectively.  If an auditor wants the system to enforce that there are
matched pairs of those records, he's not exactly going to be satisfied by
being told that only superusers can cause them to not match.

I wonder whether we should just get rid of log_disconnections as a
separate variable, instead logging disconnections when log_connections
is set.

Another answer is to make both variables PGC_SIGHUP, on the grounds
that it doesn't make much sense for them not to be applied system-wide;
except that I think there was some idea that logging might be enabled
per-user or per-database using ALTER ROLE/DATABASE.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Pavel Stehule
2014-06-13 16:17 GMT+02:00 Andres Freund and...@2ndquadrant.com:

 On 2014-06-13 16:12:36 +0200, Pavel Stehule wrote:
  Quan' example is 100% valid in SQL/PSM and what I read about ADA then in
  ADA too.

 So what? plpgsql is neither language and this doesn't seem to be the way
 to make them actually closer (which I doubt would be a good idea in the
 first place).


PL/pgSQL is based on PL/SQL, that is based on ADA. Next PL/SQL takes some
statements from SQL/PSM .. GET DIAGNOSTICS statement, and we implemented
these statements in PL/pgSQL too.

Some statements in PL/pgSQL are our design - RAISE is good example. So
PL/pgSQL is mix PL/SQL, SQL/PSM, and some proprietary

Regards

Pavel







 Greetings,

 Andres Freund

 --
  Andres Freund http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training  Services



Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Tom Lane
Andres Freund and...@2ndquadrant.com writes:
 On 2014-06-13 16:12:36 +0200, Pavel Stehule wrote:
 Quan' example is 100% valid in SQL/PSM and what I read about ADA then in
 ADA too.

 So what? plpgsql is neither language and this doesn't seem to be the way
 to make them actually closer (which I doubt would be a good idea in the
 first place).

What plpgsql actually tries to model is Oracle's PL/SQL, in which this
syntax is specifically *not* allowed (at least according to the 2008-or-so
manual I have handy).

The SQL/PSM reference is kind of interesting, since as far as I can tell
the standard does allow this syntax but it fails to explain what the
initialization behavior is.  The actual text of SQL:2011 14.4 SQL
variable declaration general rule 1 is:

  If SQL variable declaration contains default clause DC, then let DV be
  the default option contained in DC. Otherwise let DV be null
  specification. Let SV be the variable defined by the SQL variable
  declaration. The following SQL-statement is effectively executed:

SET SV = DV

It says the variable, not the variables, and definitely not for each
variable.  Are we supposed to read this as only one variable getting
initialized?  Even assuming that that's an obvious thinko and they meant
to say for each variable SV, the following is executed, it's unclear
whether DV is to be evaluated once, or once per variable.  If it's a
volatile expression then that matters.

At the very least I think we should stay away from this syntax until
the SQL committee understand it better than they evidently do today.
I don't want to implement it and then get caught by a future
clarification that resolves the issue differently than we did.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread David G Johnston
Tom Lane-2 wrote
 Andres Freund lt;

 andres@

 gt; writes:
 On 2014-06-13 16:12:36 +0200, Pavel Stehule wrote:
 Quan' example is 100% valid in SQL/PSM and what I read about ADA then in
 ADA too.
 
 So what? plpgsql is neither language and this doesn't seem to be the way
 to make them actually closer (which I doubt would be a good idea in the
 first place).
 
 What plpgsql actually tries to model is Oracle's PL/SQL, in which this
 syntax is specifically *not* allowed (at least according to the 2008-or-so
 manual I have handy).
 
 At the very least I think we should stay away from this syntax until
 the SQL committee understand it better than they evidently do today.
 I don't want to implement it and then get caught by a future
 clarification that resolves the issue differently than we did.

Haven't read the patch so, conceptually...

Its not quite as unclear as you make it out to be:

local_a, local_b, local_c text := 'a1'; 

The text type declaration MUST apply to all three variables, so extending
that to include the default assignment would be the internally consistent
decision.

I'm not sure the following would be useful but:

var_1, var_2, var_3 integer := generate_series(1,3)

If the expression results in either a 3x1 or a 1x3 (in the three var
example) we could do an expansion.  If it results in a 1x1 that value would
be copied without re-executing the function.

Though I suppose someone might want to do the following:

random_1, random_2, random_3 float := random(1234);

The decision to copy, not re-execute, is safer to use as the behavior and
force explicitness in the re-execute situation.


Until then suggest that the friend do:

DECLARE 
local_a text := 'a1';
local_b text := local_a;
local_c text := local_a;
BEGIN 



--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/PL-pgSQL-support-to-define-multi-variables-once-tp5807168p5807215.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Tom Lane
David G Johnston david.g.johns...@gmail.com writes:
 Tom Lane-2 wrote
 At the very least I think we should stay away from this syntax until
 the SQL committee understand it better than they evidently do today.
 I don't want to implement it and then get caught by a future
 clarification that resolves the issue differently than we did.

 Its not quite as unclear as you make it out to be:

Yes it is.

 Though I suppose someone might want to do the following:
 random_1, random_2, random_3 float := random(1234);
 The decision to copy, not re-execute, is safer to use as the behavior and
 force explicitness in the re-execute situation.

I would agree with that argument, if we both sat on the SQL committee and
were discussing how to resolve the ambiguity.  We don't, and we have no
good way to predict what they'll do (when and if they do anything :-().

The problem I've got is that a literal reading of the spec seems to
suggest multiple evaluation, since DV appears to refer to the syntactic
construct representing the initializer, not its evaluated value.  It's
hard to argue that the spec isn't telling us to do this:

 SET random_1 = random(1234);
 SET random_2 = random(1234);
 SET random_3 = random(1234);

That's not the reading I want, and it's not the reading you want either,
but there is nothing in the existing text that justifies single
evaluation.  So I think we'd be well advised to sit on our hands until
the committee clarifies that.  It's not like there is some urgent reason
to have this feature.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] make check For Extensions

2014-06-13 Thread David E. Wheeler
On Jun 12, 2014, at 11:40 PM, Fabien COELHO coe...@cri.ensmp.fr wrote:

 I would suggest to add that to https://wiki.postgresql.org/wiki/Todo.
 
 I may look into it when I have time, over the summer. The key point is that 
 there is no need for a temporary installation, but only of a temporary 
 cluster, and to trick this cluster into loading the uninstalled extension, 
 maybe by playing with dynamic_library_path in the temporary cluster.

The temporary cluster will be in a temporarty `initdb`ed directory, no? If so, 
you can just install the extension there.

Best,

David



signature.asc
Description: Message signed with OpenPGP using GPGMail


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Alvaro Herrera
Quan Zongliang wrote:
 Hi all,
 
 Please find the attachment.
 
 By my friend asking, for convenience,
 support to define multi variables in single PL/pgSQL line.
 
 Like this:
 
 CREATE OR REPLACE FUNCTION try_mutlivardef() RETURNS text AS $$
 DECLARE
 local_a, local_b, local_c text := 'a1';
 BEGIN
 return local_a || local_b || local_c;
 end;
 $$ LANGUAGE plpgsql;

This seems pretty odd.  I think if you were to state it like this:

create or replace function multivar() returns text language plpgsql as $$
declare
a, b, c text;
begin
a := b := c := 'a1--';
return a || b || c;
end;
$$;

it would make more sense to me.  There are two changes to current
behavior in that snippet; one is the ability to declare several
variables in one go (right now you need one declaration for each), and
the other is that assignment is an expression that evaluates to the
assigned value, so you can assign that to another variable.

Personally, in C I don't think I do this:
int a, b;
but always int a; int b; (two lines of course).  This is matter of
personal taste, but it seems clearer to me.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread Pavel Stehule
2014-06-13 17:32 GMT+02:00 Tom Lane t...@sss.pgh.pa.us:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  At the very least I think we should stay away from this syntax until
  the SQL committee understand it better than they evidently do today.
  I don't want to implement it and then get caught by a future
  clarification that resolves the issue differently than we did.

  Its not quite as unclear as you make it out to be:

 Yes it is.

  Though I suppose someone might want to do the following:
  random_1, random_2, random_3 float := random(1234);
  The decision to copy, not re-execute, is safer to use as the behavior and
  force explicitness in the re-execute situation.

 I would agree with that argument, if we both sat on the SQL committee and
 were discussing how to resolve the ambiguity.  We don't, and we have no
 good way to predict what they'll do (when and if they do anything :-().

 The problem I've got is that a literal reading of the spec seems to
 suggest multiple evaluation, since DV appears to refer to the syntactic
 construct representing the initializer, not its evaluated value.  It's
 hard to argue that the spec isn't telling us to do this:

  SET random_1 = random(1234);
  SET random_2 = random(1234);
  SET random_3 = random(1234);

 That's not the reading I want, and it's not the reading you want either,
 but there is nothing in the existing text that justifies single
 evaluation.  So I think we'd be well advised to sit on our hands until
 the committee clarifies that.  It's not like there is some urgent reason
 to have this feature.


I don't think so this feature is 100% necessary, but a few users requested
some more compressed form of variable declarations.

we can allow multi variable declaration without initial value specification

so a,b,c text can be valid, and a,b,c text := 'hello' not

It is just step to users who knows this feature from others languages.

Regards

Pavel



 regards, tom lane


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] PL/pgSQL support to define multi variables once

2014-06-13 Thread David Johnston
On Fri, Jun 13, 2014 at 11:32 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 David G Johnston david.g.johns...@gmail.com writes:
  Tom Lane-2 wrote
  At the very least I think we should stay away from this syntax until
  the SQL committee understand it better than they evidently do today.
  I don't want to implement it and then get caught by a future
  clarification that resolves the issue differently than we did.

  Its not quite as unclear as you make it out to be:

 Yes it is.


​Not withstanding the decision making of the SQL committee I was rejecting
as inconsistent:

SET random_1 = 0;
SET random_2 = 0;
SET random_3 = random(1234); ​

The ambiguity regarding re-execute or copy still remains.


 That's not the reading I want, and it's not the reading you want either,
 but there is nothing in the existing text that justifies single
 evaluation.  So I think we'd be well advised to sit on our hands until
 the committee clarifies that.  It's not like there is some urgent reason
 to have this feature.



Agreed.


I don't suppose there is any support or prohibition on the :

one,two,three integer := generate_series(1,3)​;

interpretation...not that I can actually come up with a good use case that
wouldn't be better implemented via a loop in the main body.

David J.


Re: [HACKERS] Audit of logout

2014-06-13 Thread David G Johnston
Tom Lane-2 wrote
 Another answer is to make both variables PGC_SIGHUP, on the grounds
 that it doesn't make much sense for them not to be applied system-wide;
 except that I think there was some idea that logging might be enabled
 per-user or per-database using ALTER ROLE/DATABASE.

From a trouble-shooting standpoint if I know that client software in
question is focused on particular users/databases being able to only enable
connection logging for those would make sense.  Whether that is a true
production concern is another matter but the possibility does exist.

I personally do not get how a logoff is a risk auditing event.  Logons and
actual queries I can understand.

For the same reason keeping them separate has merit since for imaginable
circumstances the logoffs are noise but the logons are important - and
forcing them to be on/off in tandem disallows the option to remove the noise
from the logs.

David J.






--
View this message in context: 
http://postgresql.1045698.n5.nabble.com/Audit-of-logout-tp5806985p5807224.html
Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-13 Thread Fabrízio de Royes Mello
On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hi,

  I need to port pgsql onto a controller which doesn't have a framework of
  creating multiple users for administrative purposes. The entire
 controller
  is managed by a single root user and that is the reason I am trying to
  change the pgsql initdb behavior. Do you think of any other better
  alternative?

 The reason you didn't see initdb completed is that it execs
 postgres on the way.

 As you know, it is strongly discourged on ordinary environment,
 but that framework sounds to be a single-user environment like
 what MS-DOS was, where any security risk comes from the
 characterisc is acceptable.

 I could see initdb and postgres operating as root for the moment
 (which means any possible side-effect is not checked) by making
 changes at four point in the whole postgresql source
 tree. Perhaps only two of them are needed for your wish.

 postgresql $ find . -type f -print | xargs grep -nH 'geteuid() == 0'
 ./src/backend/main/main.c:377:  if (geteuid() == 0)
 ./src/bin/pg_ctl/pg_ctl.c:2121: if (geteuid() == 0)
 ./src/bin/initdb/initdb.c:778:  if (geteuid() == 0)
  /* 0 is root's uid */
 ./src/bin/pg_resetxlog/pg_resetxlog.c:250:  if (geteuid() == 0)

 Try replacing these conditions with (0  geteuid() == 0) and
 you would see it run as root.


Maybe a compile option like '--enable-run-as-root' could be added to allow
it without the need of change the source code.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
 Timbira: http://www.timbira.com.br
 Blog sobre TI: http://fabriziomello.blogspot.com
 Perfil Linkedin: http://br.linkedin.com/in/fabriziomello
 Twitter: http://twitter.com/fabriziomello


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-13 Thread Tom Lane
=?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
 On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI 
 horiguchi.kyot...@lab.ntt.co.jp wrote:
 Try replacing these conditions with (0  geteuid() == 0) and
 you would see it run as root.

 Maybe a compile option like '--enable-run-as-root' could be added to allow
 it without the need of change the source code.

If we wanted it to be easy to run PG as root, those tests wouldn't be
there in the first place.  We don't want that, so there will never be
anything to override that policy as easily as setting a configure option.

In the case at hand, the OP wants to run on some weird system that
apparently doesn't have multiple users at all.  It's a pretty safe bet
that hacking out those geteuid tests is just the tip of the iceberg of
what he's going to have to change to make it work, because it probably
deviates from typical-Unix-environment in a lot of other ways too.
So I can't get too concerned about how easy this one aspect is for him.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unable to attach client process to postgresql server using gdb

2014-06-13 Thread Robert Haas
On Fri, Jun 13, 2014 at 1:42 AM, Rajmohan C csrajmo...@gmail.com wrote:
I am working with PostgreSQL 9.3.4 source using Eclipse IDE in ubuntu
 14.04. I am facing a problem in attaching client process to postgresql
 server using gdb to debug. When I start the postmaster then I connect to it
 from client on a terminal. It works fine. Queries get responses. When I run
 debug config from eclipse then select postgres process id from list I get
 error saying

 Can't find a source file at
 /build/buildd/eglibc-2.19/socket/../sysdeps/unix/sysv/linux/x86_64/recv.c
 Locate the file or edit the source lookup path to include its location.

That is likely just a warning, and you've actually still attached to
the target process.

  After this when I send any query from client, it just stucks. No response
 comes. After attaching gdb to client process, client does not get any
 response from postgres server. One thing to note is that I was able to debug
 properly till yesterday. But now it is not working. I tried reinstalling but
 did not help. How could I fix this issue? Kindly help.

...which explains why it doesn't respond any more.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Clarification of FDW API Documentation

2014-06-13 Thread Jason Petersen
I've been deep in the FDW APIs lately and have come up with a couple of
questions about the [user-facing documentation][1].

# Requirement that DELETE use junk columns

The bit I'm confused by is the parenthetical in this bit at the end of the
section on `AddForeignUpdateTargets`:

 If the AddForeignUpdateTargets pointer is set to NULL, no extra target
 expressions are added. (This will make it impossible to implement DELETE
 operations, though UPDATE may still be feasible if the FDW relies on an
 unchanging primary key to identify rows.)

Later on, the section on `ExecForeignDelete` says (emphasis mine):

 The junk column(s) **must** be used to identify the tuple to be deleted.

Why is this a requirement? At the moment, `postgres_fdw` asks remote machines
for the `ctid` of tuples during a scan so it can use that `ctid` to create a
targeted `DELETE` during `ExecForeignDelete`, but I think an alternative could
avoid the use of the `ctid` junk column altogether...

Imagine if `BeginForeignScan` set up a remote cursor and `IterateForeignScan`
just fetched _one tuple at a time_ (unlike the current behavior where they are
fetched in batches). The tuple would be passed to `ExecForeignDelete` (as is
required), but the remote cursor would remain pointing at that tuple. Couldn't
`ExecForeignDelete` just call `DELETE FROM table WHERE CURRENT OF cursor` to
then delete that tuple?

Even if there is no guarantee that `IterateForeignScan` is called exactly once
before each `ExecForeignDelete` call (which would remove the ability to have
them cooperate using this single cursor), one could easily devise other storage
backends that don't need junk columns to perform `DELETE` operations.

So why the strong language around this functionality?

# Examples of `NULL` return after modification

Each of the `ExecForeign`- functions needs to return a tuple representing the
row inserted, deleted, or modified. But each function's documentation contains
an aside similar to this:

 The return value is either a slot containing the data that was actually
 inserted (this might differ from the data supplied, for example as a result
 of trigger actions), or NULL if no row was actually inserted (again,
 typically as a result of triggers).

Is this even accurate in PostgreSQL 9.3? Can triggers fire against foreign
tables? If so, can someone provide an example where the foreign scan has found
a tuple, passed it to `ExecForeignDelete`, and then no delete takes place (i.e.
`ExecForeignDelete` returns `NULL`)? As far as I can reason, if the foreign
scan has found a tuple, the update and delete actions need to do _something_
with it. Maybe I'm missing something.

--Jason

[1]: http://www.postgresql.org/docs/9.3/static/fdw-callbacks.html



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Clarification of FDW API Documentation

2014-06-13 Thread Tom Lane
Jason Petersen ja...@citusdata.com writes:
 Imagine if `BeginForeignScan` set up a remote cursor and `IterateForeignScan`
 just fetched _one tuple at a time_ (unlike the current behavior where they are
 fetched in batches). The tuple would be passed to `ExecForeignDelete` (as is
 required), but the remote cursor would remain pointing at that tuple. Couldn't
 `ExecForeignDelete` just call `DELETE FROM table WHERE CURRENT OF cursor` to
 then delete that tuple?

No.  This is not guaranteed (or even likely) to work in join cases: the
tuple to be updated/deleted might no longer be the current one of the scan.
You *must* arrange for the scan to return enough information to uniquely
identify the tuple later, and that generally means adding some resjunk
columns.

 Even if there is no guarantee that `IterateForeignScan` is called exactly once
 before each `ExecForeignDelete` call (which would remove the ability to have
 them cooperate using this single cursor), one could easily devise other 
 storage
 backends that don't need junk columns to perform `DELETE` operations.

Such as?  I could imagine having an optimization that works like you
suggest for simple scan cases, but it's not there now, and it could not
be the only mode.

 Each of the `ExecForeign`- functions needs to return a tuple representing the
 row inserted, deleted, or modified. But each function's documentation contains
 an aside similar to this:

 The return value is either a slot containing the data that was actually
 inserted (this might differ from the data supplied, for example as a result
 of trigger actions), or NULL if no row was actually inserted (again,
 typically as a result of triggers).

 Is this even accurate in PostgreSQL 9.3? Can triggers fire against foreign
 tables?

Any local trigger execution would be handled by the core executor.
What this is on about is that the remote database might have modified or
suppressed the operation as a result of triggers on the remote table;
and we'd like the FDW to return data that reflects what actually got
inserted/updated/deleted remotely.  (I guess a particular FDW might have a
policy of not reporting such things accurately, but the point of the text
is that if you want to tell the truth you can do so.)

Perhaps it would help if these paragraphs said remote trigger not
just trigger.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-13 Thread Shreesha
@Fabrizio de Royes Mello, Even upon making changes as per your suggestion,
I could see that initdb is failing for the same reason:
***
/mswitch/pgsql/bin # ./initdb -D ../data/
The files belonging to this database system will be owned by user root.
This user must also own the server process.

The database cluster will be initialized with locale C.
The default database encoding has accordingly been set to SQL_ASCII.
The default text search configuration will be set to english.

Data page checksums are disabled.

fixing permissions on existing directory ../data ... ok
creating subdirectories ... ok
selecting default max_connections ... 10
selecting default shared_buffers ... 400kB
creating configuration files ... ok
creating template1 database in ../data/base/1 ... root execution of the
PostgreSQL server is not permitted.
The server must be started under an unprivileged user ID to prevent
possible system security compromise.  See the documentation for
more information on how to properly start the server.
child process exited with exit code 1
initdb: removing contents of data directory ../data
***

Please let me know if I need to make changes in any other place. Appreciate
your help!



On Fri, Jun 13, 2014 at 9:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI 
  horiguchi.kyot...@lab.ntt.co.jp wrote:
  Try replacing these conditions with (0  geteuid() == 0) and
  you would see it run as root.

  Maybe a compile option like '--enable-run-as-root' could be added to
 allow
  it without the need of change the source code.

 If we wanted it to be easy to run PG as root, those tests wouldn't be
 there in the first place.  We don't want that, so there will never be
 anything to override that policy as easily as setting a configure option.

 In the case at hand, the OP wants to run on some weird system that
 apparently doesn't have multiple users at all.  It's a pretty safe bet
 that hacking out those geteuid tests is just the tip of the iceberg of
 what he's going to have to change to make it work, because it probably
 deviates from typical-Unix-environment in a lot of other ways too.
 So I can't get too concerned about how easy this one aspect is for him.

 regards, tom lane




-- 
~Shreesha.


[HACKERS] ALTER TABLESPACE MOVE command tag tweak

2014-06-13 Thread Alvaro Herrera
The ALTER TABLESPACE MOVE command affects tables, not tablespaces; and
as such, I think event triggers should support that command.  I'm not
proposing to change event triggers at this stage, but since IMO we will
want to do that in 9.5, we need it to have a different command tag than
plain ALTER TABLESPACE.  This is so that check_ddl_tag() can compare
the tag with ALTER TABLESPACE and say unsupported, and ALTER
TABLESPACE MOVE and say supported.  Both are currently spelled the
same, which will be a problem.

Therefore I propose the attached patch for 9.4.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/src/backend/tcop/utility.c b/src/backend/tcop/utility.c
index 3423898..5553221 100644
--- a/src/backend/tcop/utility.c
+++ b/src/backend/tcop/utility.c
@@ -1806,7 +1805,7 @@ CreateCommandTag(Node *parsetree)
 			break;
 
 		case T_AlterTableSpaceMoveStmt:
-			tag = ALTER TABLESPACE;
+			tag = ALTER TABLESPACE MOVE;
 			break;
 
 		case T_CreateExtensionStmt:

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] How to change the pgsql source code and build it??

2014-06-13 Thread Shreesha
Like Tom said, just setting the the configure option doesn't let the root
user in. Looks like lot more changes are required.

Tom, according to you, what do you think would be my best bet to allow the
root user initialize and start the database server?

Thanks,
Shreesha.


On Fri, Jun 13, 2014 at 9:43 AM, Tom Lane t...@sss.pgh.pa.us wrote:

 =?UTF-8?Q?Fabr=C3=ADzio_de_Royes_Mello?= fabriziome...@gmail.com writes:
  On Thu, Jun 12, 2014 at 10:59 PM, Kyotaro HORIGUCHI 
  horiguchi.kyot...@lab.ntt.co.jp wrote:
  Try replacing these conditions with (0  geteuid() == 0) and
  you would see it run as root.

  Maybe a compile option like '--enable-run-as-root' could be added to
 allow
  it without the need of change the source code.

 If we wanted it to be easy to run PG as root, those tests wouldn't be
 there in the first place.  We don't want that, so there will never be
 anything to override that policy as easily as setting a configure option.

 In the case at hand, the OP wants to run on some weird system that
 apparently doesn't have multiple users at all.  It's a pretty safe bet
 that hacking out those geteuid tests is just the tip of the iceberg of
 what he's going to have to change to make it work, because it probably
 deviates from typical-Unix-environment in a lot of other ways too.
 So I can't get too concerned about how easy this one aspect is for him.

 regards, tom lane




-- 
~Shreesha.


Re: [HACKERS] ALTER TABLESPACE MOVE command tag tweak

2014-06-13 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 The ALTER TABLESPACE MOVE command affects tables, not tablespaces; and
 as such, I think event triggers should support that command.  I'm not
 proposing to change event triggers at this stage, but since IMO we will
 want to do that in 9.5, we need it to have a different command tag than
 plain ALTER TABLESPACE.  This is so that check_ddl_tag() can compare
 the tag with ALTER TABLESPACE and say unsupported, and ALTER
 TABLESPACE MOVE and say supported.  Both are currently spelled the
 same, which will be a problem.

 Therefore I propose the attached patch for 9.4.

Hm.  While the specific change here seems harmless enough, the argument
for it seems to me to indicate that the very design is broken.  Do you
expect event triggers to distinguish all the different subflavors of
ALTER TABLE, for example, on the basis of the command tag?  Backwards
compatibility is going to prevent us from refining the tag strings
that much.  So ISTM this concern means what we'd better be thinking
about is some other way for event triggers to find out what they're
dealing with.

A different thought is that what event triggers would probably like
is for this command to be reported to them as a series of
ALTER TABLE SET TABLESPACE events, one per moved table.  If it's
done like that then the tag for the outer ALTER TABLESPACE may not
be so important.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] replicating DROP commands across servers

2014-06-13 Thread Alvaro Herrera
Here's a patch implementing the proposed idea.  This is used in the
Bidirectional Replication stuff by Simon/Andres; it works well.


One thing of note is that I added output flags for normal and
original, which mostly come from performDeletion flags.  This let one
select only such objects when trying to replicate a drop; otherwise,
we'd add RI triggers to the set to drop remotely, which doesn't work
because their names have OIDs embedded, and in the remote system those
are different.

One curious thing is that I had to add a hack that if an object has a
reverse flag in the ObjectAddresses array, also set the normal
output flag.  (Another possibility would have been to add a reverse
output flag, but there doesn't seem to be a use for that --- it seems to
expose internals unnecessarily.)

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 17538,17543  FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
--- 17538,17556 
  entryObject sub-id (e.g. attribute number for columns)/entry
 /row
 row
+ entryliteraloriginal/literal/entry
+ entrytypebool/type/entry
+ entryFlag used to identify the root object of the deletion/entry
+/row
+row
+ entry/literalnormal/literal/entry
+ entrytypebool/type/entry
+ entry
+  Flag indicating that there's a normal dependency relationship
+  in the dependency graph leading to this object
+ /entry
+/row
+row
  entryliteralobject_type/literal/entry
  entrytypetext/type/entry
  entryType of the object/entry
***
*** 17567,17572  FOR EACH ROW EXECUTE PROCEDURE suppress_redundant_updates_trigger();
--- 17580,17601 
   identifier present in the identity is quoted if necessary.
  /entry
 /row
+row
+ entryliteraladdress_names/literal/entry
+ entrytypetext[]/type/entry
+ entry
+  An array that, together with literaladdress_args/literal,
+  can be used by the C-language function getObjectAddress() to
+  recreate the object address in a remote server containing a similar object.
+ entry
+/row
+row
+ entryliteraladdress_args/literal/entry
+ entrytypetext[]/type/entry
+ entry
+  See literaladdress_names/literal above.
+ /entry
+/row
/tbody
   /tgroup
  /informaltable
*** a/src/backend/catalog/dependency.c
--- b/src/backend/catalog/dependency.c
***
*** 203,218  deleteObjectsInList(ObjectAddresses *targetObjects, Relation *depRel,
  	/*
  	 * Keep track of objects for event triggers, if necessary.
  	 */
! 	if (trackDroppedObjectsNeeded())
  	{
  		for (i = 0; i  targetObjects-numrefs; i++)
  		{
! 			ObjectAddress *thisobj = targetObjects-refs + i;
! 
! 			if ((!(flags  PERFORM_DELETION_INTERNAL)) 
! EventTriggerSupportsObjectClass(getObjectClass(thisobj)))
  			{
! EventTriggerSQLDropAddObject(thisobj);
  			}
  		}
  	}
--- 203,227 
  	/*
  	 * Keep track of objects for event triggers, if necessary.
  	 */
! 	if (trackDroppedObjectsNeeded()  !(flags  PERFORM_DELETION_INTERNAL))
  	{
  		for (i = 0; i  targetObjects-numrefs; i++)
  		{
! 			const ObjectAddress *thisobj = targetObjects-refs[i];
! 			const ObjectAddressExtra *extra = targetObjects-extras[i];
! 			bool	original = false;
! 			bool	normal = false;
! 
! 			if (extra-flags  DEPFLAG_ORIGINAL)
! original = true;
! 			if (extra-flags  DEPFLAG_NORMAL)
! normal = true;
! 			if (extra-flags  DEPFLAG_REVERSE)
! normal = true;
! 
! 			if (EventTriggerSupportsObjectClass(getObjectClass(thisobj)))
  			{
! EventTriggerSQLDropAddObject(thisobj, original, normal);
  			}
  		}
  	}
*** a/src/backend/catalog/objectaddress.c
--- b/src/backend/catalog/objectaddress.c
***
*** 417,422  static const ObjectPropertyType ObjectProperty[] =
--- 417,513 
  	}
  };
  
+ /*
+  * This struct maps the object types as returned by getObjectTypeDescription
+  * into ObjType enum values.  Note that some enum values can be obtained by
+  * different names, and that some string object types do not have corresponding
+  * values in the enum.  The user of this map must be careful to test for
+  * invalid values being returned.
+  *
+  * This follows the order of getObjectTypeDescription.
+  */
+ static const struct object_type_map
+ {
+ 	const char *tm_name;
+ 	ObjectType	tm_type;
+ }
+ ObjectTypeMap[] =
+ {
+ 	/* OCLASS_CLASS */
+ 	{ table, OBJECT_TABLE },
+ 	{ index, OBJECT_INDEX },
+ 	{ sequence, OBJECT_SEQUENCE },
+ 	{ toast table, -1 },		/* unmapped */
+ 	{ view, OBJECT_VIEW },
+ 	{ materialized view, OBJECT_MATVIEW },
+ 	{ composite type, OBJECT_COMPOSITE },
+ 	{ foreign table, OBJECT_FOREIGN_TABLE 

Re: [HACKERS] ALTER TABLESPACE MOVE command tag tweak

2014-06-13 Thread Alvaro Herrera
Tom Lane wrote:
 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  The ALTER TABLESPACE MOVE command affects tables, not tablespaces; and
  as such, I think event triggers should support that command.  I'm not
  proposing to change event triggers at this stage, but since IMO we will
  want to do that in 9.5, we need it to have a different command tag than
  plain ALTER TABLESPACE.  This is so that check_ddl_tag() can compare
  the tag with ALTER TABLESPACE and say unsupported, and ALTER
  TABLESPACE MOVE and say supported.  Both are currently spelled the
  same, which will be a problem.
 
  Therefore I propose the attached patch for 9.4.
 
 Hm.  While the specific change here seems harmless enough, the argument
 for it seems to me to indicate that the very design is broken.  Do you
 expect event triggers to distinguish all the different subflavors of
 ALTER TABLE, for example, on the basis of the command tag?  Backwards
 compatibility is going to prevent us from refining the tag strings
 that much.

Actually, I don't -- I have already implemented ALTER TABLE for event
triggers, and there wasn't any need to tweak the command tags there.
The problem in this particular case is that TABLESPACE is a global
object, thus not supported; but ALTER TABLESPACE MOVE is a command that
modifies tables (which *are* supported), not tablespaces.

ALTER TABLESPACE MOVE is a glorified ALTER TABLE.  If ALTER TABLESPACE
MOVE returned ALTER TABLE as a tag, I think it'd work well too; but not
ALTER TABLESPACE.  Individually, since the implementation works by
calling AlterTableInternal(), it already works.

Now if you state that the current design in event_triggers that works by
slicing CommandTag and comparing the pieces is broken, I don't disagree
and I think I have now (in the patch posted in a nearby thread) some
more infrastructure to do it differently.  But even if we do that, I
think we're going to need a way to differentiate ALTER TABLESPACE MOVE
from other forms of ALTER TABLESPACE.  I haven't given this much
thought, though.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLESPACE MOVE command tag tweak

2014-06-13 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 The problem in this particular case is that TABLESPACE is a global
 object, thus not supported; but ALTER TABLESPACE MOVE is a command that
 modifies tables (which *are* supported), not tablespaces.

 ALTER TABLESPACE MOVE is a glorified ALTER TABLE.  If ALTER TABLESPACE
 MOVE returned ALTER TABLE as a tag, I think it'd work well too; but not
 ALTER TABLESPACE.  Individually, since the implementation works by
 calling AlterTableInternal(), it already works.

 Now if you state that the current design in event_triggers that works by
 slicing CommandTag and comparing the pieces is broken, I don't disagree
 and I think I have now (in the patch posted in a nearby thread) some
 more infrastructure to do it differently.  But even if we do that, I
 think we're going to need a way to differentiate ALTER TABLESPACE MOVE
 from other forms of ALTER TABLESPACE.  I haven't given this much
 thought, though.

Yeah, I'd not paid much attention to it either.  Now that I look at it,
ALTER TABLESPACE MOVE seems like a pretty unfortunate choice of naming
all around, because (unless I'm misunderstanding) it doesn't actually
alter any property of the tablespace itself.  It might be a bit late
to propose this, but I wonder if some syntax along the lines of

   MOVE ALL [ TABLES | INDEXES | ... ] IN TABLESPACE foo TO bar

wouldn't be less confusing.  Not sure what we'd use as command tag
for it though (not MOVE, since that's taken).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Something flaky in the relfilenode mapping infrastructure

2014-06-13 Thread Noah Misch
On Thu, Jun 12, 2014 at 10:50:44PM -0400, Tom Lane wrote:
 Alternatively, we could do something like you suggest but adjust the
 second join so that it suppresses only rows in which mapped_oid is null
 *and* there's no longer a matching OID in pg_class.  That would provide
 additional confidence that the null result is a valid indicator of a
 just-dropped table.

Can't hurt; I adjusted it along those lines and committed.  For the record,
the failure on prairiedog has appeared a third time:
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedogdt=2014-06-13%2005%3A19%3A35

-- 
Noah Misch
EnterpriseDB http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Built-in binning functions

2014-06-13 Thread Petr Jelinek

Hello,

here is a patch implementing varwidth_bucket (naming is up for 
discussion) function which does binning with variable bucket width. The 
use-cases are same as for width_bucket (=data analytics, mainly 
histograms), the difference is that width_bucket uses buckets of same 
width but the varwidth_bucket accepts an sorted array of right-bound 
thresholds to define the individual buckets.


Currently applications implement this with long CASE statements which 
are quite hard to read/maintain and are much slower than this 
implementation which uses binary search.


There are 3 actual functions, one generic and two faster versions for 
the int8 and float8 input that take advantage of the static width of 
those types.



The research leading to these results has received funding from the 
European Union's Seventh Framework Programme (FP7/2007-2013) under grant 
agreement n° 318633


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 5c906f3..ceff2ae 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -920,6 +920,38 @@
entryliteralwidth_bucket(5.35, 0.024, 10.06, 5)/literal/entry
entryliteral3/literal/entry
   /row
+
+  row
+   entry
+indexterm
+ primaryvarwidth_bucket/primary
+/indexterm
+literalfunctionvarwidth_bucket(parameterop/parameter typeanyelemnt/type, parameterthresholds/parameter typeanyarray/type)/function/literal
+   /entry
+   entrytypeint/type/entry
+   entryreturn the bucket to which parameteroperand/ would
+   be assigned based on right-bound bucket parameterthresholds/./entry
+   entryliteralvarwidth_bucket(5.35::numeric, ARRAY[1, 3, 4, 6]::numeric)/literal/entry
+   entryliteral3/literal/entry
+  /row
+
+  row
+   entryliteralfunctionvarwidth_bucket(parameterop/parameter typedp/type, parameterthresholds/parameter typedp[]/type)/function/literal/entry
+   entrytypeint/type/entry
+   entryreturn the bucket to which parameteroperand/ would
+   be assigned based on right-bound bucket parameterthresholds/./entry
+   entryliteralvarwidth_bucket(5.35, ARRAY[1, 3, 4, 6])/literal/entry
+   entryliteral3/literal/entry
+  /row
+
+  row
+   entryliteralfunctionvarwidth_bucket(parameterop/parameter typebigint/type, parameterthresholds/parameter typebigint[]/type)/function/literal/entry
+   entrytypeint/type/entry
+   entryreturn the bucket to which parameteroperand/ would
+   be assigned based on right-bound bucket parameterthresholds/./entry
+   entryliteralvarwidth_bucket(5, ARRAY[1, 3, 4, 6])/literal/entry
+   entryliteral3/literal/entry
+  /row
  /tbody
 /tgroup
/table
diff --git a/src/backend/utils/adt/arrayfuncs.c b/src/backend/utils/adt/arrayfuncs.c
index f8e94ec..6f03206 100644
--- a/src/backend/utils/adt/arrayfuncs.c
+++ b/src/backend/utils/adt/arrayfuncs.c
@@ -5502,3 +5502,116 @@ array_replace(PG_FUNCTION_ARGS)
    fcinfo);
 	PG_RETURN_ARRAYTYPE_P(array);
 }
+
+/*
+ * Implements the generic version of the varwidth_bucket() function.
+ * See also varwidth_bucket_float8() and varwidth_bucket_int8().
+ *
+ * 'thresholds' is an array (must be sorted from smallest to biggest value)
+ * containing right-bounds for each bucket, varwidth_bucket() returns
+ * integer indicating the bucket number that 'operand' belongs to. An operand
+ * greater than the upper bound is assigned to an additional bucket.
+ */
+Datum
+varwidth_bucket_generic(PG_FUNCTION_ARGS)
+{
+	Datum		operand = PG_GETARG_DATUM(0);
+	ArrayType  *thresholds_in = PG_GETARG_ARRAYTYPE_P(1);
+	char	   *thresholds;
+	Oid			collation = PG_GET_COLLATION();
+	Oid			element_type = get_fn_expr_argtype(fcinfo-flinfo, 0);
+	TypeCacheEntry *typentry;
+	int			typlen;
+	bool		typbyval;
+	char		typalign;
+	int32		left;
+	int32		mid;
+	int32		right;
+	FunctionCallInfoData locfcinfo;
+
+	/* Verify input */
+	if (ARR_NDIM(thresholds_in) != 1)
+		ereport(ERROR,
+			(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+			 errmsg(thresholds must be one dimensional array )));
+
+	if (ARR_HASNULL(thresholds_in))
+		ereport(ERROR,
+			(errcode(ERRCODE_NULL_VALUE_NOT_ALLOWED),
+			 errmsg(thresholds array must not contain NULLs)));
+
+	/* Make sure val and thresholds use same types so we can be sure they can be compared. */
+	if (element_type != ARR_ELEMTYPE(thresholds_in))
+		ereport(ERROR,
+(errcode(ERRCODE_DATATYPE_MISMATCH),
+ errmsg(cannot calculate buckets for operand using thresholds of different type)));
+
+	/* Fetch information about the input type */
+	typentry = (TypeCacheEntry *) fcinfo-flinfo-fn_extra;
+	if (typentry == NULL ||
+		typentry-type_id != element_type)
+	{
+		typentry = lookup_type_cache(element_type,
+	 TYPECACHE_CMP_PROC_FINFO);
+		if (!OidIsValid(typentry-cmp_proc_finfo.fn_oid))
+			ereport(ERROR,
+	

Re: [HACKERS] Something flaky in the relfilenode mapping infrastructure

2014-06-13 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 On Thu, Jun 12, 2014 at 10:50:44PM -0400, Tom Lane wrote:
 Alternatively, we could do something like you suggest but adjust the
 second join so that it suppresses only rows in which mapped_oid is null
 *and* there's no longer a matching OID in pg_class.  That would provide
 additional confidence that the null result is a valid indicator of a
 just-dropped table.

 Can't hurt; I adjusted it along those lines and committed.

Looks good.

 For the record,
 the failure on prairiedog has appeared a third time:
 http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=prairiedogdt=2014-06-13%2005%3A19%3A35

Yeah, I saw that.  I wonder if we recently changed something that improved
the odds of the timing being just so.  Two failures in two days seems out
of line with the critter's previous history.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Set new system identifier using pg_resetxlog

2014-06-13 Thread Petr Jelinek

Hello,

attached is a simple patch which makes it possible to change the system 
identifier of the cluster in pg_control. This is useful for 
individualization of the instance that is started on top of data 
directory produced by pg_basebackup - something that's helpful for 
logical replication setup where you need to easily identify each node 
(it's used by Bidirectional Replication for example).


--
 Petr Jelinek  http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services
diff --git a/doc/src/sgml/ref/pg_resetxlog.sgml b/doc/src/sgml/ref/pg_resetxlog.sgml
index 34b0606..39ae574 100644
--- a/doc/src/sgml/ref/pg_resetxlog.sgml
+++ b/doc/src/sgml/ref/pg_resetxlog.sgml
@@ -30,6 +30,7 @@ PostgreSQL documentation
arg choice=optoption-m/option replaceable class=parametermxid/replaceable,replaceable class=parametermxid/replaceable/arg
arg choice=optoption-O/option replaceable class=parametermxoff/replaceable/arg
arg choice=optoption-l/option replaceable class=parameterxlogfile/replaceable/arg
+   arg choice=optoption-s/option [replaceable class=parametersysid/replaceable]/arg
arg choice=plainreplaceabledatadir/replaceable/arg
   /cmdsynopsis
  /refsynopsisdiv
@@ -184,6 +185,13 @@ PostgreSQL documentation
   /para
 
   para
+   The option-s/ option instructs commandpg_resetxlog/command to 
+   set the system identifier of the cluster to specified replaceablesysid/.
+   If the replaceablesysid/ is not provided new one will be generated using
+   same algorithm commandinitdb/ uses.
+  /para
+
+  para
The option-V/ and option--version/ options print
the applicationpg_resetxlog/application version and exit.  The
options option-?/ and option--help/ show supported arguments,
diff --git a/src/bin/pg_resetxlog/pg_resetxlog.c b/src/bin/pg_resetxlog/pg_resetxlog.c
index 915a1ed..3660ee1 100644
--- a/src/bin/pg_resetxlog/pg_resetxlog.c
+++ b/src/bin/pg_resetxlog/pg_resetxlog.c
@@ -67,7 +67,9 @@ static MultiXactId set_mxid = 0;
 static MultiXactOffset set_mxoff = (MultiXactOffset) -1;
 static uint32 minXlogTli = 0;
 static XLogSegNo minXlogSegNo = 0;
+static uint64 set_sysid = 0;
 
+static uint64 GenerateSystemIdentifier(void);
 static bool ReadControlFile(void);
 static void GuessControlValues(void);
 static void PrintControlValues(bool guessed);
@@ -111,7 +113,7 @@ main(int argc, char *argv[])
 	}
 
 
-	while ((c = getopt(argc, argv, fl:m:no:O:x:e:)) != -1)
+	while ((c = getopt(argc, argv, fl:m:no:O:x:e:s::)) != -1)
 	{
 		switch (c)
 		{
@@ -227,6 +229,22 @@ main(int argc, char *argv[])
 XLogFromFileName(optarg, minXlogTli, minXlogSegNo);
 break;
 
+			case 's':
+if (optarg)
+{
+	if (sscanf(optarg, UINT64_FORMAT, set_sysid) != 1)
+	{
+		fprintf(stderr, _(%s: invalid argument for option -s\n), progname);
+		fprintf(stderr, _(Try \%s --help\ for more information.\n), progname);
+		exit(1);
+	}
+}
+else
+{
+	set_sysid = GenerateSystemIdentifier();
+}
+break;
+
 			default:
 fprintf(stderr, _(Try \%s --help\ for more information.\n), progname);
 exit(1);
@@ -354,6 +372,9 @@ main(int argc, char *argv[])
 	if (minXlogSegNo  newXlogSegNo)
 		newXlogSegNo = minXlogSegNo;
 
+	if (set_sysid != 0)
+		ControlFile.system_identifier = set_sysid;
+
 	/*
 	 * If we had to guess anything, and -f was not given, just print the
 	 * guessed values and exit.  Also print if -n is given.
@@ -395,6 +416,26 @@ main(int argc, char *argv[])
 
 
 /*
+ * Create a new unique installation identifier.
+ *
+ * See notes in xlog.c about the algorithm.
+ */
+static uint64
+GenerateSystemIdentifier(void)
+{
+	uint64			sysidentifier;
+	struct timeval	tv;
+
+	gettimeofday(tv, NULL);
+	sysidentifier = ((uint64) tv.tv_sec)  32;
+	sysidentifier |= ((uint64) tv.tv_usec)  12;
+	sysidentifier |= getpid()  0xFFF;
+
+	return sysidentifier;
+}
+
+
+/*
  * Try to read the existing pg_control file.
  *
  * This routine is also responsible for updating old pg_control versions
@@ -475,9 +516,6 @@ ReadControlFile(void)
 static void
 GuessControlValues(void)
 {
-	uint64		sysidentifier;
-	struct timeval tv;
-
 	/*
 	 * Set up a completely default set of pg_control values.
 	 */
@@ -489,14 +527,10 @@ GuessControlValues(void)
 
 	/*
 	 * Create a new unique installation identifier, since we can no longer use
-	 * any old XLOG records.  See notes in xlog.c about the algorithm.
+	 * any old XLOG records.
 	 */
-	gettimeofday(tv, NULL);
-	sysidentifier = ((uint64) tv.tv_sec)  32;
-	sysidentifier |= ((uint64) tv.tv_usec)  12;
-	sysidentifier |= getpid()  0xFFF;
 
-	ControlFile.system_identifier = sysidentifier;
+	ControlFile.system_identifier = GenerateSystemIdentifier();
 
 	ControlFile.checkPointCopy.redo = SizeOfXLogLongPHD;
 	ControlFile.checkPointCopy.ThisTimeLineID = 1;
@@ -1087,6 +1121,7 @@ usage(void)
 	printf(_(  -o OID   set next OID\n));
 	printf(_(  -O OFFSETset next multitransaction offset\n));
 	printf(_(