Re: [HACKERS] patch: make_timestamp function

2013-12-18 Thread Pavel Stehule
Hello

Updated patch - the optional time zone is now passed as text.

postgres=# SELECT make_timetz(8, 15, 55.333) = '8:15:55.333'::timetz;
 ?column?
----------
 t
(1 row)

postgres=# SELECT make_timetz(8, 15, 55.333, 'HKT') = '8:15:55.333
HKT'::timetz;
 ?column?
----------
 t
(1 row)

postgres=# SELECT make_timetz(8, 15, 55.333, '+1:30') = '8:15:55.333
+1:30'::timetz;
 ?column?
----------
 t
(1 row)

postgres=# SELECT make_timetz(8, 15, 55.333, '-1:30') = '8:15:55.333
-1:30'::timetz;
 ?column?
----------
 t
(1 row)
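For readers following along, the construction the examples above exercise can be sketched outside the database with Python's `datetime` module. This is a rough analogue of the proposed function, not PostgreSQL code; the offset-in-minutes parameter is my own simplification of the text time zone argument (named time zones like 'HKT' are not handled here):

```python
from datetime import time, timedelta, timezone

def make_timetz(hour, minute, sec, utc_offset_minutes=None):
    """Rough Python analogue of the proposed make_timetz():
    build a time-with-time-zone from integer/float fields."""
    whole = int(sec)
    micros = round((sec - whole) * 1_000_000)
    tz = None
    if utc_offset_minutes is not None:
        tz = timezone(timedelta(minutes=utc_offset_minutes))
    return time(hour, minute, whole, micros, tzinfo=tz)

t = make_timetz(8, 15, 55.333, -90)   # -1:30, as in the last example
print(t.isoformat())                  # 08:15:55.333000-01:30
```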

Regards

Pavel


2013/12/17 Tom Lane t...@sss.pgh.pa.us

 Alvaro Herrera alvhe...@2ndquadrant.com writes:
  Yeah, I think a constructor should allow a text timezone.

 Yes.  I think a numeric timezone parameter is about 99% useless,
 and if you do happen to need that behavior you can just cast the
 numeric to text, no?

 regards, tom lane

commit 680a244e13730a4f95d08d4e1ae338510c3ab404
Author: Pavel Stehule pavel.steh...@gooddata.com
Date:   Thu Dec 12 18:08:47 2013 +0100

initial implementation make_timestamp

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index a411e3a..50518ac 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -6735,6 +6735,78 @@ SELECT SUBSTRING('XY1234Z', 'Y*?([0-9]{1,3})');
        <row>
         <entry>
          <indexterm>
+          <primary>make_timetz</primary>
+         </indexterm>
+         <literal>
+          <function>
+           make_timetz(<parameter>hour</parameter> <type>int</type>,
+           <parameter>min</parameter> <type>int</type>,
+           <parameter>sec</parameter> <type>double precision</type>,
+           <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+          </function>
+         </literal>
+        </entry>
+        <entry><type>time with time zone</type></entry>
+        <entry>
+         Create time with time zone from hour, minute and seconds fields
+        </entry>
+        <entry><literal>make_timetz(8, 15, 23.5)</literal></entry>
+        <entry><literal>08:15:23.5+01</literal></entry>
+       </row>
+
+       <row>
+        <entry>
+         <indexterm>
+          <primary>make_timestamp</primary>
+         </indexterm>
+         <literal>
+          <function>
+           make_timestamp(<parameter>year</parameter> <type>int</type>,
+           <parameter>month</parameter> <type>int</type>,
+           <parameter>day</parameter> <type>int</type>,
+           <parameter>hour</parameter> <type>int</type>,
+           <parameter>min</parameter> <type>int</type>,
+           <parameter>sec</parameter> <type>double precision</type>)
+          </function>
+         </literal>
+        </entry>
+        <entry><type>timestamp</type></entry>
+        <entry>
+         Create timestamp from year, month, day, hour, minute and seconds fields
+        </entry>
+        <entry><literal>make_timestamp(2013, 7, 15, 8, 15, 23.5)</literal></entry>
+        <entry><literal>2013-07-15 08:15:23.5</literal></entry>
+       </row>
+
+       <row>
+        <entry>
+         <indexterm>
+          <primary>make_timestamptz</primary>
+         </indexterm>
+         <literal>
+          <function>
+           make_timestamptz(<parameter>year</parameter> <type>int</type>,
+           <parameter>month</parameter> <type>int</type>,
+           <parameter>day</parameter> <type>int</type>,
+           <parameter>hour</parameter> <type>int</type>,
+           <parameter>min</parameter> <type>int</type>,
+           <parameter>sec</parameter> <type>double precision</type>,
+           <optional> <parameter>timezone</parameter> <type>text</type> </optional>)
+          </function>
+         </literal>
+        </entry>
+        <entry><type>timestamp with time zone</type></entry>
+        <entry>
+         Create timestamp with time zone from year, month, day, hour, minute
+         and seconds fields
+        </entry>
+        <entry><literal>make_timestamptz(2013, 7, 15, 8, 15, 23.5)</literal></entry>
+        <entry><literal>2013-07-15 08:15:23.5+01</literal></entry>
+       </row>
+
+       <row>
+        <entry>
+         <indexterm>
          <primary>now</primary>
         </indexterm>
         <literal><function>now()</function></literal>
diff --git a/src/backend/utils/adt/date.c b/src/backend/utils/adt/date.c
index fe091da..fedf261 100644
--- a/src/backend/utils/adt/date.c
+++ b/src/backend/utils/adt/date.c
@@ -1246,38 +1246,113 @@ timetypmodout(PG_FUNCTION_ARGS)
 }
 
 /*
- *		make_time			- time constructor
+ * time constructor used for make_time and make_timetz
  */
-Datum
-make_time(PG_FUNCTION_ARGS)
+static TimeADT
+make_time_internal(int hour, int min, double sec)
 {
-	int			tm_hour = PG_GETARG_INT32(0);
-	int			tm_min = PG_GETARG_INT32(1);
-	double		sec = PG_GETARG_FLOAT8(2);
 	TimeADT		time;
 
 	/* This should match the checks in DecodeTimeOnly */
-	if (tm_hour < 0 || tm_min < 0 || tm_min > MINS_PER_HOUR - 1 ||
+	if (hour < 0 || min < 0 || min > MINS_PER_HOUR - 1 ||
 		sec < 0 || sec > SECS_PER_MINUTE ||
-		tm_hour > HOURS_PER_DAY ||
+		hour > HOURS_PER_DAY ||
 	/* test for > 24:00:00 */
-		(tm_hour == HOURS_PER_DAY && (tm_min > 0 || sec > 0)))
+		(hour == HOURS_PER_DAY && (min > 0 || sec > 0)))
 		ereport(ERROR,
 				(errcode(ERRCODE_DATETIME_FIELD_OVERFLOW),
 				 errmsg("time field value out of range: 
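The field-range checks in `make_time_internal()` can be sketched in Python; this is an illustrative mirror of the C conditions, not the patch itself, with the constants taken from PostgreSQL's datetime headers:

```python
# Sketch of the range checks above: hour/min/sec must form a valid
# time of day, where exactly 24:00:00 is allowed but nothing past it.
MINS_PER_HOUR = 60
SECS_PER_MINUTE = 60
HOURS_PER_DAY = 24

def time_fields_in_range(hour, minute, sec):
    if hour < 0 or minute < 0 or minute > MINS_PER_HOUR - 1:
        return False
    if sec < 0 or sec > SECS_PER_MINUTE:
        return False
    if hour > HOURS_PER_DAY:
        return False
    # test for > 24:00:00
    if hour == HOURS_PER_DAY and (minute > 0 or sec > 0):
        return False
    return True

print(time_fields_in_range(24, 0, 0))    # True: exactly 24:00:00
print(time_fields_in_range(24, 0, 0.1))  # False: past 24:00:00
```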

Re: [HACKERS] 9.3 reference constraint regression

2013-12-18 Thread Andres Freund
Hi,

On 2013-12-17 18:27:26 -0300, Alvaro Herrera wrote:
  Well, it would help if those cases weren't dead code.  Neither
  heap_update nor heap_delete are ever called in the no wait case at
  all.

Yea, that sucks, maybe we ought to change that in HEAD. But there very
well might be external callers that use it, I don't think we can just
break the API in a stable release.

  Only heap_lock_tuple is, and I can't see any misbehavior there
  either, even with HeapTupleBeingUpdated returned when there's a
  non-local locker, or when there's a MultiXact as xmax, regardless of its
  status.
 
 I spent some more time trying to generate a test case that would show a
 problem with the changed return values here, and was unable to.
 
 I intend to apply this patch soon.

I have to say, it makes me really uncomfortable to take such
shortcuts. In other branches we are doing liveness checks with
MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
likely is it that this won't cause breakage somewhere down the line
because somebody doesn't know of that subtlety?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Craig Ringer
On 12/18/2013 02:21 AM, Simon Riggs wrote:
 On 16 December 2013 14:36, Craig Ringer cr...@2ndquadrant.com wrote:
 
 - Decide on and implement a structure for row-security functionality
 itself. I'm persuaded by Robert's comments here, that whatever we expose
 must be significantly more usable than a DIY on top of views (with the
 fix for updatable security barrier views to make that possible). I
 outlined the skeleton of a possible design in my earlier post, with the
 hierarchical and non-hierarchical access labels. Should be implemented
 using the same API we expose for extensions (like SEPostgreSQL RLS).
 That part isn't clear why we must do better than that.
 
 Having declarative security is a huge step forward, in just the same
 way that updateable views were. They save the need for writing scripts
 to implement things, rather than just having a useful default.

In my view the proposed patch doesn't offer a significant improvement in
declarative security, beyond what we can get by just adding update
support to s.b. views and using search_path to control whether a user
sees the view or the base table.

It's a lot like Oracle Virtual Private Database (VPD): A set of low
level building blocks you can build your own flexible row security model
with. One where you have to very carefully write security-sensitive
predicates and define all your security model tables, etc yourself.

That's why I'm now of the opinion that we should make it possible to
achieve the same thing with s.b. views and the search_path (by adding
update support)... then build a declarative row-security system that
doesn't require the user fiddling with delicate queries and sensitive
scripts on top of that.

 Personally, I see no reason not to commit the syntax we have now. So
 people can see what we'll be supporting, whenever that is.

Do you mean commit the syntax without the supporting functionality, so
it doesn't operate / just ERROR's initially? Or are you suggesting that
the project commit the current patch with the known issues with user ID
changes and the concerns about its internal structure and operation,
then clean it up down the track?

 If there is a vision for that, lets see the vision. And then decide
 whether its worth waiting for.

OK, I'll put an example together as a follow-up to this post - the "how
it should look" part. Below I explain the *why* part.

What the current patch does is add something a bit like a simplified and
cut-down version of Oracle VPD.

Oracle's VPD predicates are generated by a PL/SQL procedure that emits
the SQL for a predicate. It has varying levels of caching to mitigate
performance effects. The proposed patch for Pg doesn't do that. We'll
have the predicate fixed and stored in the catalogs. It'll be like VPD
with policy level SHARED_STATIC locked in as the only option.

See
http://docs.oracle.com/cd/B28359_01/network.111/b28531/vpd.htm#DBSEG98250 for
details on Oracle VPD.

It's essentially a few tools you can build your own row security model
on, making the job a little easier. The DB designer / DBA still does a
lot of work to define security on each table and do a bunch of procedure
writing and setup to make it work. They have to get potentially security
sensitive queries just right for each table.

IMO it really isn't much different to just having an updatable security
barrier view. Differences arise when you look at VPD-exempt user access
rights, at policy groups, etc., but we don't have any of that anyway.

The only big differences are that you can apply a VPD policy to an
existing table w/o having to refresh all views/functions that point to
it, and that some users can be made exempt.

(Note: I have not used VPD or Label Security myself; my comments are
intentionally based only on public documentation and on discussions with
others).



Oracle then built Oracle Label Security (their row-security solution)
*on* *top* of VPD. It's much more of a ready-to-go, configure it and use
it product, where you don't have to write procedures and define security
tables and generally build most of it yourself.

See: http://docs.oracle.com/cd/B14117_01/network.101/b10774.pdf

The Teradata implementation Robert pointed out skips the VPD-like layer
entirely. It just implements a fairly simple and flexible label security
model directly.  You don't have to muck about with the details, you
don't have to deal with the guts of the row security implementation;
you just say "here's what I want".

I'd prefer to avoid adding syntax and interfaces for a feature that'll
likely become mostly implementation detail for the feature users
actually want - declarative security.

That's part of why I've been focused on upd. s. b. views. It'll provide
a way for users to do flexible DIY row-security without introducing a
big chunk of new syntax and functionality we'll be stuck with
supporting. It'll also be a useful building block for declarative row
security.






-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 

Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication

2013-12-18 Thread Christian Kruse
Hi,

On 17/12/13 12:08, Robert Haas wrote:
 Please add your patch here so we don't lose track of it:
 
 https://commitfest.postgresql.org/action/commitfest_view/open

Thanks. I nearly forgot that.

Regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services





Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Tatsuo Ishii
Hi,

I have looked into this because it's marked as ready for committer.

 On Tue, Nov 19, 2013 at 2:06 PM, Haribabu kommi
 haribabu.ko...@huawei.com wrote:
 On 19 November 2013 09:59 Amit Kapila wrote:
 On Mon, Nov 18, 2013 at 8:31 PM, Haribabu kommi
 haribabu.ko...@huawei.com wrote:
  On 18 November 2013 20:01 Amit Kapila wrote:
   Code changes are fine.
   If two or three errors are present in the configuration file, it
  prints the last error

 Ok fine I marked the patch as ready for committer.
 
Thanks for review.
 
 With Regards,
 Amit Kapila.
 EnterpriseDB: http://www.enterprisedb.com

It looks like it's working as advertised. Great! However I have noticed
a few minor issues.

1) validate_conf_option

+/*
+ * Validates configuration parameter and value, by calling check hook functions
+ * depending on record's vartype. It validates if the parameter
+ * value given is in range of expected predefined value for that parameter.
+ *
+ * freemem - true indicates memory for newval and newextra will be
+ *			 freed in this function, false indicates it will be freed
+ *			 by caller.
+ * Return value:
+ * 1: the value is valid
+ * 0: the name or value is invalid
+ */
+int
+validate_conf_option(struct config_generic * record, const char *name,
+					 const char *value, GucSource source, int elevel,
+					 bool freemem, void *newval, void **newextra)

Is there any reason for the function to return int, as it always returns
0 or 1? Maybe returning bool would be better?

2) initdb.c

+	strcpy(tempautobuf, "# Do not edit this file manually! \n");
+	autoconflines[0] = pg_strdup(tempautobuf);
+	strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM command. \n");
+	autoconflines[1] = pg_strdup(tempautobuf);

Is there any reason to use tempautobuf here? I think we can simply change
to this:

+	autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
+	autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command. \n");

3) initdb.c

It seems the memory allocated for autoconflines[0] and
autoconflines[1] by pg_strdup is never freed.

(I think there's similar problem with conflines as well, though it
was not introduced by the patch).

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] [v9.4] row level security

2013-12-18 Thread Yeb Havinga

On 2013-12-14 05:40, Craig Ringer wrote:

I find the hierarchical and non-hierarchical label security model used in
Teradata to be extremely interesting and worthy of study.

The concept is that there are hierarchical label policies (think
"classified", "unclassified", etc.) or non-hierarchical (mutually
exclusive labels). A user can see a row if they have all the policies
used in a table, and each of the user's policy values is sufficient to
see the row. For non-hierarchical policies that means the user and row
must have at least one label in common; for hierarchical policies it means
the user label must be greater than or equal to the row label.

That model lets you quite flexibly and easily build complex security
models. It'd work well with PostgreSQL, too: we could implement
something like non-hierarchical policies using a system varbit column
that's added whenever a non-hierarchical policy gets added to a table,
and simply AND it with the varbit on the user's policy to see if they
can access the row. Or use a compact fixed-width bitfield. Hierarchical
policies would use a 1- or 2-byte int system column to store the label.
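The two label checks described above can be sketched with plain bitmasks; this is illustrative pseudocode, not PostgreSQL or Teradata code, and the label assignments are hypothetical:

```python
# Illustrative sketch of the two label-policy checks: non-hierarchical
# labels as a bitfield ANDed together, hierarchical labels as an
# integer clearance-level comparison.

def non_hierarchical_ok(user_labels: int, row_labels: int) -> bool:
    # User and row must share at least one label: bitwise AND is non-empty.
    return (user_labels & row_labels) != 0

def hierarchical_ok(user_level: int, row_level: int) -> bool:
    # User's clearance must be greater than or equal to the row's label.
    return user_level >= row_level

# e.g. bit 0 = "finance", bit 1 = "hr" (made-up labels)
print(non_hierarchical_ok(0b01, 0b11))  # True: both carry "finance"
print(hierarchical_ok(2, 3))            # False: row is labeled above the user
```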


Is it necessary to fix the attribute type of the security label? The 
label model described above seems too restrictive to support e.g. the 
labels described on p18 of the 'healthcare classification system' (HCS) 
(http://www.hl7.org/documentcenter/public/wg/secure/3.%20HCS%20Guide%20Final%202013%200322%20JMD.pdf). 
The HCS security label is based on NIST's FIPS 188 standard / 
http://www.itl.nist.gov/fipspubs/fip188.htm


regards,
Yeb Havinga





Re: [HACKERS] Proposal: variant of regclass

2013-12-18 Thread Tatsuo Ishii
 For the pgpool-II use case, I'm happy to follow you because pgpool-II
 always does a grammatical check (using the raw parser) on a query first,
 and such syntax problems will be pointed out before reaching the state
 where toregclass is called.

 One concern is, other users expect toregclass to detect such syntax
 problems. Anybody?
 
 It seems fine to me if the new function ignores the specific error of
 "relation does not exist" while continuing to throw other errors.

Thanks. Here is the revised conceptual patch. I'm going to post a
concrete patch and register it to 2014-01 Commit Fest.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
diff --git a/src/backend/utils/adt/regproc.c b/src/backend/utils/adt/regproc.c
index c24a2c1..0406c30 100644
--- a/src/backend/utils/adt/regproc.c
+++ b/src/backend/utils/adt/regproc.c
@@ -45,6 +45,7 @@ static char *format_operator_internal(Oid operator_oid, bool force_qualify);
 static char *format_procedure_internal(Oid procedure_oid, bool force_qualify);
 static void parseNameAndArgTypes(const char *string, bool allowNone,
 	 List **names, int *nargs, Oid *argtypes);
+static Datum regclass_guts(char *class_name_or_oid, bool raiseError);
 
 
 /*
@@ -804,21 +805,50 @@ Datum
 regclassin(PG_FUNCTION_ARGS)
 {
 	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, true);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * toregclass		- converts classname to class OID
+ *
+ * Difference from regclassin is that this does not throw an error if the
+ * class name does not exist.
+ * Note: this is not an I/O function.
+ */
+Datum
+toregclass(PG_FUNCTION_ARGS)
+{
+	char	   *class_name_or_oid = PG_GETARG_CSTRING(0);
+	Oid			result;
+
+	result = regclass_guts(class_name_or_oid, false);
+	PG_RETURN_OID(result);
+}
+
+/*
+ * Guts of regclassin and toregclass.
+ * If raiseError is false, returns InvalidOid upon error.
+ */
+static Datum
+regclass_guts(char *class_name_or_oid, bool raiseError)
+{
 	Oid			result = InvalidOid;
 	List	   *names;
 
 	/* '-' ? */
 	if (strcmp(class_name_or_oid, "-") == 0)
-		PG_RETURN_OID(InvalidOid);
+		return result;
 
 	/* Numeric OID? */
 	if (class_name_or_oid[0] >= '0' &&
 		class_name_or_oid[0] <= '9' &&
 		strspn(class_name_or_oid, "0123456789") == strlen(class_name_or_oid))
 	{
-		result = DatumGetObjectId(DirectFunctionCall1(oidin,
+ 		result = DatumGetObjectId(DirectFunctionCall1(oidin,
 		CStringGetDatum(class_name_or_oid)));
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/* Else it's a name, possibly schema-qualified */
@@ -848,16 +878,19 @@ regclassin(PG_FUNCTION_ARGS)
 		if (HeapTupleIsValid(tuple = systable_getnext(sysscan)))
 			result = HeapTupleGetOid(tuple);
 		else
-			ereport(ERROR,
-					(errcode(ERRCODE_UNDEFINED_TABLE),
-				   errmsg("relation \"%s\" does not exist", class_name_or_oid)));
+			if (raiseError)
+				ereport(ERROR,
+						(errcode(ERRCODE_UNDEFINED_TABLE),
+						 errmsg("relation \"%s\" does not exist", class_name_or_oid)));
+			else
+				return InvalidOid;
 
 		/* We assume there can be only one match */
 
 		systable_endscan(sysscan);
 		heap_close(hdesc, AccessShareLock);
 
-		PG_RETURN_OID(result);
+		return result;
 	}
 
 	/*
@@ -865,11 +898,16 @@ regclassin(PG_FUNCTION_ARGS)
 	 * pg_class entries in the current search path.
 	 */
 	names = stringToQualifiedNameList(class_name_or_oid);
+	if (names == NIL)
+		return InvalidOid;
 
 	/* We might not even have permissions on this relation; don't lock it. */
-	result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	if (raiseError)
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, false);
+	else
+		result = RangeVarGetRelid(makeRangeVarFromNameList(names), NoLock, true);
 
-	PG_RETURN_OID(result);
+	return result;
 }
 
 /*
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 0117500..472ccad 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3304,13 +3304,14 @@ DATA(insert OID = 2218 (  regclassin		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2
 DESCR(I/O);
 DATA(insert OID = 2219 (  regclassout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 2205 _null_ _null_ _null_ _null_ regclassout _null_ _null_ _null_ ));
 DESCR(I/O);
+DATA(insert OID = 3179 (  toregclass		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2205 2275 _null_ _null_ _null_ _null_ toregclass _null_ _null_ _null_ ));
+DESCR(convert classname to regclass);
 DATA(insert OID = 2220 (  regtypein			PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2206 2275 _null_ _null_ _null_ _null_ regtypein _null_ _null_ _null_ ));
 DESCR(I/O);
 DATA(insert OID = 2221 (  regtypeout		PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 2275 2206 _null_ _null_ _null_ _null_ regtypeout _null_ _null_ _null_ ));
 DESCR(I/O);
 DATA(insert OID = 1079 (  regclass			PGNSP 

Re: [HACKERS] Problem with displaying wide tables in psql

2013-12-18 Thread Sergey Muraviov
Hello

2013/12/18 Sameer Thakur samthaku...@gmail.com

 On Wed, Dec 11, 2013 at 11:13 PM, Sergey Muraviov
 sergey.k.murav...@gmail.com wrote:
  Hi.
 
  I've improved the patch.
  It works in expanded mode when either format option is set to wrapped
 (\pset
  format wrapped), or we have no pager, or pager doesn't chop long lines
 (so
  you can still use the trick).
  Target output width is taken from either columns option (\pset columns
 70),
  or environment variable $COLUMNS, or terminal size.
  And it's also compatible with any border style (\pset border 0|1|2).
 
  Here are some examples:
 
  postgres=# \x 1
  postgres=# \pset format wrapped
  postgres=# \pset border 0
  postgres=# select * from wide_table;
  * Record 1
  value afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa df
  sadfsadfa
sd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f sadf sad fadsf
  * Record 2
  value afadsafasd fasdf asdfasd
 
  postgres=# \pset border 1
  postgres=# \pset columns 70
  postgres=# select * from wide_table;
  -[ RECORD 1 ]-
  value | afadsafasd fasdf asdfasd fsad fas df sadf sad f sadf  sadf sa
| df sadfsadfasd fsad fsa df sadf asd fa sfd sadfsadf asdf sad f
|  sadf sad fadsf
  -[ RECORD 2 ]-
  value | afadsafasd fasdf asdfasd
 
  postgres=# \pset border 2
  postgres=# \pset columns 60
  postgres=# select * from wide_table;
  +-[ RECORD 1 ]-+
  | value | afadsafasd fasdf asdfasd fsad fas df sadf sad f  |
  |   | sadf  sadf sa df sadfsadfasd fsad fsa df sadf as |
  |   | d fa sfd sadfsadf asdf sad f sadf sad fadsf  |
  +-[ RECORD 2 ]-+
  | value | afadsafasd fasdf asdfasd |
  +---+--+
 
  Regards,
  Sergey
 

 The patch applies and compiles cleanly. I tried the following
 \pset format wrapped
 \pset columns 70.
 Not in expanded mode
 select * from wide_table works fine.
 select * from pg_stats has problems in viewing. Is it that pg_stats
 can be viewed easily only in expanded mode i.e. if columns displayed
 are wrapped then there is no way to view results in non expanded mode?
 regards
 Sameer


The problem with non-expanded mode is that all column headers have to be
displayed on one line; otherwise it is difficult to associate values with
their columns. I have no idea yet how to solve this problem.
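The wrapped expanded-mode output shown earlier in the thread can be sketched as follows; this is an illustrative approximation (Python's `textwrap` folds at word boundaries, whereas psql chops at exactly the target width), not psql's actual code:

```python
# Sketch: fold a long value at the available width and print
# continuation lines aligned under the value column, as in the
# "\pset format wrapped" examples above.
import textwrap

def print_record(name, value, columns=70):
    label_width = len(name)
    avail = columns - label_width - 3          # room after "name | "
    pieces = textwrap.wrap(value, avail) or [""]
    print(f"{name} | {pieces[0]}")
    for cont in pieces[1:]:
        print(f"{' ' * label_width} | {cont}")

print_record("value", "a very long value " * 5, columns=40)
```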

-- 
Best regards,
Sergey Muraviov


Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Craig Ringer
On 12/12/2013 02:51 AM, Tom Lane wrote:
 The thing that I'm wondering is why the database would be the right place
 to be measuring it at all.  If you've got a network usage problem,
 aggregate usage across everything on the server is probably what you
 need to be worried about, and PG can't tell you that.

I suspect this feature would be useful for when you want to try to drill
down and figure out what's having network issues - specifically, to
associate network behaviour with individual queries, individual users,
application_name, etc.

One sometimes faces the same issue with I/O: I know PostgreSQL is doing
lots of I/O, but what exactly is causing the I/O? Especially if you
can't catch it at the time it happens, it can be quite tricky to go from
"there's lots of I/O" to "this query changed from using synchronized
seqscans to doing an index-only scan that's hammering the cache".

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Dimitri Fontaine
Simon Riggs si...@2ndquadrant.com writes:
 On 17 December 2013 23:42, Tom Lane t...@sss.pgh.pa.us wrote:
 We aim to have the simplest implementation that meets the stated need
 and reasonable extrapolations of that. Text in a catalog table is the
 simplest implementation. That is not a reason to reject it, especially
 when we aren't suggesting a viable alternative.

 The first part of this assertion is debatable, and the claim that no
 viable alternative has been suggested is outright wrong.

With due respect, it's only wrong when you buy into implementing
something new rather than improving extensions.

 Sounds like we have a way forward for this feature then, just not with
 the current patch.

 Can someone attempt to summarise the way forward, with any caveats and
 necessary restrictions? It would save further column inches of debate.

Here's my attempt:

# Inline Extension, Extension Templates

The problem with *Inline Extension* is the dump and restore policy: the
contents of an extension are never found in a `pg_dump` script.
The problem with the *Extension Templates* is that we store the extension
scripts (plain text blobs) in the catalogs, where we already have the full
SQL objects and tools (such as `pg_dump` and `pg_depend`) to manipulate and
introspect them.

# The new thing™

A set of SQL objects that can be managed wholesale, with a version string
attached. The objects are part of `pg_dump` output, and the whole set can
be relocatable.

Name:

  - not `PACKAGE`, Oracle
  - not `MODULE`, that's already the name of a .so file
  - not `SYSTEM`, already something else
  - `BUNDLE`
  - `LIBRARY`
  - `UNIT`

I'll pick UNIT here.
  
Commands:

CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];

WITH UNIT name;
  commands
END UNIT name;

ALTER UNIT name OWNER TO role;
ALTER UNIT name ADD object definition;
ALTER UNIT name DROP object definition;
ALTER UNIT name SET SCHEMA new schema;
ALTER UNIT name UPDATE TO version string;
ALTER UNIT name SET [ NOT ] RELOCATABLE;
ALTER UNIT name REQUIRE a, b, c;

COMMENT ON UNIT name IS '';

DROP UNIT name [ CASCADE ];

The `UPDATE TO` command only sets a new version string.

# Implementation details

We need a new `pg_unit` catalog, that looks almost exactly like the
`pg_extension` one, except for the `extconfig` and `extcondition` fields.

We need a way to `recordDependencyOnCurrentUnit()`, so another pair of
static variables `creating_unit` and `CurrentUnitObject`. Each and every
command we do support for creating objects must be made aware of the new
`UNIT` concept, including `CREATE EXTENSION`.

The `pg_dump` dependencies have to be set so that all the objects are
restored independently first, as of today, and only then issue `CREATE
UNIT` and a bunch of `ALTER UNIT ADD` commands, one per object.

# Event Trigger support

Event Triggers are to be provided for all the `UNIT` commands.

# Life with Extensions and Units

PostgreSQL now includes two different ways to package SQL objects, with
about the same feature set. The only difference is the `pg_restore`
behavior: *Extensions* are re-created from external resources, *Units* are
re-created from what's in the dump.

The smarts about `ALTER EXTENSION ... UPDATE` are not available when dealing
with *UNITS*, leaving the user or the client scripts to care about that
entirely on their own.

In principle, a client can prepare a SQL script from a PGXN distribution and
apply it surrounded by `WITH UNIT` and `END UNIT` commands.

Upgrade scripts, once identified, can be run as straight SQL, adding a
simple `ALTER UNIT ... UPDATE TO ...` command before the `COMMIT` at the
end of the script. Identifying the upgrade script(s) may require
implementing current Extension update smarts into whatever client side
program is going to be built to support installing from PGXN etc.

# Conclusion

The main advantage of the `UNIT` proposal is that it copes very well with
relations and other usual schema objects, as the data are preserved at
`pg_restore` time.

A `UNIT` can also entirely replace an `EXTENSION`, including when it needs a
*module*, provided that the *module* is made available on the server's file
system before creating the functions in `LANGUAGE C` that depend on it.

It is possible to write a *UNIT distribution network* where a client
software drives the installation of SQL objects within an UNIT, and this
client software needs to include UNIT update smarts too. It's possible also
to build that software as a set of Event Triggers on the `CREATE UNIT` and
`ALTER UNIT UPDATE TO` commands.

# Analysis

The main drawback is that rather than building on extensions, both in a
technical way and in building user trust, we are basically going to
deprecate extensions entirely, giving them a new name and an incompatible way
to manage them.

Only *contribs* are going to be shipped as extensions, as they are 

[HACKERS] hstore ng index for @> ?

2013-12-18 Thread Kaare Rasmussen

Hi

In many ways the new hstore (and perhaps json) format looks very
exciting. But will there ever be (GIN/GiST) index support for @> ?






Re: [HACKERS] Extension Templates S03E11

2013-12-18 Thread Dimitri Fontaine
Tom Lane t...@sss.pgh.pa.us writes:
 I keep telling you this, and it keeps not sinking in.

How can you say that? I've been spending a couple of years on designing
and implementing and arguing for a complete feature set where dealing
with modules is avoided at all costs.

The problem we have now is that I'm being told that the current feature
is rejected if it includes anything about modules, and not interesting
enough if it's not dealing with modules.

I tried my best to make it so that nothing in-core changes wrt modules,
yet finding out-of-core solutions to still cope with that. It's a
failure, ok.

I think we need a conclusion on this thread: Extension specs are frozen.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 02:59 AM, Josh Berkus wrote:

On 12/17/2013 01:42 PM, Kevin Grittner wrote:

Josh Berkus j...@agliodbs.com wrote:

Going back over this patch, I haven't seen any further discussion of the
point Heikki raises above, which seems like a bit of a showstopper.

Heikki, did you have specific ideas on how to solve this?  Right now my
mind boggles.


It works fine as long as you set default_transaction_isolation =
'serializable' and never override that.  :-)  Of course, it sure
would be nice to have a way to prohibit overrides, but that's
another issue.

Otherwise it is hard to see how to make it work in a general way
without a mutually exclusive lock mode on the table for the
duration of any transaction which modifies the table.


Serializable or not, *what* do we lock for assertions?  It's not rows.
Tables?  Which tables?  What if the assertion is an interpreted language
function?  Does the SSI reference counter really take care of all of this?


SSI does make everything rosy, as long as all the transactions 
participate in it. The open questions revolve around what happens if a 
transaction is not running in SSI mode.


If you force everyone to run in SSI as soon as you create even a single
assertion in your database, that's kind of crappy for performance. Also,
if one user creates an assertion, it becomes a funny kind of partial
denial-of-service attack on other users, if you force them to also run
in SSI mode.


If you don't force everything to run in SSI mode, then you have to
somehow figure out which parts do need to run in SSI mode to enforce the
assertion. For example, if the assertion refers to tables A and B, perhaps
you can get away without predicate locks on table C?


- Heikki




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 02:59 AM, Josh Berkus wrote:

On 12/17/2013 01:42 PM, Kevin Grittner wrote:

It works fine as long as you set default_transaction_isolation =
'serializable' and never override that.  :-)  Of course, it sure
would be nice to have a way to prohibit overrides, but that's
another issue.

Otherwise it is hard to see how to make it work in a general way
without a mutually exclusive lock mode on the table for the
duration of any transaction which modifies the table.


Serializable or not, *what* do we lock for assertions?  It's not rows.
Tables?  Which tables?  What if the assertion is an interpreted language
function?  Does the SSI reference counter really take care of all of this?


Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes 
with that snapshot. Now, there's a race condition, if another 
transaction is committing at the same time, and performs the same check 
concurrently. That race condition can be eliminated by holding an 
exclusive lock while running the assertion, until commit, effectively 
allowing the assertion to be checked by only one transaction at a time.


I think that would work, and would be simple, although it wouldn't scale 
too well.


- Heikki




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
 Here's another idea that doesn't involve SSI:
 
 At COMMIT, take a new snapshot and check that the assertion still passes
 with that snapshot. Now, there's a race condition, if another transaction is
 committing at the same time, and performs the same check concurrently. That
 race condition can be eliminated by holding an exclusive lock while running
 the assertion, until commit, effectively allowing the assertion to be
 checked by only one transaction at a time.
 
 I think that would work, and would be simple, although it wouldn't scale too
 well.

It probably would also be very prone to deadlocks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Example query causing param_info to be set in plain rel path

2013-12-18 Thread Ashutosh Bapat
I got an example where paths for a plain rel require param_info, i.e. plain
rel scans need to take care of lateral references. Here's an example from
the PG regression tests:

explain verbose select v.* from
  (int8_tbl x left join (select q1,(select coalesce(q2,0)) q2 from
int8_tbl) y on x.q2 = y.q1)
  left join int4_tbl z on z.f1 = x.q2,
  lateral (select x.q1,y.q1 from dual union all select x.q2,y.q2 from dual)
v(vx,vy);

There is a note in create_scan_plan() which says:

 * If it's a parameterized otherrel, there might be lateral references
 * in the tlist, which need to be replaced with Params.  This cannot
 * happen for regular baserels, though.  Note use_physical_tlist()
 * always fails for otherrels, so we don't need to check this above.
 */

However, it doesn't say why this cannot happen for regular baserels.

So, we do have a testcase which tests this code path as well.


On Mon, Oct 28, 2013 at 12:30 PM, Ashutosh Bapat 
ashutosh.ba...@enterprisedb.com wrote:

 No, adding OFFSET there didn't give the expected result either. The lateral
 was handled in the subquery and passed as a param to the underlying table scan.

 I am particularly interested in tables (unlike functions or subqueries)
 since the table scans are shipped to the datanodes and I wanted to test
 the effect of lateral in such cases. OTOH, functions involving access to
 tables, or subqueries, are initiated on the coordinators, where lateral
 gets executed in the same way as in PostgreSQL.

 If it's so hard to come up with an example query that would cause
 lateral_relids to be set in the RelOptInfo of a table, then it's very likely
 that the relevant code is untested in PostgreSQL.


 On Fri, Oct 25, 2013 at 7:11 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 Ashutosh Bapat ashutosh.ba...@enterprisedb.com writes:
  In order to test various cases of LATERAL join in Postgres-XC, I am trying
  to find a query where RelOptInfo->lateral_relids would get set for plain
  base relations.

 I think you need a lateral reference in a function or VALUES FROM-item.
 As you say, plain sub-selects are likely to get flattened.  (Possibly
 if you stuck in a flattening fence such as OFFSET 0, you could get the
 case to happen with a sub-select FROM item, but I'm too lazy to check.)

 regards, tom lane




 --
 Best Wishes,
 Ashutosh Bapat
 EnterpriseDB Corporation
 The Postgres Database Company




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] GIN improvements part 1: additional information

2013-12-18 Thread Alexander Korotkov
On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 12/17/2013 12:22 AM, Alexander Korotkov wrote:

 On Mon, Dec 16, 2013 at 3:30 PM, Heikki Linnakangas 
 hlinnakan...@vmware.com

 wrote:


  On 12/12/2013 06:44 PM, Alexander Korotkov wrote:

   When values are packed into small groups, we have to either insert an
   inefficiently encoded value or re-encode the whole right part of the values.


 It would probably be simplest to store newly inserted items uncompressed,
 in a separate area in the page. For example, grow the list of
 uncompressed
  items downwards from pd_upper, and the compressed items upwards from
  pd_lower. When the page fills up, re-encode the whole page.


 I hacked together an implementation of a variant of Simple9, to see how it
 performs. Insertions are handled per the above scheme.

 In a limited pg_trgm test case I've been using a lot for this, this
 reduces the index size about 20%, compared to varbyte encoding. It might be
 possible to squeeze it a bit more, I handcrafted the selectors in the
 encoding algorithm to suite our needs, but I don't actually have a good
 idea of how to choose them optimally. Also, the encoding can encode 0
 values, but we never need to do that, so you could take advantage of that
 to pack items tighter.

 Compression and decompression speed seems to be about the same.

 Patch attached if you want to play with it. WAL replay is still broken,
 and there are probably bugs.


  Good idea. But:
 1) We'll still need item indexes at the end of the page for fast scan.


 Sure.


  2) Storage would be easily extendable to hold additional information as
 well.
 Better compression shouldn't block more serious improvements.


 I'm not sure I agree with that. For all the cases where you don't care
 about additional information - which covers all existing users for example
 - reducing disk size is pretty important. How are you planning to store the
 additional information, and how does using another encoding get in the way
 of that?


I planned to store additional-information datums between the
varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
However, I don't see significant problems in your implementation of Simple9
encoding. I'm going to dig deeper into your version of the patch.
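For readers following along, classic Simple-9 packs a variable number of equal-width integers into each 32-bit word: a 4-bit selector chooses one of nine (count × width) layouts for the remaining 28 payload bits. A minimal textbook sketch of that scheme (not the patch's handcrafted selector set):

```c
#include <assert.h>
#include <stdint.h>

/* The nine classic Simple-9 layouts: how many values fit in the 28
 * payload bits, and at what bit width. */
static const struct { int count; int bits; } s9_sel[9] = {
    {28, 1}, {14, 2}, {9, 3}, {7, 4}, {5, 5}, {4, 7}, {3, 9}, {2, 14}, {1, 28}
};

/* Pack a prefix of in[0..n-1] into *word; returns how many values were
 * consumed, or 0 if in[0] needs more than 28 bits. */
int
s9_encode_word(const uint32_t *in, int n, uint32_t *word)
{
    for (int s = 0; s < 9; s++)
    {
        if (n < s9_sel[s].count)
            continue;           /* not enough values left for this layout */
        int fits = 1;
        for (int i = 0; i < s9_sel[s].count; i++)
            if (in[i] >= ((uint32_t) 1 << s9_sel[s].bits))
            {
                fits = 0;       /* value too wide for this layout */
                break;
            }
        if (!fits)
            continue;
        uint32_t w = (uint32_t) s << 28;   /* selector in the top 4 bits */
        for (int i = 0; i < s9_sel[s].count; i++)
            w |= in[i] << (i * s9_sel[s].bits);
        *word = w;
        return s9_sel[s].count;
    }
    return 0;
}

/* Unpack one word produced by the encoder; returns the value count. */
int
s9_decode_word(uint32_t word, uint32_t *out)
{
    int s = word >> 28;
    if (s > 8)
        return 0;               /* not a word from this encoder */
    uint32_t mask = ((uint32_t) 1 << s9_sel[s].bits) - 1;
    for (int i = 0; i < s9_sel[s].count; i++)
        out[i] = (word >> (i * s9_sel[s].bits)) & mask;
    return s9_sel[s].count;
}
```

A real posting-list encoder would feed this the deltas between successive TIDs; the patch's variant tunes the selector table to that distribution, but the mechanics are the same.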

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:39 PM, Andres Freund wrote:

On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:

Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes
with that snapshot. Now, there's a race condition, if another transaction is
committing at the same time, and performs the same check concurrently. That
race condition can be eliminated by holding an exclusive lock while running
the assertion, until commit, effectively allowing the assertion to be
checked by only one transaction at a time.

I think that would work, and would be simple, although it wouldn't scale too
well.


It probably would also be very prone to deadlocks.


Hmm, since this would happen at commit, you would know all the 
assertions that need to be checked at that point. You could check them 
e.g in Oid order to avoid deadlocks.
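The Oid-order idea is the classic deadlock-avoidance rule of acquiring locks in one canonical global order. A toy sketch of what commit-time processing could do (the bare Oid typedef and plan_assertion_lock_order are illustrative simplifications, not PostgreSQL's actual locking API):

```c
#include <stdint.h>
#include <stdlib.h>

typedef uint32_t Oid;           /* simplified stand-in for PostgreSQL's Oid */

static int
oid_cmp(const void *a, const void *b)
{
    Oid oa = *(const Oid *) a;
    Oid ob = *(const Oid *) b;
    return (oa > ob) - (oa < ob);
}

/* Sort the assertions a committing transaction must check, so that every
 * backend acquires the per-assertion locks in the same order.  If all
 * lockers follow one global order, a cycle in the waits-for graph -- and
 * hence a deadlock among assertion checkers -- cannot form. */
void
plan_assertion_lock_order(Oid *assertions, int n)
{
    qsort(assertions, n, sizeof(Oid), oid_cmp);
    /* the caller then locks and checks assertions[0..n-1] in this order */
}
```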


- Heikki




Re: [HACKERS] hstore ng index for @> ?

2013-12-18 Thread Alexander Korotkov
Hi!

On Wed, Dec 18, 2013 at 2:35 PM, Kaare Rasmussen ka...@jasonic.dk wrote:

 In many ways the new hstore (and perhaps json) format looks very exciting.
  But will there ever be (GIN/GIST) index support for @> ?


It doesn't look hard to do with GiST. For GIN I don't have promising ideas:
likely we can't use GIN effectively for this kind of query.
I believe it's not implemented because it wasn't requested yet. Could you
share your use case?

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote:
 On 12/18/2013 01:39 PM, Andres Freund wrote:
 On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
 Here's another idea that doesn't involve SSI:
 
 At COMMIT, take a new snapshot and check that the assertion still passes
 with that snapshot.
 I think that would work, and would be simple, although it wouldn't scale too
 well.
 
 It probably would also be very prone to deadlocks.
 
 Hmm, since this would happen at commit, you would know all the assertions
 that need to be checked at that point. You could check them e.g in Oid order
 to avoid deadlocks.

I think real problem are the lock upgrades, because eventual DML will
have acquired less heavy locks.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 6:39 PM, Martijn van Oosterhout
klep...@svana.org wrote:

 On Sat, Dec 14, 2013 at 06:21:18PM +0400, Alexander Korotkov wrote:
   Is that actually all that beneficial when sorting with a bog standard
   qsort() since that doesn't generally benefit from data being
 pre-sorted?
   I think we might need to switch to a different algorithm to really
   benefit from mostly pre-sorted input.
  
 
  In this patch I don't do full sort of dataset. For instance, index
 returns
  data ordered by first column and we need to order them also by second
  column. Then this node sorts groups (assumed to be small) where values of
  the first column are same by value of second column. And with limit
 clause
  only required number of such groups will be processed. But, I don't think
  we should expect pre-sorted values of second column inside a group.

 Nice. I imagine this would be mostly beneficial for fast-start plans,
 since you no longer need to sort the whole table prior to returning the
 first tuple.

 Reduced memory usage might be a factor, especially for large sorts
 where you otherwise might need to spool to disk.

 You can now use an index on (a) to improve sorting for (a,b).

 Cost of sorting n groups of size l goes from O(nl log nl) to just O(nl
 log l), useful for large n.
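Writing that last estimate out explicitly (a back-of-the-envelope comparison-sort cost model, with N = nl total tuples):

```latex
% full sort of N = n l tuples vs. partial sort of n groups of size l
T_{\mathrm{full}}    = O\bigl(nl\,\log(nl)\bigr)
                     = O\bigl(nl\,\log n + nl\,\log l\bigr),\qquad
T_{\mathrm{partial}} = n \cdot O\bigl(l\,\log l\bigr)
                     = O\bigl(nl\,\log l\bigr)
```

so the partial sort drops the O(nl log n) term, which dominates when there are many small groups; and with LIMIT k only about k/l groups need to be sorted at all.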


Agree. Your reasoning looks correct.


 Minor comments:

 I find cmpTuple a bad name. That's what it's doing but perhaps
 cmpSkipColumns would be clearer.

 I think it's worthwhile adding a seperate path for the skipCols = 0
 case, to avoid extra copies.


Thanks. I'll take care about.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 7:04 PM, Andres Freund and...@2ndquadrant.com wrote:

 Hi,

 Limit  (cost=69214.06..69214.08 rows=10 width=16) (actual time=0.097..0.099 rows=10 loops=1)
   ->  Sort  (cost=69214.06..71714.06 rows=100 width=16) (actual time=0.096..0.097 rows=10 loops=1)
         Sort Key: v1, v2
         Sort Method: top-N heapsort  Memory: 25kB
         ->  Index Scan using test_v1_idx on test  (cost=0.42..47604.42 rows=100 width=16) (actual time=0.017..0.066 rows=56 loops=1)
 Total runtime: 0.125 ms
(6 rows)
  
   Is that actually all that beneficial when sorting with a bog standard
   qsort() since that doesn't generally benefit from data being
 pre-sorted?
   I think we might need to switch to a different algorithm to really
   benefit from mostly pre-sorted input.
  
 
  In this patch I don't do full sort of dataset. For instance, index
 returns
  data ordered by first column and we need to order them also by second
  column.

 Ah, that makes sense.

  But, I don't think we should expect pre-sorted values of second column
  inside a group.

 Yes, if you do it that way, there doesn't seem to be any need to assume
 that any more than we usually do.

 I think you should make the explain output reflect the fact that we're
 assuming v1 is presorted and just sorting v2. I'd be happy enough with:
 Sort Key: v1, v2
 Partial Sort: v2
 or even just
 Partial Sort Key: [v1,] v2
 but I am sure others disagree.


Sure, I just didn't change the explain output yet. It should look like what
you propose.

   *partial-knn-1.patch*

   Rechecking from the heap means adding a sort node though, which I don't
   see here? Or am I misunderstanding something?

  KNN-GiST contains an RB-tree of scanned items. In this patch an item is
  rechecked inside GiST and reinserted into the same RB-tree. It appears to
  be a much easier implementation for a PoC and also looks very efficient.
  I'm not sure what the right design for it actually is. This is what I'd
  like to discuss.

 I don't have enough clue about gist to say whether it's the right design,
 but it doesn't look wrong to my eyes. It'd probably be useful to export
 the knowledge that we are rechecking, and how often that happens, to the
 outside.
 While I didn't really look into the patch, I noticed in passing that you
 pass an all_dead variable to heap_hot_search_buffer without using the
 result - just pass NULL instead, that performs a bit less work.


Useful notice, thanks.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:50 PM, Andres Freund wrote:

On 2013-12-18 13:46:59 +0200, Heikki Linnakangas wrote:

On 12/18/2013 01:39 PM, Andres Freund wrote:

On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:

Here's another idea that doesn't involve SSI:

At COMMIT, take a new snapshot and check that the assertion still passes
with that snapshot.
I think that would work, and would be simple, although it wouldn't scale too
well.


It probably would also be very prone to deadlocks.


Hmm, since this would happen at commit, you would know all the assertions
that need to be checked at that point. You could check them e.g in Oid order
to avoid deadlocks.


I think real problem are the lock upgrades, because eventual DML will
have acquired less heavy locks.


Ah, I see. You don't need to block anyone else from modifying the table, 
you just need to block anyone else from committing a transaction that 
had modified the table. So the locks shouldn't interfere with regular 
table locks. A ShareUpdateExclusiveLock on the assertion should do it.


- Heikki




Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Alexander Korotkov
On Sat, Dec 14, 2013 at 11:47 PM, Andreas Karlsson andr...@proxel.se wrote:

 On 12/14/2013 10:59 AM, Alexander Korotkov wrote:

  This patch allows using an index for ORDER BY if the ORDER BY clause and
  the index have a non-empty common prefix. So the index gives the right
  ordering for the first n ORDER BY columns. In order to provide the right
  order for the remaining m columns, a sort node is inserted. This sort node
  sorts groups of tuples where the values of the first n ORDER BY columns
  are equal.
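The scheme the quoted paragraph describes can be sketched in a few lines (a toy in-memory model, not the executor code; Pair and partial_sort are illustrative names):

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int a; int b; } Pair;

static int
cmp_b(const void *x, const void *y)
{
    const Pair *p = x, *q = y;
    return (p->b > q->b) - (p->b < q->b);
}

/* Input is already ordered by 'a' (e.g. delivered by an index scan).
 * Sort each run of equal 'a' values by 'b', so the whole array ends up
 * ordered by (a, b) without ever sorting more than one group at a time. */
void
partial_sort(Pair *tuples, int n)
{
    int lo = 0;
    while (lo < n)
    {
        int hi = lo + 1;
        while (hi < n && tuples[hi].a == tuples[lo].a)
            hi++;               /* find the end of the current group */
        qsort(tuples + lo, hi - lo, sizeof(Pair), cmp_b);
        lo = hi;
    }
}
```

With a LIMIT on top, the executor can stop after emitting enough groups instead of touching the rest of the input, which is where the fast-start benefit comes from.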


 I recently looked at the same problem. I see that you solved the
 rescanning problem by simply forcing the sort to be redone on
 ExecReScanSort if you have done a partial sort.


Naturally, I'm not sure I solved it at all :) I just got a version of the
patch working for very limited use cases.


 My idea for a solution was to modify tuplesort to allow storing the
 already sorted keys in either memtuples or the sort result file, but
  setting a field so it does not sort the already sorted tuples again. This
 would allow the rescan to work as it used to, but I am unsure how clean or
 ugly this code would be. Was this something you considered?


I'm not sure. I believe the best answer depends on particular parameters:
how much memory we have for the sort, how expensive the underlying node is
and how it performs rescan, and how big the groups in the partial sort are.

--
With best regards,
Alexander Korotkov.


Re: [HACKERS] commit fest 2013-11 final report

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 7:14 PM, Peter Eisentraut pete...@gmx.net wrote:
 On 12/17/13, 10:19 AM, Tom Lane wrote:
 Perhaps we should just move all the Needs Review and RFC patches forward
 to the next fest, so we don't forget about them?

 This was done the last few times, but it has caused some controversy.
 One problem was that a number of patches arrived in this commit fest
 without either the author or the reviewers knowing about it, which
 caused the already somewhat stale patch to become completely abandoned.

 I think what I'll do is send an email to each of the affected patch
 threads describing the situation.  But I'd like someone involved in the
 patch, either author or reviewer, to make the final call about moving
 the patch forward.

+1.  And thanks for your work on this CommitFest.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 2:05 PM, Tatsuo Ishii is...@postgresql.org wrote:
 Hi,

 I have looked into this because it's marked as ready for committer.
   Thank you.
 It looks like it's working as advertised. Great! However, I have noticed a
 few minor issues.

 1) validate_conf_option

 +/*
 + * Validates configuration parameter and value, by calling check hook functions
 + * depending on record's vartype. It validates if the parameter
 + * value given is in range of expected predefined value for that parameter.
 + *
 + * freemem - true indicates memory for newval and newextra will be
 + *           freed in this function, false indicates it will be freed
 + *           by caller.
 + * Return value:
 + * 1: the value is valid
 + * 0: the name or value is invalid
 + */
 +int
 +validate_conf_option(struct config_generic * record, const char *name,
 +                     const char *value, GucSource source, int elevel,
 +                     bool freemem, void *newval, void **newextra)

 Is there any reason for the function to return int, as it always returns
 0 or 1? Maybe returning bool would be better?

No, the return type should be bool; I have changed it in the attached patch.

 2) initdb.c

 +   strcpy(tempautobuf, "# Do not edit this file manually! \n");
 +   autoconflines[0] = pg_strdup(tempautobuf);
 +   strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM command. \n");
 +   autoconflines[1] = pg_strdup(tempautobuf);

 Is there any reason to use tempautobuf here? I think we can simply change to this:

 +   autoconflines[0] = pg_strdup("# Do not edit this file manually! \n");
 +   autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command. \n");

You are right, I have changed code as per your suggestion.

 3) initdb.c

 It seems the memory allocated for autoconflines[0] and
 autoconflines[1] by pg_strdup is never freed.

I think it gets freed in writefile(), in the code below:

for (line = lines; *line != NULL; line++)
{
    if (fputs(*line, out_file) < 0)
    {
        fprintf(stderr, _("%s: could not write file \"%s\": %s\n"),
                progname, path, strerror(errno));
        exit_nicely();
    }
    free(*line);
}


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


alter_system_v13.patch
Description: Binary data



[HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Sergey Muraviov
Hi

I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20
and met with a label regression test failure.

PS
I've got some warnings during the build process.

-- 
Best regards,
Sergey Muraviov


regression.out
Description: Binary data


regression.diffs
Description: Binary data


warnings
Description: Binary data



Re: [HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Kohei KaiGai
Could you show me semodule -l on your environment?
I believe security policy has not been changed between F19 and F20...

Thanks,

2013/12/18 Sergey Muraviov sergey.k.murav...@gmail.com:
 Hi

 I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20 and
 met with a label regression test failure.

 PS
 I've got some warning during build process.

 --
 Best regards,
 Sergey Muraviov






-- 
KaiGai Kohei kai...@kaigai.gr.jp




Re: [HACKERS] sepgsql: label regression test failed

2013-12-18 Thread Sergey Muraviov
# semodule -l | grep sepgsql
sepgsql-regtest 1.07

Full list of modules is in attachment.


2013/12/18 Kohei KaiGai kai...@kaigai.gr.jp

 Could you show me semodule -l on your environment?
 I believe security policy has not been changed between F19 and F20...

 Thanks,

 2013/12/18 Sergey Muraviov sergey.k.murav...@gmail.com:
  Hi
 
  I've tried to test postgres 9.3.2 and 9.4devel with selinux on Fedora 20
 and
  met with a label regression test failure.
 
  PS
  I've got some warning during build process.
 
  --
  Best regards,
  Sergey Muraviov
 
 
 



 --
 KaiGai Kohei kai...@kaigai.gr.jp




-- 
Best regards,
Sergey Muraviov


modules
Description: Binary data



Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Stephen Frost
* Craig Ringer (cr...@2ndquadrant.com) wrote:
 On 12/12/2013 02:51 AM, Tom Lane wrote:
  The thing that I'm wondering is why the database would be the right place
  to be measuring it at all.  If you've got a network usage problem,
  aggregate usage across everything on the server is probably what you
  need to be worried about, and PG can't tell you that.
 
 I suspect this feature would be useful for when you want to try to drill
 down and figure out what's having network issues - specifically, to
 associate network behaviour with individual queries, individual users,
 application_name, etc.
 
 One sometimes faces the same issue with I/O: I know PostgreSQL is doing
 lots of I/O, but what exactly is causing the I/O? Especially if you
 can't catch it at the time it happens, it can be quite tricky to go from
  "there's lots of I/O" to "this query changed from using synchronized
  seqscans to doing an index-only scan that's hammering the cache".

Agreed.  My other thought on this is that there's a lot to be said for
having everything you need available through one tool- kinda like how
Emacs users rarely go outside of it.. :)  And then there's also the
consideration that DBAs may not have access to the host system at all,
or not to the level needed to do similar analysis there.

Thanks,

Stephen




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 03:35 AM, Tatsuo Ishii wrote:

3) initdb.c

It seems the memory allocated for autoconflines[0] and
autoconflines[1] by pg_strdup is never freed.

(I think there's similar problem with conflines as well, though it
was not introduced by the patch).




Why would we care? initdb doesn't run for very long, and the memory will 
be freed when it exits, usually within a few seconds. My recollection 
from back when I originally rewrote initdb in C was that cleaning up the 
memory leaks wasn't worth the trouble.


cheers

andrew




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 06:00 AM, Heikki Linnakangas wrote:



If you don't force everything to run in SSI mode, then you have to 
somehow figure out what parts do need to run in SSI mode to enforce 
the assertion. For example, if the assertion refers to tables A and B,
perhaps you can get away without predicate locks on table C?






But the assertion might simply run a function. For non-trivial cases 
that's what I would expect people to do.


cheers

andrew





Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Stephen Frost
* Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:
 Here's my attempt:
 
 # Inline Extension, Extension Templates
 
 The problem with *Inline Extension* is the dump and restore policy. The
 contents of an extension are not to be found in a `pg_dump` script, ever.

You keep coming back to this, and I think you're taking too narrow a view
of the comments made on the prior threads.  No, we don't really want
extensions which have .sql files out on disk somewhere as part of
them to be dumped out through pg_dump because then it becomes unclear
which set of scripts should be used during restore.  What we're talking
about here is intended to not have that issue by using a different
namespace, a flag, something which identifies these extensions as being
defined through the catalog instead.

 # The new thing™
 
 A set of SQL objects that can be managed wholesale, with a version string
 attached to it. The objects are part of `pg_dump` output, and the whole
 set can be relocatable.

I'd like to see more than just a single version string included and I
think that'd be beneficial for extensions too.

 Name:
[...]
 I'll pick UNIT here.

We can figure that later.

 Commands:
 
 CREATE UNIT name [ SCHEMA ... ] [ [ NOT ] RELOCATABLE ] [ REQUIRE ...];
 
 WITH UNIT name;
   commands
 END UNIT name;

Interesting approach- I had considered something similar by having a
'fake' schema created into which you built up the 'UNIT'.  The reason I
was thinking schema instead of begin/end style commands, as you have
above, is because of questions around transactions.  Do you think the
syntax you have here would require the definition to be all inside of a
single transaction?  Do we feel that would even be an issue or perhaps
that it *should* be done that way?  I don't currently have any strong
feelings one way or the other on this and I'm curious what others
think.

 The `UPDATE TO` command only sets a new version string.

So, one of the things I had been wondering about is if we could provide
a 'diff' tool.  Using your 'WITH UNIT' syntax above, an author might
need to only write their initial implementation, build up a 'UNIT' based
on it, then adjust that implementation with another 'WITH UNIT' clause
and then ask PG for the differences.  It's not clear if we could make
that work, but there is definitely a set of desirable capabilities out
there, which some other databases have, around automated upgrade script
building and doing schema differences.

 # Implementation details
 # Event Trigger support

Not sure we're really ready to get into these yet.

 The main drawback is that rather than building on extensions, both in a
 technical way and in building user trust, we are basically going to
 deprecate extensions entirely, giving them a new name and an incompatible way
 to manage them.

I don't see this as ending up deprecating extensions, even if we build a
new thing with a new name.  I would argue that properly supported
extensions, such as those in contrib and the other 'main' ones, like
PostGIS, and others that have any external dependencies (eg: FDWs) would
almost certainly continue as extensions and would be packaged through
the normal OS packaging systems.  While you plan to use the event
trigger mechanism to build something on top of this which tries to act
like extensions-but-not, I think that's an extremely narrow and limited
use-case that very few people would have any interest in or use.

 Basically with building `UNIT` we realise with hindsight that we failed to
 build a proper `EXTENSION` system, and we send that message to our users.

It's a little difficult to draw conclusions about what our 'hindsight' will
look like.

Thanks,

Stephen




Re: [HACKERS] GIN improvements part 1: additional information

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 01:45 PM, Alexander Korotkov wrote:

On Tue, Dec 17, 2013 at 2:49 AM, Heikki Linnakangas hlinnakan...@vmware.com

wrote:



On 12/17/2013 12:22 AM, Alexander Korotkov wrote:
  2) Storage would be easily extendable to hold additional information as

well.
Better compression shouldn't block more serious improvements.



I'm not sure I agree with that. For all the cases where you don't care
about additional information - which covers all existing users for example
- reducing disk size is pretty important. How are you planning to store the
additional information, and how does using another encoding get in the way
of that?


I planned to store additional-information datums between the
varbyte-encoded TIDs. I expected it would be hard to do with PFOR.
However, I don't see significant problems in your implementation of Simple9
encoding. I'm going to dig deeper into your version of the patch.


Ok, thanks.

I had another idea about the page format this morning. Instead of having 
the item-indexes at the end of the page, it would be more flexible to 
store a bunch of self-contained posting list segments on the page. So 
I propose that we get rid of the item-indexes, and instead store a bunch 
of independent posting lists on the page:


typedef struct
{
ItemPointerData first;   /* first item in this segment (unpacked) */
uint16  nwords;  /* number of words that follow */
uint64  words[1];/* var length */
} PostingListSegment;

Each segment can be encoded and decoded independently. When searching 
for a particular item (like on insertion), you skip over segments where 
'first' > the item you're searching for.


This format offers a lot more flexibility compared to the separate item 
indexes. First, we don't need to have another fixed sized area on the 
page, which simplifies the page format. Second, we can more easily 
re-encode only one segment on the page, on insertion or vacuum. The 
latter is particularly important with the Simple-9 encoding, which 
operates one word at a time rather than one item at a time; removing or 
inserting an item in the middle can require a complete re-encoding of 
everything that follows. Third, when a page is being inserted into and 
contains only uncompressed items, you don't waste any space for unused 
item indexes.


While we're at it, I think we should use the above struct in the inline 
posting lists stored directly in entry tuples. That wastes a few bytes 
compared to the current approach in the patch (more alignment, and 
'words' is redundant with the number of items stored on the tuple 
header), but it simplifies the functions handling these lists.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Tatsuo Ishii
 Is there any reason for the function to return int, as it always returns
 0 or 1? Maybe returning bool is better?
 
 No, return type should be bool, I have changed the same in attached patch.

Confirmed.

 2) initdb.c

 +   strcpy(tempautobuf, "# Do not edit this file manually!\n");
 +   autoconflines[0] = pg_strdup(tempautobuf);
 +   strcpy(tempautobuf, "# It will be overwritten by the ALTER SYSTEM command.\n");
 +   autoconflines[1] = pg_strdup(tempautobuf);

 Is there any reason to use tempautobuf here? I think we can simply change 
 to this:

 +   autoconflines[0] = pg_strdup("# Do not edit this file manually!\n");
 +   autoconflines[1] = pg_strdup("# It will be overwritten by the ALTER SYSTEM command.\n");
 
 You are right, I have changed code as per your suggestion.

Confirmed.

 3) initdb.c

 It seems the memory allocated for autoconflines[0] and
 autoconflines[1] by pg_strdup is never freed.
 
 I think it gets freed in writefile(), in the code below.

Oh, I see. Sorry for noise.

I have committed your patches. Thanks.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 12:35 PM, Jeff Janes jeff.ja...@gmail.com wrote:
 Since it doesn't use directIO, you can't warm the PG buffers without also
 warming FS cache as a side effect.  That is why I like 'buffer' as the
 default--if the data fits in shared_buffers, it warms those, otherwise it at
 least warms the FS.  If you want to only warm the FS cache, you can use
 either the 'prefetch' or 'read' modes instead.

All right, here is an updated patch.  I swapped the second and third
arguments, because I think overriding the prewarm mode will be a lot
more common than overriding the relation fork.  I also added defaults,
so you can do this:

SELECT pg_prewarm('pgbench_accounts');

Or this:

SELECT pg_prewarm('pgbench_accounts', 'read');

I also fixed some oversights in the error checks.

I'm not inclined to wait for the next CommitFest to commit this,
because it's a very simple patch and has already had a lot more field
testing than most patches get before they're committed.  And it's just
a contrib module, so the damage it can do if there is in fact a bug is
pretty limited.  All that having been said, any review is appreciated.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
diff --git a/contrib/Makefile b/contrib/Makefile
index 8a2a937..dd2683b 100644
--- a/contrib/Makefile
+++ b/contrib/Makefile
@@ -32,6 +32,7 @@ SUBDIRS = \
 		pg_archivecleanup \
 		pg_buffercache	\
 		pg_freespacemap \
+		pg_prewarm	\
 		pg_standby	\
 		pg_stat_statements \
 		pg_test_fsync	\
diff --git a/contrib/pg_prewarm/Makefile b/contrib/pg_prewarm/Makefile
new file mode 100644
index 000..176a29a
--- /dev/null
+++ b/contrib/pg_prewarm/Makefile
@@ -0,0 +1,18 @@
+# contrib/pg_prewarm/Makefile
+
+MODULE_big = pg_prewarm
+OBJS = pg_prewarm.o
+
+EXTENSION = pg_prewarm
+DATA = pg_prewarm--1.0.sql
+
+ifdef USE_PGXS
+PG_CONFIG = pg_config
+PGXS := $(shell $(PG_CONFIG) --pgxs)
+include $(PGXS)
+else
+subdir = contrib/pg_prewarm
+top_builddir = ../..
+include $(top_builddir)/src/Makefile.global
+include $(top_srcdir)/contrib/contrib-global.mk
+endif
diff --git a/contrib/pg_prewarm/pg_prewarm--1.0.sql b/contrib/pg_prewarm/pg_prewarm--1.0.sql
new file mode 100644
index 000..2bec776
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm--1.0.sql
@@ -0,0 +1,14 @@
+/* contrib/pg_prewarm/pg_prewarm--1.0.sql */
+
+-- complain if script is sourced in psql, rather than via CREATE EXTENSION
+\echo Use "CREATE EXTENSION pg_prewarm" to load this file. \quit
+
+-- Register the function.
+CREATE FUNCTION pg_prewarm(regclass,
+		   mode text default 'buffer',
+		   fork text default 'main',
+		   first_block int8 default null,
+		   last_block int8 default null)
+RETURNS int8
+AS 'MODULE_PATHNAME', 'pg_prewarm'
+LANGUAGE C;
diff --git a/contrib/pg_prewarm/pg_prewarm.c b/contrib/pg_prewarm/pg_prewarm.c
new file mode 100644
index 000..10317f3
--- /dev/null
+++ b/contrib/pg_prewarm/pg_prewarm.c
@@ -0,0 +1,205 @@
+/*-
+ *
+ * pg_prewarm.c
+ *		  prewarming utilities
+ *
+ * Copyright (c) 2010-2012, PostgreSQL Global Development Group
+ *
+ * IDENTIFICATION
+ *		  contrib/pg_prewarm/pg_prewarm.c
+ *
+ *-
+ */
+#include "postgres.h"
+
+#include <sys/stat.h>
+#include <unistd.h>
+
+#include "access/heapam.h"
+#include "catalog/catalog.h"
+#include "fmgr.h"
+#include "miscadmin.h"
+#include "storage/bufmgr.h"
+#include "storage/smgr.h"
+#include "utils/acl.h"
+#include "utils/builtins.h"
+#include "utils/lsyscache.h"
+#include "utils/rel.h"
+
+PG_MODULE_MAGIC;
+
+extern Datum pg_prewarm(PG_FUNCTION_ARGS);
+
+PG_FUNCTION_INFO_V1(pg_prewarm);
+
+typedef enum
+{
+	PREWARM_PREFETCH,
+	PREWARM_READ,
+	PREWARM_BUFFER
+} PrewarmType;
+
+static char blockbuffer[BLCKSZ];
+
+/*
+ * pg_prewarm(regclass, mode text, fork text,
+ *			  first_block int8, last_block int8)
+ *
+ * The first argument is the relation to be prewarmed; the second controls
+ * how prewarming is done; legal options are 'prefetch', 'read', and 'buffer'.
+ * The third is the name of the relation fork to be prewarmed.  The fourth
+ * and fifth arguments specify the first and last block to be prewarmed.
+ * If the fourth argument is NULL, it will be taken as 0; if the fifth argument
+ * is NULL, it will be taken as the number of blocks in the relation.  The
+ * return value is the number of blocks successfully prewarmed.
+ */
+Datum
+pg_prewarm(PG_FUNCTION_ARGS)
+{
+	Oid		relOid;
+	text   *forkName;
+	text   *type;
+	int64	first_block;
+	int64	last_block;
+	int64	nblocks;
+	int64	blocks_done = 0;
+	int64		block;
+	Relation	rel;
+	ForkNumber	forkNumber;
+	char   *forkString;
+	char   *ttype;
+	PrewarmType	ptype;
+	AclResult	aclresult;
+
+	/* Basic sanity checking. */
+	if (PG_ARGISNULL(0))
+		ereport(ERROR,
+				(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+				 errmsg("relation cannot be null")));
+	relOid = PG_GETARG_OID(0);
+	

Re: SQL objects UNITs (was: [HACKERS] Extension Templates S03E11)

2013-12-18 Thread Alvaro Herrera
Stephen Frost wrote:
 * Dimitri Fontaine (dimi...@2ndquadrant.fr) wrote:

  Basically with building `UNIT` we realise with hindsight that we failed to
  build a proper `EXTENSION` system, and we send that message to our users.
 
  Little difficult to draw conclusions about what our 'hindsight' will
 look like.

I haven't been keeping very close attention to this, but I fail to see
why extensions are so much of a failure.  Surely we can invent a new
kind of extensions, ones whose contents specifically are dumped by
pg_dump.  Regular extensions, the kind we have today, still wouldn't,
but we could have a flag, say CREATE EXTENSION ... (WITH DUMP) or
something.  That way you don't have to come up with UNIT at all (or
whatever).  A whole new set of catalogs just to fix up a minor issue
with extensions sounds a bit too much to me; we can just add this new
thing on top of the existing infrastructure.

I didn't much like the WITH UNIT/END UNIT thingy.  What's wrong with
CREATE foo; ALTER EXTENSION ADD foo?  There's a bit of a problem that if 
you create the object and die before being able to add it to the
extension, it would linger unreferenced; but that's easily fixable by
doing the creation in a transaction, I think.  (Alternatively, we could
have a single command that creates the extension and the contained
objects in one fell swoop, similar to how CREATE SCHEMA can do it; but
I'm not sure that's all that much better, and from a grammar POV it
probably sucks.)
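Alvaro's CREATE-then-ADD pattern, wrapped in a transaction so a failure leaves no unreferenced object behind, might look like this sketch (the extension name `myext` and the function are placeholders for illustration, not anything from this thread):

```sql
BEGIN;
CREATE FUNCTION myext_add_one(int) RETURNS int
    LANGUAGE sql AS $$ SELECT $1 + 1 $$;
ALTER EXTENSION myext ADD FUNCTION myext_add_one(int);
COMMIT;  -- an error anywhere above rolls back both the object and its membership
```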

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 9:03 PM, KONDO Mitsumasa
kondo.mitsum...@lab.ntt.co.jp wrote:
 (2013/12/18 5:33), Robert Haas wrote:
 Sounds like it might be worth dusting the patch off again...

 I'd like to request you to add an all_index option and a usage_count option.
 When the all_index option is selected, all indexes are rewarmed even if the user
 doesn't input a relation name. And the usage_count option sets usage_count in
 shared_buffers. Useful buffers will remain longer and not be thrown out easily.
 I think these are easy to implement and useful. So please add them if you like.

Prewarming indexes is useful, but I don't think we need to complicate
the API for that.  With the version I just posted, you can simply do
something like this:

SELECT pg_prewarm(indexrelid) FROM pg_index WHERE indrelid =
'pgbench_accounts'::regclass;

I seriously doubt whether being able to set the usage count is actually useful.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 1:27 PM, Simon Riggs si...@2ndquadrant.com wrote:
 Not sure I'd say required, but it's certainly desirable to have
 updateable security barrier views in themselves. And it comes across
 to me as a cleaner and potentially more performant way of doing the
 security checks for RLS.

Yes, that's how I'm thinking of it.  It's required in the sense that
if we don't do it as a separate patch, we'll need to fold many of
changes into the RLS patch, which IMV is not desirable.  We'd end up
with more complexity and less functionality with no real upside that I
can see.

But I think we are basically saying the same thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 3:30 AM, Craig Ringer cr...@2ndquadrant.com wrote:
 In my view the proposed patch doesn't offer a significant improvement in
 declarative security, beyond what we can get by just adding update
 support to s.b. views and using search_path to control whether a user
 sees the view or the base table.

 It's a lot like Oracle Virtual Private Database (VPD): A set of low
 level building blocks you can build your own flexible row security model
 with. One where you have to very carefully write security-sensitive
 predicates and define all your security model tables, etc yourself.

 That's why I'm now of the opinion that we should make it possible to
 achieve the same thing with s.b. views and the search_path (by adding
 update support)... then build a declarative row-security system that
 doesn't require the user fiddling with delicate queries and sensitive
 scripts on top of that.

To be clear, I wasn't advocating for a declarative approach; I think
predicates are fine.  There are usability issues to worry about either
way, and my concern is that we address those.  A declarative approach
would certainly be valuable in that, for those people for whom it is
sufficiently flexible, it's probably quite a lot easier than writing
predicates.  But I fear that some people will want a lot more
generality than a label-based system can accommodate.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 10:22 PM, Amit Kapila amit.kapil...@gmail.com wrote:
 Me either; at the very least, it's short an underscore: wal_log_hint_bits
 would be more readable.  But how about just wal_log_hints?

 +1 for wal_log_hints, it sounds better.

+1.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] 9.3 reference constraint regression

2013-12-18 Thread Alvaro Herrera
Andres Freund wrote:

 I have to say, it makes me really uncomfortable to take such
 shortcuts. In other branches we are doing liveness checks with
 MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
 likely is it that this won't cause breakage somewhere down the line because
 somebody doesn't know of that subtlety?

I came up with the attached last night, which should do the right thing
in both cases.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
*** a/src/backend/access/transam/multixact.c
--- b/src/backend/access/transam/multixact.c
***
*** 1275,1280  retry:
--- 1275,1313 
  }
  
  /*
+  * MultiXactHasRunningRemoteMembers
+  * 		Does the given multixact have still-live members from
+  * 		transactions other than our own?
+  */
+ bool
+ MultiXactHasRunningRemoteMembers(MultiXactId multi)
+ {
+ 	MultiXactMember *members;
+ 	int			nmembers;
+ 	int			i;
+ 
+ 	nmembers = GetMultiXactIdMembers(multi, &members, true);
+ 	if (nmembers <= 0)
+ 		return false;
+ 
+ 	for (i = 0; i < nmembers; i++)
+ 	{
+ 		/* not interested in our own members */
+ 		if (TransactionIdIsCurrentTransactionId(members[i].xid))
+ 			continue;
+ 
+ 		if (TransactionIdIsInProgress(members[i].xid))
+ 		{
+ 			pfree(members);
+ 			return true;
+ 		}
+ 	}
+ 
+ 	pfree(members);
+ 	return false;
+ }
+ 
+ /*
   * mxactMemberComparator
   *		qsort comparison function for MultiXactMember
   *
*** a/src/backend/utils/time/tqual.c
--- b/src/backend/utils/time/tqual.c
***
*** 686,693  HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
  			if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  return HeapTupleMayBeUpdated;
  
! 			if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))	/* not deleter */
! 				return HeapTupleMayBeUpdated;
  
  			if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  			{
--- 686,721 
  			if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
  return HeapTupleMayBeUpdated;
  
! 			if (HEAP_XMAX_IS_LOCKED_ONLY(tuple->t_infomask))
! 			{
! 				TransactionId	xmax;
! 
! 				xmax = HeapTupleHeaderGetRawXmax(tuple);
! 
! 				/*
! 				 * Careful here: even though this tuple was created by our own
! 				 * transaction, it might be locked by other transactions, if
! 				 * the original version was key-share locked when we updated
! 				 * it.
! 				 */
! 
! 				if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
! 				{
! 					if (MultiXactHasRunningRemoteMembers(xmax))
! 						return HeapTupleBeingUpdated;
! 					else
! 						return HeapTupleMayBeUpdated;
! 				}
! 
! 				/* if locker is gone, all's well */
! 				if (!TransactionIdIsInProgress(xmax))
! 					return HeapTupleMayBeUpdated;
! 
! 				if (!TransactionIdIsCurrentTransactionId(xmax))
! 					return HeapTupleBeingUpdated;
! 				else
! 					return HeapTupleMayBeUpdated;
! 			}
  
  			if (tuple->t_infomask & HEAP_XMAX_IS_MULTI)
  			{
***
*** 700,706  HeapTupleSatisfiesUpdate(HeapTupleHeader tuple, CommandId curcid,
--- 728,738 
  
  /* updating subtransaction must have aborted */
  if (!TransactionIdIsCurrentTransactionId(xmax))
+ 				{
+ 					if (MultiXactHasRunningRemoteMembers(xmax))
+ 						return HeapTupleBeingUpdated;
  					return HeapTupleMayBeUpdated;
+ 				}
  				else
  				{
  					if (HeapTupleHeaderGetCmax(tuple) >= curcid)
*** a/src/include/access/multixact.h
--- b/src/include/access/multixact.h
***
*** 89,94  extern bool MultiXactIdIsRunning(MultiXactId multi);
--- 89,95 
  extern void MultiXactIdSetOldestMember(void);
  extern int GetMultiXactIdMembers(MultiXactId multi, MultiXactMember **xids,
  	  bool allow_old);
+ extern bool MultiXactHasRunningRemoteMembers(MultiXactId multi);
  extern bool MultiXactIdPrecedes(MultiXactId multi1, MultiXactId multi2);
  extern bool MultiXactIdPrecedesOrEquals(MultiXactId multi1,
  			MultiXactId multi2);



Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Heikki Linnakangas hlinnakan...@vmware.com wrote:
On 12/18/2013 01:39 PM, Andres Freund wrote:
 On 2013-12-18 13:07:51 +0200, Heikki Linnakangas wrote:
 Here's another idea that doesn't involve SSI:

 At COMMIT, take a new snapshot and check that the assertion still passes
 with that snapshot. Now, there's a race condition, if another transaction is
 committing at the same time, and performs the same check concurrently. That
 race condition can be eliminated by holding an exclusive lock while running
 the assertion, until commit, effectively allowing the assertion to be
 checked by only one transaction at a time.

 I think that would work, and would be simple, although it wouldn't scale too
 well.

 It probably would also be very prone to deadlocks.

 Hmm, since this would happen at commit, you would know all the
 assertions that need to be checked at that point. You could check them
 e.g in Oid order to avoid deadlocks.

So you would actually serialize all COMMITs, or at least all which
had done anything which could affect the truth of an assertion? 
That seems like it would work, but I suspect that it would be worse
for performance than SSI in many workloads, and better than SSI in
other workloads.  Maybe a GUC to determine which strategy is used?

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Sawada Masahiko
2013/12/14 0:14 Tom Lane t...@sss.pgh.pa.us:

 Heikki Linnakangas hlinnakan...@vmware.com writes:
  I'm not totally satisfied with the name of the GUC, wal_log_hintbits.

 Me either; at the very least, it's short an underscore: wal_log_hint_bits
 would be more readable.  But how about just wal_log_hints?


+1 too.
It's readable.

Masahiko Sawada


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:

 Serializable or not, *what* do we lock for assertions?  It's not
 rows.  Tables?  Which tables?  What if the assertion is an
 interpreted language function?  Does the SSI reference counter
 really take care of all of this?

The simple answer is that, without adding any additional blocking,
SSI guarantees that the behavior of running any set of concurrent
serializable transactions is consistent with some serial
(one-at-a-time) execution of those transactions.  If the assertion
is run as part of the transaction, it is automatically handled,
regardless of the issues you are asking about.

The longer answer is in the README and the papers it references:

http://git.postgresql.org/gitweb/?p=postgresql.git;a=blob;f=src/backend/storage/lmgr/README-SSI;hb=master

For practical examples of how it works, this Wiki page includes
examples of maintaining a multi-table invariant using triggers:

http://wiki.postgresql.org/wiki/SSI

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Alvaro Herrera
Heikki Linnakangas wrote:

 Ah, I see. You don't need to block anyone else from modifying the
 table, you just need to block anyone else from committing a
 transaction that had modified the table. So the locks shouldn't
 interfere with regular table locks. A ShareUpdateExclusiveLock on
 the assertion should do it.

Causing serialization of transaction commit just because a single
assertion exists in the database seems too much of a hit, so looking for
optimization opportunities seems appropriate.  Here are some ideas for
brainstorming.

It might prove useful to note that any given assertion involves tables
A, B, C.  If a transaction doesn't modify any of those tables then
there's no need to run the assertion when the transaction commits,
skipping acquisition of the assertion lock.

Probably this can only be implemented for SQL-language assertions, but
even so it might be worth considering.  (Assertions in any other
language would be checked by every transaction.)

Another thought: at the initial run of the assertion, note which tables
it locked, and record this as an OID array in the catalog row for the
assertion; consider running the assertion only when those tables are
touched.  This doesn't work if the assertion code locks some tables when
run under certain conditions and other tables under different
conditions.  But then this can be checked too: if an assertion lists in
its catalog row that it involves tables A, B, C and then, under
different conditions, it tries to acquire lock on table D, have the
whole thing fail indicating that the assertion is misdeclared.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] RFC: Async query processing

2013-12-18 Thread Florian Weimer

On 11/04/2013 02:51 AM, Claudio Freire wrote:

On Sun, Nov 3, 2013 at 3:58 PM, Florian Weimer fwei...@redhat.com wrote:

I would like to add truly asynchronous query processing to libpq, enabling
command pipelining.  The idea is to to allow applications to auto-tune to
the bandwidth-delay product and reduce the number of context switches when
running against a local server.

...

If the application is not interested in intermediate query results, it would
use something like this:

...

If there is no need to exit from the loop early (say, because errors are
expected to be extremely rare), the PQgetResultNoWait call can be left out.


It doesn't seem wise to me making such a distinction. It sounds like
you're oversimplifying, and that's why you need modes, to overcome
the evidently restrictive limits of the simplified interface, and that
it would only be a matter of (a short) time before some other limitation
requires some other mode.


I need modes because I want to avoid unbound buffering, which means that 
result data has to be consumed in the order queries are issued.



   PGAsyncMode oldMode = PQsetsendAsyncMode(conn, PQASYNC_RESULT);
   bool more_data;
   do {
  more_data = ...;
  if (more_data) {
int ret = PQsendQueryParams(conn,
  "INSERT ... RETURNING ...", ...);
if (ret == 0) {
  // handle low-level error
}
  }
  // Consume all pending results.
  while (1) {
PGresult *res;
if (more_data) {
  res = PQgetResultNoWait(conn);
} else {
  res = PQgetResult(conn);
}


Somehow, that code looks backwards. I mean, really backwards. Wouldn't
that be !more_data?


No, if more data is available to transfer to the server, the no-wait 
variant has to be used to avoid a needless synchronization with the server.



In any case, pipelining like that, without a clear distinction, in the
wire protocol, of which results pertain to which query, could be a
recipe for trouble when subtle bugs, either in lib usage or
implementation, mistakenly treat one query's result as another's.


We already use pipelining in libpq (see pqFlush, PQsendQueryGuts and 
pqParseInput3), the server is supposed to support it, and there is a 
lack of a clear tit-for-tat response mechanism anyway because of 
NOTIFY/LISTEN and the way certain errors are reported.



Instead of buffering the results, we could buffer the encoded command
messages in PQASYNC_RESULT mode.  This means that PQsendQueryParams would
not block when it cannot send the (complete) command message, but store in
the connection object so that the subsequent PQgetResultNoWait and
PQgetResult would send it.  This might work better with single-tuple result
mode.  We cannot avoid buffering either multiple queries or multiple
responses if we want to utilize the link bandwidth, or we'd risk deadlocks.


This is a non-solution. Such an implementation, at least as described,
would remove neither network latency nor context switches; it
would be a purely API change with no externally visible behavior
change.


Ugh, why?


An effective solution must include multi-command packets. Without
knowing the wire protocol in detail, something like:

PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: INSERT blah
BIND: args
EXECUTE with DISCARD
PARSE: SELECT  blah
BIND: args
EXECUTE with FETCH ALL

All in one packet, would be efficient and error-free (IMO).


No, because this doesn't scale automatically with the bandwidth-delay 
product.  It also requires that the client buffers queries and their 
parameters even though the network has to do that anyway.


In any case, I don't want to change the wire protocol, I just want to 
enable libpq clients to use more of its capabilities.


--
Florian Weimer / Red Hat Product Security Team




Re: [HACKERS] 9.3 reference constraint regression

2013-12-18 Thread Alvaro Herrera
Alvaro Herrera wrote:
 Andres Freund wrote:
 
  I have to say, it makes me really uncomfortable to take such
  shortcuts. In other branches we are doing liveness checks with
  MultiXactIdIsRunning() et al. Why isn't that necessary here? And how
  likely is it that this won't cause breakage somewhere down the line because
  somebody doesn't know of that subtlety?
 
 I came up with the attached last night, which should do the right thing
 in both cases.

That one had a silly bug, which I fixed and pushed now.

Thanks for the feedback,

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Tuesday 17 December 2013 17:45:51, Robert Haas wrote:
 On Tue, Dec 17, 2013 at 11:02 AM, Jim Nasby j...@nasby.net wrote:
  On 12/17/13, 8:34 AM, Robert Haas wrote:
  On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila amit.kapil...@gmail.com
  
  wrote:
  I have used pg_prewarm during some of work related to Buffer Management
  and
  other performance related work. It is quite useful utility.
  +1 for reviving this patch for 9.4
  
  Any other votes?
  
  We've had to manually code something that runs EXPLAIN ANALYZE SELECT *
  from a bunch of tables to warm our caches after a restart, but there's
  numerous flaws to that approach obviously.
  
  Unfortunately, what we really need to warm isn't the PG buffers, it's the
  FS cache, which I suspect this won't help. But I still see where just
  pg_buffers would be useful for a lot of folks, so +1.
 
 It'll do either one.  For the FS cache, on Linux, you can also use
 pgfincore.

On Linux and *BSD (including OS X), like what's in PostgreSQL. Only Windows is 
out of scope so far, and there is a solution for Windows too; there is just no 
requirement from pgfincore users. 

Maybe you can add Windows support in PostgreSQL now?

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Tuesday 17 December 2013 21:14:44, Josh Berkus wrote:
 On 12/17/2013 06:34 AM, Robert Haas wrote:
  On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila amit.kapil...@gmail.com 
wrote:
  I have used pg_prewarm during some of work related to Buffer Management
  and other performance related work. It is quite useful utility.
  +1 for reviving this patch for 9.4
  
  Any other votes?
 
 I still support this patch (as I did originally), and don't think that
 the overlap with pgFincore is of any consequence.  pgFincore does more
 than pgrewarm ever will, but it's also platform-specific, so it still
 makes sense for both to exist.

Just for information, pgFincore is NOT limited to Linux (the most interesting 
part, the memory snapshot, also works on BSD-based kernels with the mincore() 
syscall).
Like the PostgreSQL effective_io_concurrency (and pg_prewarm), it just doesn't 
work when posix_fadvise is not available.

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] Dynamic Shared Memory stuff

2013-12-18 Thread Robert Haas
On Tue, Dec 10, 2013 at 6:26 PM, Tom Lane t...@sss.pgh.pa.us wrote:
 Noah Misch n...@leadboat.com writes:
 On Tue, Dec 10, 2013 at 07:50:20PM +0200, Heikki Linnakangas wrote:
 Let's not add more cases like that, if we can avoid it.

 Only if we can avoid it for a modicum of effort and feature compromise.
 You're asking for PostgreSQL to reshape its use of persistent resources so 
 you
 can throw around killall -9 postgres; rm -rf $PGDATA without so much as a
 memory leak.  That use case, not PostgreSQL, has the defect here.

 The larger point is that such a shutdown process has never in the history
 of Postgres been successful at removing shared-memory (or semaphore)
 resources.  I do not feel a need to put a higher recoverability standard
 onto the DSM code than what we've had for fifteen years with SysV shmem.

 But, by the same token, it shouldn't be any *less* recoverable.  In this
 context that means that starting a new postmaster should recover the
 resources, at least 90% of the time (100% if we still have the old
 postmaster lockfile).  I think the idea of keeping enough info in the
 SysV segment to permit recovery of DSM resources is a good one.  Then,
 any case where the existing logic is able to free the old SysV segment
 is guaranteed to also work for DSM.

So I'm taking a look at this.  There doesn't appear to be anything
intrinsically intractable here, but it seems important to time the
order of operations so as to minimize the chances that things fail in
awkward places.  The point where we remove the old shared memory
segment from the previous postmaster invocation is here:

/*
 * The segment appears to be from a dead Postgres process, or from a
 * previous cycle of life in this same process.  Zap it, if possible.
 * This probably shouldn't fail, but if it does, assume the segment
 * belongs to someone else after all, and continue quietly.
 */
shmdt(memAddress);
if (shmctl(shmid, IPC_RMID, NULL) < 0)
continue;

My first thought was to remember the control segment ID from the
header just before the shmdt() there, and then later when the DSM
module is starting, do cleanup.  But that doesn't seem like a good
idea, because then if startup fails after we remove the old shared
memory segment and before we get to the DSM initialization code, we'll
have lost the information on what control segment needs to be cleaned
up.  A subsequent startup attempt won't see the old shm again, because
it's already gone.  I'm fairly sure that this would be a net reduction
in reliability vs. the status quo.

So now what I'm thinking is that we ought to actually perform the DSM
cleanup before detaching the old segment and trying to remove it.
That shouldn't fail, but if it does, or if we get killed before
completing it, the next run will hopefully find the same old shm and
finish the cleanup.  But that kind of flies in the face of the comment
above: if we perform DSM cleanup and then discover that the segment
wasn't ours after all, that means we just stomped all over somebody
else's stuff.  That's not too good. But trying to remove the segment
first and then perform the cleanup creates a window where, if we get
killed, the next restart will have lost information about how to
finish cleaning up.  So it seems that some kind of logic tweak is
needed here, but I'm not sure exactly what.  As I see it, the options
are:

1. Make failure to remove the shared memory segment we thought was
ours an error.  This will definitely show up any problems, but only
after we've burned down some other processes's dynamic shared memory
segments.  The most likely outcome is creating failure-to-start
problems that don't exist today.

2. Make it a warning.  We'll still burn down somebody else's DSMs, but
at least we'll still start ourselves.  Sadly, "WARNING: You have just
done a bad thing.  It's too late to fix it.  Sorry!" is not very
appealing.

3. Remove the old shared memory segment first, then perform the
cleanup immediately afterwards.  If we get killed before completing
the cleanup, we'll leak the un-cleaned-up stuff.  Decide that's OK and
move on.

4. Adopt some sort of belt-and-suspenders approach, keeping the state
file we have now and backstopping it with this mechanism, so that we
really only need this to work when $PGDATA has been blown away and
recreated.  This seems pretty inelegant, and I'm not sure who it'll
benefit other than those (few?) people who kill -9 the postmaster (or
it segfaults or otherwise dies without running the code to remove
shared memory segments) and then remove $PGDATA and then re-initdb and
then expect cleanup to find the old DSMs.

5. Give up on this approach.  We could keep what we have now, or make
the DSM control segment land at a well-known address as we do for the
main segment.

What do people prefer?
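For what it's worth, the difference between option 3 and the cleanup-first ordering comes down to which state survives a kill between the two steps. A toy C model of just that bookkeeping (all names hypothetical; this is not postmaster code):

```c
#include <stdbool.h>

/* Toy model of the ordering dilemma: the old SysV segment header holds
 * the only pointer to the DSM control segment, so whichever step runs
 * first decides what a kill between the two steps can lose. */
typedef struct StartupState
{
    bool    sysv_segment_exists;    /* old main segment, holds the DSM id */
    bool    dsm_cleaned_up;         /* old DSM segments removed */
} StartupState;

/* Option 3: shmctl(IPC_RMID) the SysV segment first, clean up DSM after. */
static void
remove_first(StartupState *s, bool killed_between)
{
    s->sysv_segment_exists = false;
    if (killed_between)
        return;                 /* the DSM control segment id is now lost */
    s->dsm_cleaned_up = true;
}

/* The alternative: clean up DSM first, remove the SysV segment after. */
static void
cleanup_first(StartupState *s, bool killed_between)
{
    s->dsm_cleaned_up = true;
    if (killed_between)
        return;                 /* segment survives; the next start retries */
    s->sysv_segment_exists = false;
}

/* A restart can finish DSM cleanup only if it already happened or the
 * SysV segment (and thus the control segment id) is still there. */
static bool
recoverable(const StartupState *s)
{
    return s->dsm_cleaned_up || s->sysv_segment_exists;
}
```

The model only captures the leak side of the trade-off; the stomping risk of cleanup-first (cleaning DSM for a segment that turns out not to be ours) is the part it cannot show.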

The other thing I'm a bit squidgy about is injecting more code that
runs before we get the new shared memory segment up and 

Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Tue, Dec 17, 2013 at 7:05 PM, Cédric Villemain ced...@2ndquadrant.fr wrote:
 On Tuesday, December 17, 2013 at 21:14:44, Josh Berkus wrote:
 On 12/17/2013 06:34 AM, Robert Haas wrote:
  On Tue, Dec 17, 2013 at 12:09 AM, Amit Kapila amit.kapil...@gmail.com
 wrote:
  I have used pg_prewarm during some of work related to Buffer Management
  and other performance related work. It is quite useful utility.
  +1 for reviving this patch for 9.4
 
  Any other votes?

 I still support this patch (as I did originally), and don't think that
 the overlap with pgFincore is of any consequence.  pgFincore does more
 than pg_prewarm ever will, but it's also platform-specific, so it still
 makes sense for both to exist.

 Just for information, pgFincore is NOT limited to linux (the most interesting
 part, the memory snapshot, works also on BSD based kernels with mincore()
 syscall).
 Like for the PostgreSQL effective_io_concurrency (and pg_prewarm) it just doesn't
 work when posix_fadvise is not available.

This is a fair point, and I should not have implied that it was
Linux-only.  What I really meant was "does not support Windows",
because that's a really big part of our user base.  However, I
shouldn't have phrased it in a way that slights BSD and other UNIX
variants.

Now that we have dynamic background workers, I've been thinking that
it might be possible to write a background worker to do asynchronous
prefetch on systems where we don't have OS support.  We could store a
small ring buffer in shared memory, say big enough for 1k entries.
Each entry would consist of a relfilenode, a starting block number,
and a block count.  We'd also store a flag indicating whether the
prefetch worker has been registered with the postmaster, and a pointer
to the PGPROC of any running worker. When a process wants to do a
prefetch, it locks the buffer, adds its prefetch request to the queue
(overwriting the oldest existing request if the queue is already
full), and checks the flag.  If the flag is not set, it also registers
the background worker.  Then, it releases the lock and sets the latch
of any running worker (whose PGPROC it remembered before releasing the
lock).

When the prefetch process starts up, it services requests from the
queue by reading the requested blocks (or block ranges).  When the
queue is empty, it sleeps.  If it receives no requests for some period
of time, it unregisters itself and exits.  This is sort of a souped-up
version of the hibernation facility we already have for some auxiliary
processes, in that we don't just make the process sleep for a longer
period of time but actually get rid of it altogether.
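The queue described above can be sketched as a fixed-size, overwrite-oldest ring (a standalone toy in plain C, with hypothetical names; the real structure would live in shared memory, with the lock, registered flag, and PGPROC pointer alongside it):

```c
#include <stdint.h>

#define PREFETCH_QUEUE_SIZE 1024    /* "big enough for 1k entries" */

/* One prefetch request: relation file, starting block, block count. */
typedef struct PrefetchRequest
{
    uint32_t    relfilenode;
    uint32_t    start_block;
    uint32_t    block_count;
} PrefetchRequest;

/* The shared ring; when full, the oldest request is overwritten. */
typedef struct PrefetchQueue
{
    PrefetchRequest entries[PREFETCH_QUEUE_SIZE];
    uint32_t    head;               /* next slot to write */
    uint32_t    count;              /* valid entries, capped at queue size */
    int         worker_registered;  /* has the bgworker been registered? */
} PrefetchQueue;

/* Backend side; the caller is assumed to hold the queue's lock. */
static void
prefetch_enqueue(PrefetchQueue *q, PrefetchRequest req)
{
    q->entries[q->head] = req;
    q->head = (q->head + 1) % PREFETCH_QUEUE_SIZE;
    if (q->count < PREFETCH_QUEUE_SIZE)
        q->count++;             /* otherwise the oldest entry was overwritten */
}

/* Worker side: pop the oldest request; returns 0 when the queue is empty. */
static int
prefetch_dequeue(PrefetchQueue *q, PrefetchRequest *out)
{
    uint32_t    tail;

    if (q->count == 0)
        return 0;
    tail = (q->head + PREFETCH_QUEUE_SIZE - q->count) % PREFETCH_QUEUE_SIZE;
    *out = q->entries[tail];
    q->count--;
    return 1;
}
```

In this sketch, backends would call prefetch_enqueue() under the lock and then set the worker's latch; the worker drains with prefetch_dequeue() until it returns 0, then sleeps on its own latch.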

All of this might be overkill; we could also do it with a permanent
auxiliary process.  But it's sort of a shame to run an extra process
for a facility that might never get used, or might be used only
rarely.  And I'm wary of cluttering things up with a thicket of
auxiliary processes each of which caters only to a very specific, very
narrow situation.  Anyway, just thinking out loud here.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Josh Berkus
On 12/18/2013 08:44 AM, Alvaro Herrera wrote:
 Another thought: at the initial run of the assertion, note which tables
 it locked, and record this as an OID array in the catalog row for the
 assertion; consider running the assertion only when those tables are
 touched.  This doesn't work if the assertion code locks some tables when
 run under certain conditions and other tables under different
 conditions.  But then this can be checked too: if an assertion lists in
 its catalog row that it involves tables A, B, C and then, under
 different conditions, it tries to acquire lock on table D, have the
 whole thing fail indicating that the assertion is misdeclared.

This sounds like you're re-inventing SSI.

SERIALIZABLE mode *exists* in order to be able to enforce constraints
which potentially involve more than one transaction.  "Balance can never
go below 0", for example. The whole reason we have this really cool and
unique SSI mode is so that we can do such things without killing
performance.  These sorts of requirements are ideally suited to
Assertions, so it's logically consistent to require Serializable mode in
order to use Assertions.

I'm leaning towards the alternative that Assertions require SERIALIZABLE
mode, and throw a WARNING at the user and the log every time we create,
modify, or trigger an assertion while not in SERIALIZABLE mode.   And
beyond, that, we don't guarantee the integrity of Assertions if people
choose to run in READ COMMITTED anyway.

This is consistent with how we treat the interaction of constraints and
triggers; under some circumstances, we allow triggers to violate CHECK
and FK constraints.

Alternately, we add a GUC, assertion_serializable_mode, which can be
"off", "warn" or "error".  If it's set to "error" and the user triggers
an assertion while in READ COMMITTED mode, an exception occurs.  If it's
set to "off", then assertions are disabled, in order to deal with buggy
assertions.

Now, it would be even better if we could prevent users from switching
transaction mode, but that's a MUCH bigger and more complicated patch.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




[HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

 

We recently observed a ~15% performance regression with dbt2 from PG 9.3.
We narrowed it down by testing master between the 9.2 cut and the 9.3 cut.
It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit
commit.
We did several runs and perf profiling comparing it against its parent
(f925c79b9f36c54b67053ade5ad225a75b8dc803).
We also tested a 12/16 commit on master
(3b97e6823b949624afdc3ce4c92b29a80429715f) once; it performed similarly to
0ac.

Regards,

Dong

Results:
f92: 53k-56k'ish notpm
0ac: 47k-48k'ish notpm

Server SUT:
HP ML350 G6
Two Xeon E5520 (4c/p, 8 cores total, hyper-threading disabled)
12GB DRAM
HP P410i RAID controller (256MB battery-backed cache)
- three 10k-rpm SAS: /
- three 10k-rpm SAS: /pgdata
- one 15k-rpm SAS: /pgxlog
- ext4 (rw,relatime,data=ordered) on all mounts.
Fedora 19 (3.11.10-200.fc19.x86_64)

max_connections=100
shared_buffers=8192MB
effective_cache_size=10GB
temp_buffers=8186kB
work_mem=4093kB
maintenance_work_mem=399MB
wal_buffers=-1
checkpoint_segments=300
checkpoint_completion_target=0.9
logging_collector=on
log_timezone=UTC
datestyle='iso, mdy'
lc_messages=C
lc_monetary=C
lc_numeric=C
lc_time=C
default_text_search_config='pg_catalog.english'
listen_addresses='*'
log_destination=csvlog
log_directory=pg_log
log_filename='pg-%a'
log_rotation_age=1440
log_truncate_on_rotation=on

Client and workload:
Dell 390. Two core. Direct connect with the Server SUT.
dbt2 (ToT)
40 warehouse
8 terminals, 8 connections
zero think/key time
12-min run

 

Flat perf profiles of two such runs look like:

f92:

Samples: 608K of event 'cycles', Event count (approx.): 6679607097416


+   4.04%   postgres  postgres          [.] heap_hot_search_buffer
+   3.63%   postgres  postgres          [.] AllocSetAlloc
+   3.37%   postgres  postgres          [.] hash_search_with_hash_value
+   2.85%   postgres  postgres          [.] _bt_compare
+   2.67%   postgres  postgres          [.] SearchCatCache
+   2.46%   postgres  postgres          [.] LWLockAcquire
+   2.16%   postgres  postgres          [.] XLogInsert
+   2.08%   postgres  postgres          [.] PinBuffer
+   1.32%   postgres  postgres          [.] ExecInitExpr
+   1.31%   postgres  libc-2.17.so      [.] _int_malloc
+   1.29%    swapper  [kernel.kallsyms] [k] intel_idle
+   1.23%   postgres  postgres          [.] MemoryContextAllocZeroAligned
+   1.13%   postgres  postgres          [.] heap_page_prune_opt
+   1.06%   postgres  libc-2.17.so      [.] __memcpy_ssse3_back
+   1.02%   postgres  postgres          [.] LWLockRelease
+   0.94%   postgres  postgres          [.] copyObject
+   0.89%   postgres  postgres          [.] fmgr_info_cxt_security
+   0.82%   postgres  postgres          [.] _bt_checkkeys
+   0.81%   postgres  postgres          [.] hash_any
+   0.73%   postgres  postgres          [.] FunctionCall2Coll
+   0.69%   postgres  libc-2.17.so      [.] __strncpy_sse2_unaligned
+   0.67%   postgres  postgres          [.] HeapTupleSatisfiesMVCC
+   0.66%   postgres  postgres          [.] MemoryContextAlloc
+   0.65%   postgres  postgres          [.] expression_tree_walker
+   0.59%   postgres  postgres          [.] check_stack_depth
+   0.57%   postgres  libc-2.17.so      [.] __printf_fp
+   0.56%   postgres  libc-2.17.so      [.] _int_free
+   0.52%   postgres  postgres          [.] base_yyparse

 

0ac:

Samples: 706K of event 'cycles', Event count (approx.): 6690377376522


+   3.82% postgres  postgres          [.] GetMultiXactIdMembers
+   3.43% postgres  postgres          [.] LWLockAcquire
+   3.31% postgres  postgres          [.] hash_search_with_hash_value
+   3.09% postgres  postgres          [.] heap_hot_search_buffer
+   3.00% postgres  postgres          [.] AllocSetAlloc
+   2.56% postgres  postgres          [.] _bt_compare
+   2.19% postgres  postgres          [.] PinBuffer
+   2.13% postgres  postgres          [.] SearchCatCache
+   1.99% postgres  postgres          [.] 

Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Andres Freund
Hello,

On 2013-12-18 10:24:56 -0800, Dong Ye wrote:
 It seems that 0ac5ad5134f2769ccbaefec73844f8504c4d6182 is the culprit
 commit.

How long does a run take to verify the problem? Could you retry with the
patch attached to
http://www.postgresql.org/message-id/20131201114514.gg18...@alap2.anarazel.de
? Based on the theory that it creates many superfluous multixacts.

 Flat perf profiles of two such runs look like:

Those aren't really flat profiles tho ;)

 0ac:
 
 Samples: 706K of event 'cycles', Event count (approx.): 6690377376522
 
 
 +   3.82% postgres  postgres  [.]
 GetMultiXactIdMembers   

Could you expland that one some levels, so we see the callers?

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
 Heikki Linnakangas wrote:
 
  Ah, I see. You don't need to block anyone else from modifying the
  table, you just need to block anyone else from committing a
  transaction that had modified the table. So the locks shouldn't
  interfere with regular table locks. A ShareUpdateExclusiveLock on
  the assertion should do it.
 
 Causing serialization of transaction commit just because a single
 assertion exists in the database seems too much of a hit, so looking for
 optimization opportunities seems appropriate.

It would only force serialization for transactions that modify tables
covered by the assert, which doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently; otherwise you'll run into major
performance problems.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 11:57 AM, Josh Berkus wrote:
 On 12/18/2013 08:44 AM, Alvaro Herrera wrote:
  Another thought: at the initial run of the assertion, note which tables
  it locked, and record this as an OID array in the catalog row for the
  assertion; consider running the assertion only when those tables are
  touched.  This doesn't work if the assertion code locks some tables when
  run under certain conditions and other tables under different
  conditions.  But then this can be checked too: if an assertion lists in
  its catalog row that it involves tables A, B, C and then, under
  different conditions, it tries to acquire lock on table D, have the
  whole thing fail indicating that the assertion is misdeclared.

 This sounds like you're re-inventing SSI.

 SERIALIZABLE mode *exists* in order to be able to enforce constraints
 which potentially involve more than one transaction.  "Balance can never
 go below 0", for example. The whole reason we have this really cool and
 unique SSI mode is so that we can do such things without killing
 performance.  These sorts of requirements are ideally suited to
 Assertions, so it's logically consistent to require Serializable mode in
 order to use Assertions.


The flip-side is that now you can get serialization failures, and I think 
there's a ton of software that has no clue how to deal with that. So now you 
don't get to use assertions at all unless you re-engineer your application (but 
see below).

Now, if Postgres could re-attempt serialization failures, maybe that would 
become a non-issue... (though, there's probably a lot of dangers in doing 
that...)
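Until something like that exists, the usual answer is a client-side retry loop; a minimal sketch (hypothetical names, no libpq; a real client would test the error's SQLSTATE for 40001 and re-issue the whole transaction from the start):

```c
#define TXN_OK           0
#define TXN_SERIAL_FAIL  1      /* stands in for SQLSTATE 40001 */

/* A transaction body: returns TXN_OK or TXN_SERIAL_FAIL. */
typedef int (*txn_fn) (void *arg);

/* Run txn, retrying from scratch on serialization failure. */
static int
run_with_retry(txn_fn txn, void *arg, int max_attempts)
{
    int     attempt;

    for (attempt = 1; attempt <= max_attempts; attempt++)
    {
        int     rc = txn(arg);

        if (rc != TXN_SERIAL_FAIL)
            return rc;          /* success, or a non-retryable error */
        /* else: the transaction was rolled back; retry the whole thing */
    }
    return TXN_SERIAL_FAIL;     /* give up after max_attempts */
}

/* Example body: fails with a serialization error until *countdown is 0. */
static int
flaky_txn(void *arg)
{
    int    *countdown = arg;

    if (*countdown > 0)
    {
        (*countdown)--;
        return TXN_SERIAL_FAIL;
    }
    return TXN_OK;
}
```

The key constraint the loop encodes is that a serialization failure aborts the whole transaction, so the retry must re-run every statement, not just the failing one.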


 This is consistent with how we treat the interaction of constraints and
 triggers; under some circumstances, we allow triggers to violate CHECK
 and FK constraints.


We do? Under what circumstances?


 Alternately, we add a GUC, assertion_serializable_mode, which can be
 "off", "warn" or "error".  If it's set to "error" and the user triggers
 an assertion while in READ COMMITTED mode, an exception occurs.  If it's
 set to "off", then assertions are disabled, in order to deal with buggy
 assertions.

 Now, it would be even better if we could prevent users from switching
 transaction mode, but that's a MUCH bigger and more complicated patch.


Another possibility is to allow for two different types of assertions, one 
based on SSI and one based on locking.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 10:44 AM, Alvaro Herrera wrote:

 It might prove useful to note that any given assertion involves tables
 A, B, C.  If a transaction doesn't modify any of those tables then
 there's no need to run the assertion when the transaction commits,
 skipping acquisition of the assertion lock.

 Probably this can only be implemented for SQL-language assertions, but
 even so it might be worth considering.  (Assertions in any other
 language would be checked by every transaction.)


This is another case where it would be very useful to restrict what relations a
transaction (or in this case, a subtransaction) can access. If we had the
ability to make that restriction then we could force assertions that aren't
plain SQL to explicitly specify what tables the assert is going to hit, and if
the assert tries to do something different then we throw an error.

The ability to restrict object access within a transaction would also benefit 
VACUUM and possibly the Changeset stuff. From 
http://www.postgresql.org/message-id/52acaafd.6060...@nasby.net:

This would be useful when you have some tables that see very frequent 
updates/deletes in a database that also has to support long-running transactions that 
don't hit those tables. You'd explicitly limit the tables your long-running transaction 
will touch and that way vacuum can ignore the long-running XID when calculating minimum 
tuple age for the heavy-hit tables.
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Alvaro Herrera
Andres Freund wrote:
 On 2013-12-18 13:44:15 -0300, Alvaro Herrera wrote:
  Heikki Linnakangas wrote:
  
   Ah, I see. You don't need to block anyone else from modifying the
   table, you just need to block anyone else from committing a
   transaction that had modified the table. So the locks shouldn't
   interfere with regular table locks. A ShareUpdateExclusiveLock on
   the assertion should do it.
  
  Causing serialization of transaction commit just because a single
  assertion exists in the database seems too much of a hit, so looking for
  optimization opportunities seems appropriate.
 
  It would only force serialization for transactions that modify tables
  covered by the assert, which doesn't seem too bad. Anything covered by an
  assert shouldn't be modified frequently; otherwise you'll run into major
  performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

-- 
Álvaro Herrera   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Jim Nasby j...@nasby.net wrote:

 This is another case where it would be very useful to restrict
 what relations a transaction (or in this case, a substransaction)
 can access. If we had the ability to make that restriction then
 we could force assertions that aren't plain SQL to explicitly
 specify what tables the assert is going to hit, and if the assert
 tries to do something different then we throw an error.

 The ability to restrict object access within a transaction would
 also benefit VACUUM and possibly the Changeset stuff.

I'm pretty sure that SSI could also optimize based on that,
although there are probably about 10 other optimizations that would
be bigger gains before getting to that.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andres Freund
On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:
 Andres Freund wrote:
  It would only force serialization for transactions that modify tables
  covered by the assert, which doesn't seem too bad. Anything covered by an
  assert shouldn't be modified frequently; otherwise you'll run into major
  performance problems.
 
 Well, as presented there is no way (for the system) to tell which tables
 are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow directly specifying a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Jim Nasby

On 12/18/13, 1:42 PM, Kevin Grittner wrote:

 Jim Nasby j...@nasby.net wrote:

  This is another case where it would be very useful to restrict
  what relations a transaction (or in this case, a subtransaction)
  can access. If we had the ability to make that restriction then
  we could force assertions that aren't plain SQL to explicitly
  specify what tables the assert is going to hit, and if the assert
  tries to do something different then we throw an error.

  The ability to restrict object access within a transaction would
  also benefit VACUUM and possibly the Changeset stuff.

 I'm pretty sure that SSI could also optimize based on that,
 although there are probably about 10 other optimizations that would
 be bigger gains before getting to that.


Any ideas how hard this would be? My thought is that we might be able to 
perform this check in the functions that do catalog lookups, but I'm worried 
that that wouldn't allow us to support subtransaction checks (which we'd need 
for assertions), and it runs the risk of long-lasting object references 
spanning the transaction (or subtransaction) and thereby thwarting the check.

Another option would be in heap accessor functions, but that means we could 
only restrict access to tables. For assertions, it would be nice to also 
disallow access to functions that could have unintended consequences that could 
violate the assertion (like dblink).
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
~20 minutes for each run with a binary.
Trying your patch now..
You are right; I used -g in perf record, but what I reported was the flat
profile (meant as a start).

Expand GetMultiXactIdMembers:

 3.82% postgres  postgres  [.]
GetMultiXactIdMembers
   |
   |--9.09%-- GetMultiXactIdMembers
   |
   |--0.84%-- 0x48fb894853f58948
   |  |
   |  |--0.74%-- 0x4296e0004296c
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.64%-- 0x52f8d00052f8d
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.64%-- 0xf6ce8000f6ce8
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.62%-- 0x41de300041de1
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.61%-- 0xf2c77000f2c71
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.60%-- 0x3127700031275
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0x10c98b0010c987
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0x31df31df0
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.59%-- 0xbefbd000befbd
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0xfe97c000fe976
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x82501000824f9
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x3a4410003a43c
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x3b0cf0003b0c3
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x5325f0005325b
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.58%-- 0x7b6b80007b6b8
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x52e9b00052e9b
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0xf3d45000f3d40
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x27afd00027afa
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.57%-- 0x3244d0003244d
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x53e0d00053e06
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0xb64c6000b64bc
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x423f1000423ef
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0xc18f2000c18ed
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.56%-- 0x6bdcf0006bdcd
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xc6d25000c6d25
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xf6534000f6534
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0x10bba80010bba0
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xb5a76000b5a6e
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0x2d3c10002d3b5
   |  |  GetMultiXactIdMembers
   |  |
   |  |--0.55%-- 0xcc095000cc095
   |  |  GetMultiXactIdMembers
   |  |
 

Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Andres Freund
Hi,

On 2013-12-18 14:59:45 -0500, Dong Ye wrote:
 ~20 minutes each run with binary.
 Try your patch now..
 You are right I used -g in perf record. But what I reported was flat (meant
 as a start).
 
 Expand GetMultiXactIdMembers:
 
  3.82% postgres  postgres  [.]
 GetMultiXactIdMembers

That looks like a postgres compiled without
-fno-omit-frame-pointer. Without that, hierarchical profiles are
meaningless. Very new perf versions can also do it using DWARF, but it's
unusably slow...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

On Wed, Dec 18, 2013 at 3:03 PM, Andres Freund and...@2ndquadrant.com wrote:

 That looks like a postgres compiled without
 -fno-omit-frame-pointer. Without that hierarchical profiles are
 meaningless. Very new perfs can also do it using dwarf, but it's unusabl
 slow...

Let me complete the current test without the flag (to serve as a checkpoint as
well). I will do a run with the flag after.

Thanks,
Dong


Re: [HACKERS] SQL objects UNITs

2013-12-18 Thread Jim Nasby

On 12/18/13, 4:22 AM, Dimitri Fontaine wrote:

 ALTER UNIT <name> SET SCHEMA <new schema>;


FWIW, with the units that we've developed we use schemas to differentiate between 
public objects and internal (private or protected) objects. So single-schema stuff 
becomes a PITA. Of course, since extensions already work that way I suppose that ship has sailed, 
but I thought I'd mention it.

For those who are curious... we make the distinction of public/protected/private via schemas 
because we don't want general users to need to wade through that stuff when looking at objects. So 
the convention we settled on is that public objects go in one schema, protected objects go in a 
schema of the same name prepended with "_", and private objects are in the 
protected schema but also prepend "_" to their names. IE:

CREATE SCHEMA awesome_feature;
CREATE VIEW awesome_feature.have_some_data

CREATE SCHEMA _awesome_feature; -- Protected / private stuff
CREATE VIEW _awesome_feature.stuff_for_database_code_to_see_but_not_users
CREATE FUNCTION 
_awesome_feature._do_not_run_this_function_anywhere_outside_of_awesome_feature()
--
Jim C. Nasby, Data Architect   j...@nasby.net
512.569.9461 (cell) http://jim.nasby.net




Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Hi,

Applied your patch (but not using -fno-omit-frame-pointer). It seems to
recover the perf loss: 55659.72 notpm.
FWIW, the profile is below. I will do a run with the flag.
Samples: 598K of event 'cycles', Event count (approx.): 6568957160059


+   4.03%   postgres  postgres          [.] hash_search_with_hash_value
+   3.74%   postgres  postgres          [.] heap_hot_search_buffer
+   3.17%   postgres  postgres          [.] LWLockAcquire
+   2.92%   postgres  postgres          [.] _bt_compare
+   2.64%   postgres  postgres          [.] PinBuffer
+   2.62%   postgres  postgres          [.] AllocSetAlloc
+   2.34%   postgres  postgres          [.] XLogInsert
+   2.27%   postgres  postgres          [.] SearchCatCache
+   1.81%   postgres  postgres          [.] HeapTupleSatisfiesMVCC
+   1.52%   postgres  postgres          [.] ExecInitExpr
+   1.45%   postgres  postgres          [.] heap_page_prune_opt
+   1.41%    swapper  [kernel.kallsyms] [k] intel_idle
+   1.30%   postgres  postgres          [.] LWLockRelease
+   1.12%   postgres  postgres          [.] heapgetpage
+   0.96%   postgres  libc-2.17.so      [.] _int_malloc
+   0.86%   postgres  libc-2.17.so      [.] __memcpy_ssse3_back
+   0.84%   postgres  postgres          [.] hash_any
+   0.81%   postgres  postgres          [.] MemoryContextAllocZeroAligned
+   0.76%   postgres  postgres          [.] XidInMVCCSnapshot
+   0.74%   postgres  postgres          [.] _bt_checkkeys
+   0.70%   postgres  postgres          [.] fmgr_info_cxt_security
+   0.70%   postgres  postgres          [.] FunctionCall2Coll
+   0.66%   postgres  libc-2.17.so      [.] __strncpy_sse2_unaligned
+   0.63%   postgres  postgres          [.] tbm_iterate
+   0.58%   postgres  postgres          [.] base_yyparse
+   0.58%   postgres  libc-2.17.so      [.] __printf_fp
+   0.55%   postgres  postgres          [.] palloc
+   0.51%   postgres  postgres          [.] TransactionIdPrecedes
+   0.51%   postgres  postgres          [.] slot_deform_tuple
+   0.50%   postgres  postgres          [.] ReadBuffer_common


[HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

Hi,

Attached is a patch to add support for array_length(anyarray), which 
only works for one-dimensional arrays, returns 0 for empty arrays, and 
complains if the array's lower bound isn't 1.  In other words, it does the 
right thing when used with the arrays people use 99% of the time.
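To illustrate the intended behaviour, a hypothetical session against a
patched server (the error wording below is taken from the patch):

postgres=# SELECT array_length(array[1,2,3]);
 array_length
--------------
            3
(1 row)

postgres=# SELECT array_length('{}'::int[]);
 array_length
--------------
            0
(1 row)

postgres=# SELECT array_length(array[[1,2],[3,4]]);
ERROR:  input array is not single-dimensional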


I'll add this to the next commit fest, but feel free to discuss it 
before that.




Regards,
Marko Tiikkaja
*** a/doc/src/sgml/array.sgml
--- b/doc/src/sgml/array.sgml
***
*** 338,343  SELECT array_length(schedule, 1) FROM sal_emp WHERE name = 'Carol';
--- 338,354 ----
  2
  (1 row)
  </programlisting>
+ 
+  The single-argument overload of <function>array_length</function> can be used
+  to get the length of a one-dimensional array:
+ <programlisting>
+ SELECT array_length(schedule) FROM sal_emp WHERE name = 'Carol';
+ 
+  array_length
+ --------------
+             2
+ (1 row)
+ </programlisting>
   </para>
   </sect2>
  
*** a/doc/src/sgml/func.sgml
--- b/doc/src/sgml/func.sgml
***
*** 11096,11101  SELECT NULLIF(value, '(none)') ...
--- 11096,11112 ----
   <row>
    <entry>
     <literal>
+     <function>array_length</function>(<type>anyarray</type>)
+    </literal>
+   </entry>
+   <entry><type>int</type></entry>
+   <entry>returns the length of the array (array must be one-dimensional)</entry>
+   <entry><literal>array_length(array[1,2,3])</literal></entry>
+   <entry><literal>3</literal></entry>
+  </row>
+  <row>
+   <entry>
+    <literal>
      <function>array_lower</function>(<type>anyarray</type>, <type>int</type>)
    </literal>
   </entry>
*** a/src/backend/utils/adt/arrayfuncs.c
--- b/src/backend/utils/adt/arrayfuncs.c
***
*** 1740,1745  array_length(PG_FUNCTION_ARGS)
--- 1740,1779 ----
  }
  
  /*
+  * array_length_single:
+  *	Returns the length of a single-dimensional array.  The array must be
+  *	single-dimensional or empty and its lower bound must be 1.
+  */
+ Datum
+ array_length_single(PG_FUNCTION_ARGS)
+ {
+ 	ArrayType  *v = PG_GETARG_ARRAYTYPE_P(0);
+ 	int			result;
+ 	int		   *lb;
+ 	int		   *dimv;
+ 
+ 	/* empty array */
+ 	if (ARR_NDIM(v) == 0)
+ 		PG_RETURN_INT32(0);
+ 
+ 	if (ARR_NDIM(v) != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("input array is not single-dimensional")));
+ 
+ 	lb = ARR_LBOUND(v);
+ 	if (lb[0] != 1)
+ 		ereport(ERROR,
+ 				(errcode(ERRCODE_ARRAY_SUBSCRIPT_ERROR),
+ 				 errmsg("lower bound of array is not 1")));
+ 
+ 	dimv = ARR_DIMS(v);
+ 	result = dimv[0];
+ 	PG_RETURN_INT32(result);
+ }
+ 
+ 
+ /*
  * array_ref :
  *	  This routine takes an array pointer and a subscript array and returns
  *	  the referenced item as a Datum.  Note that for a pass-by-reference
*** a/src/include/catalog/pg_proc.h
--- b/src/include/catalog/pg_proc.h
***
*** 840,845  DATA(insert OID = 2092 (  array_upper	   PGNSP PGUID 12 1 0 0 0 f f f f t f i 2
--- 840,847 ----
  DESCR("array upper dimension");
  DATA(insert OID = 2176 (  array_length	   PGNSP PGUID 12 1 0 0 0 f f f f t f i 2 0 23 "2277 23" _null_ _null_ _null_ _null_ array_length _null_ _null_ _null_ ));
  DESCR("array length");
+ DATA(insert OID = 3179 (  array_length	   PGNSP PGUID 12 1 0 0 0 f f f f t f i 1 0 23 "2277" _null_ _null_ _null_ _null_ array_length_single _null_ _null_ _null_ ));
+ DESCR("array length");
  DATA(insert OID = 378 (  array_append	   PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2277 2283" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));
  DESCR("append element onto end of array");
  DATA(insert OID = 379 (  array_prepend	   PGNSP PGUID 12 1 0 0 0 f f f f f f i 2 0 2277 "2283 2277" _null_ _null_ _null_ _null_ array_push _null_ _null_ _null_ ));
*** a/src/include/utils/array.h
--- b/src/include/utils/array.h
***
*** 204,209  extern Datum array_dims(PG_FUNCTION_ARGS);
--- 204,210 
  extern Datum array_lower(PG_FUNCTION_ARGS);
  extern Datum array_upper(PG_FUNCTION_ARGS);
  extern Datum array_length(PG_FUNCTION_ARGS);
+ extern Datum array_length_single(PG_FUNCTION_ARGS);
  extern Datum array_larger(PG_FUNCTION_ARGS);
  extern Datum array_smaller(PG_FUNCTION_ARGS);
  extern Datum generate_subscripts(PG_FUNCTION_ARGS);
*** a/src/test/regress/expected/arrays.out
--- b/src/test/regress/expected/arrays.out
***
*** 1455,1460  select array_length(array[[1,2,3], [4,5,6]], 3);
--- 1455,1482 
   
  (1 row)
  
+ select array_length(NULL::int[]);
+  array_length 
+ --------------
+              
+ (1 row)
+ 
+ select array_length(array[1,2,3]);
+  array_length 
+ --------------
+             3
+ (1 row)
+ 
+ select array_length('{}'::int[]);
+  

Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Jim Nasby j...@nasby.net wrote:
 On 12/18/13, 1:42 PM, Kevin Grittner wrote:
 Jim Nasby j...@nasby.net wrote:

 This is another case where it would be very useful to restrict
 what relations a transaction (or in this case, a substransaction)
 can access. If we had the ability to make that restriction then
 we could force assertions that aren't plain SQL to explicitly
 specify what tables the assert is going to hit, and if the assert
 tries to do something different then we throw an error.

 The ability to restrict object access within a transaction would
 also benefit VACUUM and possibly the Changeset stuff.

 I'm pretty sure that SSI could also optimize based on that,
 although there are probably about 10 other optimizations that would
 be bigger gains before getting to that.

 Any ideas how hard this would be?

If we had a list to check against, I think it would be possible to
do this during parse analysis and AcquireRewriteLocks().  (One or
the other happens before query rewrite.)  The hard part seems to me
to be defining a sane way to get the list.

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost sfr...@snowman.net wrote:
 Agreed.  My other thought on this is that there's a lot to be said for
 having everything you need available through one tool- kinda like how
 Emacs users rarely go outside of it.. :)  And then there's also the
 consideration that DBAs may not have access to the host system at all,
 or not to the level needed to do similar analysis there.

I completely agree with this, and yet I still think we should reject
the patch, because I think the overhead is going to be intolerable.

Now, the fact is, the monitoring facilities we have in PostgreSQL
today are not nearly good enough.  Other products do better.  I cringe
every time I tell someone to attach strace to a long-running autovac
process to find out what block number it's currently on, so we can
estimate when it will finish; or every time we need data about lwlock
contention and the only way to get it is to use perf, or recompile
with LWLOCK_STATS defined.  These are not fun conversations to have
with customers who are in production.

On the other hand, there's not much value in adding monitoring
features that are going to materially harm performance, and a lot of
the monitoring features that get proposed die on the vine for exactly
that reason.  I think the root of the problem is that our stats
infrastructure is a streaming pile of crap.  A number of people have
worked diligently to improve it and that work has not been fruitless,
but the current situation is still not very good.  In many ways, this
situation reminds me of the situation with EXPLAIN a few years ago.
People kept proposing useful extensions to EXPLAIN which we did not
adopt because they required creating (and perhaps reserving) far too
many keywords.  Now that we have the extensible options syntax,
EXPLAIN has options for COSTS, BUFFERS, TIMING, and FORMAT, all of
which have proven to be worth their weight in code, at least IMHO.

I am really not sure what a better infrastructure for stats collection
should look like, but I know that until we get one, a lot of
monitoring patches that would be really nice to have are going to get
shot down because of concerns about performance, and specifically
stats file bloat.  Fixing that problem figures to be unglamorous, but
I'll buy whoever does it a beer (or another beverage of your choice).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow to directly specify a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.



Umm, that's really a major limitation in utility. We need to come up 
with a better answer than this, which would essentially hobble the facility.


cheers

andrew





Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
 On Wed, Dec 18, 2013 at 8:47 AM, Stephen Frost sfr...@snowman.net wrote:
  Agreed.  My other thought on this is that there's a lot to be said for
  having everything you need available through one tool- kinda like how
  Emacs users rarely go outside of it.. :)  And then there's also the
  consideration that DBAs may not have access to the host system at all,
  or not to the level needed to do similar analysis there.
 
 I completely agree with this, and yet I still think we should reject
 the patch, because I think the overhead is going to be intolerable.

That's a fair point and I'm fine with rejecting it on the grounds that
the overhead is too much.  Hopefully that encourages the author to go
back and review Tom's comments and consider how the overhead could be
reduced or eliminated.  We absolutely need better monitoring and I have
had many of the same strace-involving conversations.  perf is nearly out
of the question as it's often not even installed and can be terribly
risky (I once had to get a prod box hard-reset after running perf on it
for mere moments because it never came back enough to let us do a clean
restart).

 I think the root of the problem is that our stats
 infrastructure is a streaming pile of crap.

+1

Thanks,

Stephen




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Heikki Linnakangas

On 12/18/2013 11:04 PM, Andrew Dunstan wrote:


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow to directly specify a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.


Umm, that's really a major limitation in utility.


The query can be "SELECT is_my_assertion_true()", and the function can 
do anything.


- Heikki




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Hi,

Attached is a patch to add support for array_length(anyarray), which 
only works for one-dimensional arrays, returns 0 for empty arrays and 
complains if the array's lower bound isn't 1.  In other words, does 
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?

cheers

andrew




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 04:09 PM, Heikki Linnakangas wrote:

On 12/18/2013 11:04 PM, Andrew Dunstan wrote:


On 12/18/2013 02:45 PM, Andres Freund wrote:

On 2013-12-18 16:39:58 -0300, Alvaro Herrera wrote:

Andres Freund wrote:

It would only force serialization for transactions that modify tables
covered by the assert, that doesn't seem too bad. Anything covered by an
assert shouldn't be modified frequently, otherwise you'll run into major
performance problems.

Well, as presented there is no way (for the system) to tell which tables
are covered by an assertion, is there?  That's my point.

Well, the patch's syntax seems to only allow to directly specify a SQL
query to check - we could iterate over the querytree to gather all
related tables and reject any function we do not understand.


Umm, that's really a major limitation in utility.


The query can be "SELECT is_my_assertion_true()", and the function can 
do anything.





OK, but isn't that what Andres is suggesting we reject?

cheers

andrew





Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Josh Berkus
On 12/18/2013 11:26 AM, Jim Nasby wrote:
 The flip-side is that now you can get serialization failures, and I
 think there's a ton of software that has no clue how to deal with that.
 So now you don't get to use assertions at all unless you re-engineer
 your application (but see below).

Well, the software will need to deal with an Assertion failure, which I
doubt it's prepared to do right now either.

 This is consistent with how we treat the interaction of constraints and
 triggers; under some circumstances, we allow triggers to violate CHECK
 and FK constraints.
 
 We do? Under what circumstances?

AFTER triggers are allowed to ignore constraints sometimes.  For
example, if you have a tree table with an FK to other rows in the same
table, and you have an AFTER trigger on it, the AFTER trigger is allowed
to violate the self-FK.  That's the one I ran across, but I vaguely
remember other cases, and there's some documentation on this in the
order of application of triggers in the main docs.

 Another possibility is to allow for two different types of assertions,
 one based on SSI and one based on locking.

The locking version would have to pretty much lock on a table basis (or
even a whole-database basis) every time an assertion executed, no?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and the 
last one is [array_length()].  Which is what 99% of the code using 
array_length(anyarray, int) does anyway.
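For example, the iteration idiom this would enable (a sketch; sum_elems is
a hypothetical function, and array_length(anyarray) is the single-argument
form proposed in this thread):

CREATE FUNCTION sum_elems(arr int[]) RETURNS int AS $$
DECLARE
    s int := 0;
BEGIN
    -- safe: the single-argument form guarantees indexes 1..length
    FOR i IN 1 .. array_length(arr) LOOP
        s := s + arr[i];
    END LOOP;
    RETURN s;
END;
$$ LANGUAGE plpgsql;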



Regards,
Marko Tiikkaja




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:19, I wrote:

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and the
last one is [array_length()].  Which is what 99% of the code using
array_length(anyarray, int) does anyway.


Just to clarify, I mean that array_lower(.., 1) must be 1.  Whatever 
that's called.  The lower bound of the only dimension of the array?



Regards,
Marko Tiikkaja




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Andrew Dunstan


On 12/18/2013 04:19 PM, Marko Tiikkaja wrote:

On 2013-12-18 22:13, Andrew Dunstan wrote:

On 12/18/2013 03:27 PM, Marko Tiikkaja wrote:

Attached is a patch to add support for array_length(anyarray), which
only works for one-dimensional arrays, returns 0 for empty arrays and
complains if the array's lower bound isn't 1.  In other words, does
the right thing when used with the arrays people use 99% of the time.


Why the heck would it complain if the lower bound isn't 1?


Because then you're free to assume that the first element is [1] and 
the last one is [array_length()].  Which is what 99% of the code using 
array_length(anyarray, int) does anyway.






You're not really free to assume it - you'll need an exception handler 
for the other-than-1 case, or your code might blow up.


This seems to be codifying a bad pattern, which should be using 
array_lower() and array_upper() instead.
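That pattern handles any bounds, e.g. (existing behaviour, easily verified):

postgres=# SELECT array_lower('[0:2]={10,20,30}'::int[], 1) AS lo,
postgres-#        array_upper('[0:2]={10,20,30}'::int[], 1) AS hi;
 lo | hi
----+----
  0 |  2
(1 row)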


cheers

andrew




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 2013-12-18 22:32, Andrew Dunstan wrote:

You're not really free to assume it - you'll need an exception handler
for the other-than-1 case, or your code might blow up.

This seems to be codifying a bad pattern, which should be using
array_lower() and array_upper() instead.


That's the entire point -- I *want* my code to blow up.  If someone 
passes a multi-dimensional array to a function that assumes its input is 
one-dimensional and its indexes start from 1, I want it to be obvious 
that the caller did something wrong.  Now I either copy-paste lines and 
lines of codes to always test for the weird cases or my code breaks in 
subtle ways.


This is no different from an Assert() somewhere -- if the caller breaks 
the documented interface, it's his problem, not mine.  And I don't want 
to waste my time coding around the fact that this simple thing is so 
hard to do in PG.




Regards,
Marko Tiikkaja




Re: [HACKERS] Extension Templates S03E11

2013-12-18 Thread Greg Stark
Yeah I think this whole discussion is happening at the wrong level. The
problem you're having, despite appearances, is not that people disagree
about the best way to accomplish your goals.

The problem is that not everyone is convinced your goals are a good idea.
Either they just don't understand the goals or they do understand them but
don't agree that they're a good idea.

Personally I'm in the former category and would welcome a detailed
explanation of the goals of the feature and what use cases those goals
enable.

I think Tom is in the latter category and needs a very good argument for why
those goals are important enough to outweigh the downsides.

I don't think loading shared libraries from RAM or a temp download
directory is a *complete* show stopper the way Tom says but it would
require a pretty compelling use case to make it worth the difficulties it
would cause.

-- 
greg


Re: [HACKERS] 9.3 regression with dbt2

2013-12-18 Thread Dong Ye
Applying your patch plus adding -fno-omit-frame-pointer, I got 54526.90
notpm.

The profile (part) below:
# Samples: 610K of event 'cycles'
# Event count (approx.): 6686532056450
#
# Overhead   Command  Shared Object                       Symbol
# ........  ........  .............  ...........................
#
 4.08%    postgres  postgres   [.] hash_search_with_hash_value

  |
  --- hash_search_with_hash_value
 |
  --2.87%-- BufTableLookup
|
 --2.86%-- ReadBuffer_common
   ReadBufferExtended
   |
   |--0.86%-- index_fetch_heap
   |  |
   |   --0.62%-- index_getnext
   | IndexNext
   | ExecScan
   | ExecProcNode
   |
   |--0.71%-- BitmapHeapNext
   |  ExecScan
   |  ExecProcNode
   |
--0.57%-- _bt_relandgetbuf
  |
   --0.57%-- _bt_search

 3.75%    postgres  postgres   [.] heap_hot_search_buffer

  |
  --- heap_hot_search_buffer
 |
 |--1.92%-- BitmapHeapNext
 |  ExecScan
 |  ExecProcNode
 |  |
 |   --1.12%-- standard_ExecutorRun
 | _SPI_execute_plan
 | SPI_execute_plan
 | |
 |  --0.70%-- payment
 |ExecMakeFunctionResult
 |ExecProject
 |ExecResult
 |ExecProcNode
 |standard_ExecutorRun
 |PortalRunSelect
 |PortalRun
 |PostgresMain
 |ServerLoop
 |PostmasterMain
 |main
 |__libc_start_main
 |
  --1.74%-- index_fetch_heap
|
 --1.46%-- index_getnext
   |
--1.45%-- IndexNext
  ExecScan
  ExecProcNode
  |
   --0.96%--
standard_ExecutorRun

 _SPI_execute_plan

 SPI_execute_plan
 |
  --0.50%--
new_order

ExecMakeFunctionResult

ExecProject

ExecResult

ExecProcNode

standard_ExecutorRun

PortalRunSelect

PortalRunFetch

PerformPortalFetch

standard_ProcessUtility

PortalRunUtility

FillPortalStore

PortalRun

PostgresMain

ServerLoop

PostmasterMain
main

__libc_start_main

 3.15%    postgres  postgres   [.] LWLockAcquire

  |
  --- LWLockAcquire
 |
  --1.65%-- ReadBuffer_common
ReadBufferExtended

 2.74%    postgres  postgres   [.] PinBuffer

  |
  --- PinBuffer
 |
  --2.72%-- ReadBuffer_common
ReadBufferExtended
|
|--0.71%-- index_fetch_heap
|  |
|   --0.51%-- index_getnext
| IndexNext
| ExecScan
| ExecProcNode
|
|--0.67%-- BitmapHeapNext
|  ExecScan
|  ExecProcNode
|
 

Re: [HACKERS] shared memory message queues

2013-12-18 Thread Andres Freund
On 2013-12-18 15:23:23 -0500, Robert Haas wrote:
 It sounds like most people who have looked at this stuff are broadly
 happy with it, so I'd like to push on toward commit soon, but it'd be
 helpful, Andres, if you could review the comment additions to
 shm-mq-v2.patch and see whether those address your concerns.

FWIW, I haven't looked carefully enough at the patch to consider myself
having reviewed it. I am not saying that it's not ready for committer,
just that I don't know whether it is.

I will have a look at the comment improvements, and if you don't beat me
to it, give the patch a read-over.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread David Johnston
Marko Tiikkaja-4 wrote
 On 2013-12-18 22:32, Andrew Dunstan wrote:
 You're not really free to assume it - you'll need an exception handler
 for the other-than-1 case, or your code might blow up.

 This seems to be codifying a bad pattern, which should be using
 array_lower() and array_upper() instead.
 
 That's the entire point -- I *want* my code to blow up.  If someone 
 passes a multi-dimensional array to a function that assumes its input is 
 one-dimensional and its indexes start from 1, I want it to be obvious 
 that the caller did something wrong.  Now I either copy-paste lines and 
 lines of codes to always test for the weird cases or my code breaks in 
 subtle ways.
 
 This is no different from an Assert() somewhere -- if the caller breaks 
 the documented interface, it's his problem, not mine.  And I don't want 
 to waste my time coding around the fact that this simple thing is so 
 hard to do in PG.

1) Why cannot we just make the second argument of the current function
optional and default to 1?
2) How about providing a function that returns the 1-dim/lower=1 input
array or raise/exception if the input array does not conform?

not tested / pseudo-code
CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray
$$
begin
if (empty(arr)) return arr;
if (ndim(arr) <> 1) raise exception;
if (array_lower() <> 1) raise exception;
return arr;
end;
$$

I can also see wanting 1-dimensional enforced without having to require the
lower-bound to be 1 so maybe a separate function for that.

Usage:

SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column
constraint definition and also allowing for a textbook case of separation of
concerns.

I am torn, but mostly opposed, to making an array_length(anyarray) function
with these limitations enforced - especially if other similar functions are
not created at the same time.  I fully agree that array_length(anyarray)
should be a valid call without requiring the user to specify ", 1" by rote.

Tangential Question:

Is there any way to define a non-1-based array without using array-literal
syntax but by using ARRAY[1,2,3] syntax?
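(The array-literal syntax alluded to above, for reference:)

postgres=# SELECT ('[0:2]={1,2,3}'::int[])[0] AS first_element;
 first_element
---------------
             1
(1 row)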


David J.









Re: [HACKERS] preserving forensic information when we freeze

2013-12-18 Thread Andres Freund
Hi,

On 2013-12-17 16:00:14 -0500, Robert Haas wrote:
 @@ -5874,19 +5858,27 @@ heap_prepare_freeze_tuple(HeapTupleHeader tuple, TransactionId cutoff_xid,
  void
  heap_execute_freeze_tuple(HeapTupleHeader tuple, xl_heap_freeze_tuple *frz)
  {
 + tuple->t_infomask = frz->t_infomask;
 + tuple->t_infomask2 = frz->t_infomask2;
 +
   if (frz->frzflags & XLH_FREEZE_XMIN)
 - HeapTupleHeaderSetXmin(tuple, FrozenTransactionId);
 + HeapTupleHeaderSetXminFrozen(tuple);
 
   HeapTupleHeaderSetXmax(tuple, frz->xmax);
 
   if (frz->frzflags & XLH_FREEZE_XVAC)
 + {
   HeapTupleHeaderSetXvac(tuple, FrozenTransactionId);
 + /* If we somehow haven't hinted the tuple previously, do it now. */
 + HeapTupleHeaderSetXminCommitted(tuple);
 + }

What's the reasoning behind adding HeapTupleHeaderSetXminCommitted()
here?

 @@ -823,14 +823,14 @@ lazy_scan_heap(Relation onerel, LVRelStats *vacrelstats,
 			 * NB: Like with per-tuple hint bits, we can't set the
 			 * PD_ALL_VISIBLE flag if the inserter committed
 			 * asynchronously. See SetHintBits for more info. Check
 -			 * that the HEAP_XMIN_COMMITTED hint bit is set because of
 -			 * that.
 +			 * that the tuple is hinted xmin-committed hint bit because
 +			 * of that.

Looks like there's a hint bit too much here.

 @@ -65,6 +65,9 @@ manage to be a conflict it would merely mean that one bit-update would
  be lost and need to be done again later.  These four bits are only hints
  (they cache the results of transaction status lookups in pg_clog), so no
  great harm is done if they get reset to zero by conflicting updates.
 +Note, however, that a tuple is frozen by setting both HEAP_XMIN_INVALID
 +and HEAP_XMIN_COMMITTED; this is a critical update and accordingly requires
 +an exclusive buffer lock.

I think it'd be appropriate to mention that this needs to be preserved
via WAL logging; as written, it sounds like it's sufficient to set a hint
bit without persistence guarantees.

(not sure if I already wrote this, but whatever)

Looking good.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Assertion failure in base backup code path

2013-12-18 Thread Andres Freund
Hi Magnus,

It looks to me like the path to do_pg_start_backup() outside of a
transaction context comes from your initial commit of the base backup
facility.
The problem is that you're not allowed to do anything leading to a
syscache/catcache lookup in those contexts.

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] stats for network traffic WIP

2013-12-18 Thread Bruce Momjian
On Wed, Dec 18, 2013 at 03:41:24PM -0500, Robert Haas wrote:
 On the other hand, there's not much value in adding monitoring
 features that are going to materially harm performance, and a lot of
 the monitoring features that get proposed die on the vine for exactly
 that reason.  I think the root of the problem is that our stats
 infrastructure is a streaming pile of crap.  A number of people have

streaming?  I can't imagine what that looks like.  ;-)

I think the larger point is that network is only one of many things we
need to address, so this needs a holistic approach that looks at all
needs and creates infrastructure to address it.
 
-- 
  Bruce Momjian  br...@momjian.ushttp://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] array_length(anyarray)

2013-12-18 Thread Marko Tiikkaja

On 12/19/13, 12:01 AM, David Johnston wrote:

Marko Tiikkaja-4 wrote

On 2013-12-18 22:32, Andrew Dunstan wrote:

You're not really free to assume it - you'll need an exception handler
for the other-than-1 case, or your code might blow up.

This seems to be codifying a bad pattern, which should be using
array_lower() and array_upper() instead.


That's the entire point -- I *want* my code to blow up.  If someone
passes a multi-dimensional array to a function that assumes its input is
one-dimensional and its indexes start from 1, I want it to be obvious
that the caller did something wrong.  Now I either copy-paste lines and
lines of codes to always test for the weird cases or my code breaks in
subtle ways.

This is no different from an Assert() somewhere -- if the caller breaks
the documented interface, it's his problem, not mine.  And I don't want
to waste my time coding around the fact that this simple thing is so
hard to do in PG.


1) Why can't we just make the second argument of the current function
optional, defaulting to 1?


That still does the wrong thing for the empty array, multidimensional 
arrays and arrays that don't start from index 1.



2) How about providing a function that returns the 1-dim/lower=1 input
array, or raises an exception if the input array does not conform?

not tested / pseudo-code
CREATE FUNCTION array_normal(arr anyarray) RETURNS anyarray
$$
begin
 if (empty(arr)) return arr;
 if (ndim(arr) <> 1) raise exception;
 if (array_lower() <> 1) raise exception;
 return arr;
end;
$$


With this, I would still have to do 
COALESCE(array_length(array_normal($1), 1), 0).  That's pretty stupid 
for the most common use case of arrays, don't you think?



I can also see wanting 1-dimensional enforced without having to require the
lower bound to be 1, so maybe a separate function for that.


I really don't see the point.  How often have you ever created an
array that doesn't have a lower bound of 1 on purpose?  What good did
it serve you?



Usage:

SELECT array_length(array_normal(input_array))

I could see this being especially useful for a domain and/or column
constraint definition and also allowing for a textbook case of separation of
concerns.


What would array_length() in this case be?  With what you suggested 
above, you would still get NULL for an empty array.



I am torn, but mostly opposed, to making an array_length(anyarray) function
with these limitations enforced - especially if other similar functions are
not created at the same time.  I fully agree that array_length(anyarray)
should be a valid call without requiring the user to specify ", 1" by rote.


I'm specifically asking for something that is different from 
array_length(anyarray, int), because I personally think it's too full of 
caveats.



Regards,
Marko Tiikkaja




Re: [HACKERS] [PATCH] SQL assertions prototype

2013-12-18 Thread Kevin Grittner
Josh Berkus j...@agliodbs.com wrote:
 On 12/18/2013 11:26 AM, Jim Nasby wrote:

 Another possibility is to allow for two different types of
 assertions, one based on SSI and one based on locking.

 The locking version would have to pretty much lock on a table
 basis (or even a whole-database basis) every time an assertion
 executed, no?

As far as I can see, if SSI is *not* used, there needs to be a
mutually exclusive lock taken from somewhere inside the COMMIT code
until the transaction is complete -- effectively serializing
assertion processing for transactions which could affect a given
assertion.  Locking on tables would, as previously suggested, be
very prone to deadlocks on the heavyweight locks.  Locking on the
assertions in a predictable order seems more promising, especially
if there could be some way to only do that if the transaction
really might have done something which could affect the truth of
the assertion.
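The "locking on the assertions in a predictable order" idea can be sketched in miniature. Everything below (names, structs) is hypothetical and not from any patch; it only illustrates why taking locks in a global order rules out deadlock between two committing transactions:

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

/* Toy illustration: each committing transaction locks the assertions it
 * might affect in ascending assert_id order, so two committers can never
 * hold locks in conflicting orders. */
typedef struct AssertionLock
{
    int             assert_id;
    pthread_mutex_t mutex;
} AssertionLock;

static int
cmp_by_id(const void *a, const void *b)
{
    const AssertionLock *const *la = a;
    const AssertionLock *const *lb = b;

    return ((*la)->assert_id > (*lb)->assert_id) -
           ((*la)->assert_id < (*lb)->assert_id);
}

/* Lock every assertion the transaction touched, in id order. */
static void
lock_assertions(AssertionLock **locks, int n)
{
    qsort(locks, n, sizeof(AssertionLock *), cmp_by_id);
    for (int i = 0; i < n; i++)
        pthread_mutex_lock(&locks[i]->mutex);
}

/* Release in reverse order of acquisition. */
static void
unlock_assertions(AssertionLock **locks, int n)
{
    for (int i = n - 1; i >= 0; i--)
        pthread_mutex_unlock(&locks[i]->mutex);
}
```

The remaining (hard) part, as noted above, is deciding cheaply whether the transaction really might have affected a given assertion at all.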

--
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2013-12-18 Thread Bruce Momjian
On Wed, Dec 11, 2013 at 10:22:32AM +, Haribabu kommi wrote:
 The make_absolute_path() function was moved to port in a way similar to
 Bruce Momjian's approach. psprintf is used to store the error string that
 occurs in the function, but psprintf is not used for storing the absolute
 path, because that caused problems freeing the allocated memory in
 SelectConfigFiles: the same memory is allocated from guc_malloc in a
 different code branch.
 
 After adding the make_absolute_path() function with the psprintf stuff to
 path.c, ecpg fails to link and I could not find the cause, so I added
 another file, abspath.c, in port containing these two functions.

What errors are you seeing?

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Cédric Villemain
On Wednesday 18 December 2013 18:40:09, Robert Haas wrote:
 Now that we have dynamic background workers, I've been thinking that
 it might be possible to write a background worker to do asynchronous
 prefetch on systems where we don't have OS support.  We could store a
 small ring buffer in shared memory, say big enough for 1k entries.
 Each entry would consist of a relfilenode, a starting block number,
 and a block count.  We'd also store a flag indicating whether the
 prefetch worker has been registered with the postmaster, and a pointer
 to the PGPROC of any running worker. When a process wants to do a
 prefetch, it locks the buffer, adds its prefetch request to the queue
 (overwriting the oldest existing request if the queue is already
 full), and checks the flag.  If the flag is not set, it also registers
 the background worker.  Then, it releases the lock and sets the latch
 of any running worker (whose PGPROC it remembered before releasing the
 lock).

Good idea.
If the list is full, the system is probably busy; in such a case some
alternative behavior could be interesting, perhaps flushing part of the
ring. The oldest entries are the least interesting; we're talking about
prefetching, after all.
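As a toy illustration (single-process, plain memory; the design quoted above would put this in shared memory behind a lock, with a latch to wake the worker), the overwrite-the-oldest ring could look like:

```c
#include <assert.h>
#include <string.h>

/* Self-contained sketch of the prefetch request ring described above.
 * Oid and BlockNumber are stand-in typedefs; the size is shrunk from the
 * suggested ~1k entries to make the overwrite behavior easy to see. */
typedef unsigned int Oid;
typedef unsigned int BlockNumber;

#define PREFETCH_RING_SIZE 4

typedef struct PrefetchRequest
{
    Oid         relfilenode;
    BlockNumber start_block;
    int         block_count;
} PrefetchRequest;

typedef struct PrefetchRing
{
    PrefetchRequest entries[PREFETCH_RING_SIZE];
    int         head;               /* next slot to write */
    int         count;              /* valid entries, capped at ring size */
    int         worker_registered;  /* flag checked by submitters */
} PrefetchRing;

/* Add a request, overwriting the oldest one if the ring is full. */
static void
prefetch_enqueue(PrefetchRing *ring, Oid rel, BlockNumber start, int nblocks)
{
    PrefetchRequest *slot = &ring->entries[ring->head];

    slot->relfilenode = rel;
    slot->start_block = start;
    slot->block_count = nblocks;
    ring->head = (ring->head + 1) % PREFETCH_RING_SIZE;
    if (ring->count < PREFETCH_RING_SIZE)
        ring->count++;
}
```

A submitter would take the lock, call something like prefetch_enqueue(), check worker_registered, and set the worker's latch after releasing the lock.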

In the case of effective_io_concurrency, however, this may not work as well as
expected. IIRC it is used to prefetch heap blocks; hopefully the requested
blocks are contiguous, but if there are too many holes that is enough to fill
the ring very quickly (with the current max value of effective_io_concurrency).

 When the prefetch process starts up, it services requests from the
 queue by reading the requested blocks (or block ranges).  When the
 queue is empty, it sleeps.  If it receives no requests for some period
 of time, it unregisters itself and exits.  This is sort of a souped-up
 version of the hibernation facility we already have for some auxiliary
 processes, in that we don't just make the process sleep for a longer
 period of time but actually get rid of it altogether.

I'm just a bit skeptical about the starting time: the backend will ReadBuffer
very soon after requesting the prefetch...

 All of this might be overkill; we could also do it with a permanent
 auxiliary process.  But it's sort of a shame to run an extra process
 for a facility that might never get used, or might be used only
 rarely.  And I'm wary of cluttering things up with a thicket of
 auxiliary processes each of which caters only to a very specific, very
 narrow situation.  Anyway, just thinking out loud here.

For Windows, see the PrefetchVirtualMemory() function.

I really wonder if such a bgworker can improve prefetching on !windows too,
if a ring insert is faster than a posix_fadvise call.

If this is true, then effective_io_concurrency can be revisited. Maybe Greg 
Stark already did some benchmark of that...

-- 
Cédric Villemain +33 (0)6 20 30 22 52
http://2ndQuadrant.fr/
PostgreSQL: Support 24x7 - Développement, Expertise et Formation




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Alvaro Herrera
Robert Haas wrote:

 I'm not inclined to wait for the next CommitFest to commit this,
 because it's a very simple patch and has already had a lot more field
 testing than most patches get before they're committed.  And it's just
 a contrib module, so the damage it can do if there is in fact a bug is
 pretty limited.  All that having been said, any review is appreciated.

Looks nice.

Some really minor things I noticed while skimming are that you have some
weird indentation using spaces in some ereport() calls; there's an extra
call to RelationOpenSmgr() in read mode; and the copyright year is 2012.
Please use &mdash; in sgml instead of plain dashes, and please avoid the
!strcmp() idiom.

I didn't actually try it out ...

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] PoC: Partial sort

2013-12-18 Thread Andreas Karlsson

On 12/18/2013 01:02 PM, Alexander Korotkov wrote:

My idea for a solution was to modify tuplesort to allow storing the
already sorted keys in either memtuples or the sort result file, but
setting a field so it does not sort the already sorted tuples
again. This would allow the rescan to work as it used to, but I am
unsure how clean or ugly this code would be. Was this something you
considered?


I'm not sure. I believe the best answer depends on particular
parameters: how much memory we have for the sort, how expensive the
underlying node is and how it performs rescan, and how big the groups
in the partial sort are.


Yes, if one does not need a rescan your solution will use less memory 
and about the same amount of CPU (if the tuplesort does not spill to 
disk). While if we keep all the already sorted tuples in the tuplesort 
rescans will be cheap but more memory will be used with an increased 
chance of spilling to disk.


--
Andreas Karlsson




Re: [HACKERS] pg_rewarm status

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 6:07 PM, Cédric Villemain ced...@2ndquadrant.fr wrote:
 In the case of effective_io_concurrency, however, this may not work as well as
 expected. IIRC it is used to prefetch heap blocks; hopefully the requested
 blocks are contiguous, but if there are too many holes that is enough to fill
 the ring very quickly (with the current max value of effective_io_concurrency).

Yeah, we'd need to figure out how big the ring would need to be for
reasonable values of effective_io_concurrency.

 When the prefetch process starts up, it services requests from the
 queue by reading the requested blocks (or block ranges).  When the
 queue is empty, it sleeps.  If it receives no requests for some period
 of time, it unregisters itself and exits.  This is sort of a souped-up
 version of the hibernation facility we already have for some auxiliary
 processes, in that we don't just make the process sleep for a longer
 period of time but actually get rid of it altogether.

 I'm just a bit skeptical about the starting time: the backend will ReadBuffer
 very soon after requesting the prefetch...

Yeah, absolutely.  The first backend that needs a prefetch probably
isn't going to get it in time.  I think that's OK though.  Once the
background process is started, response times will be quicker...
although possibly still not quick enough.  We'd need to benchmark this
to determine how quickly the background process can actually service
requests.  Does anybody have a good self-contained test case that
showcases the benefits of prefetching?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Autoconf 2.69 update

2013-12-18 Thread Peter Eisentraut
On Thu, 2013-11-14 at 22:00 -0500, Peter Eisentraut wrote:
 I'm proposing that we upgrade our Autoconf to 2.69

This has been done.





Re: [HACKERS] preserving forensic information when we freeze

2013-12-18 Thread Robert Haas
On Wed, Dec 18, 2013 at 5:54 PM, Andres Freund and...@2ndquadrant.com wrote:
   if (frz->frzflags & XLH_FREEZE_XVAC)
 + {
   HeapTupleHeaderSetXvac(tuple, FrozenTransactionId);
 + /* If we somehow haven't hinted the tuple previously, do it now. */
 + HeapTupleHeaderSetXminCommitted(tuple);
 + }

 What's the reasoning behind adding HeapTupleHeaderSetXminCommitted()
 here?

I'm just copying the existing logic.  See the final stanza of
heap_prepare_freeze_tuple.

 [ snip ]

Will fix.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] ALTER SYSTEM SET command to change postgresql.conf parameters

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 8:25 PM, Tatsuo Ishii is...@postgresql.org wrote:
 Is there any reason for the function to return int, as it always returns
 0 or 1? Maybe returning bool would be better?


 I have committed your patches. Thanks.

Thank you very much.


With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] clang's -Wmissing-variable-declarations shows some shoddy programming

2013-12-18 Thread Bruce Momjian
On Sat, Dec 14, 2013 at 04:52:28PM +0100, Andres Freund wrote:
 Hi,
 
 Compiling postgres with said option in CFLAGS really gives an astounding
 number of warnings. Except some bison/flex generated ones, none of them
 looks acceptable to me.
 Most are just file-local variables with a missing static and easy to
 fix. Several others are actually shared variables, where people simply
 haven't bothered to add the variable to a header, some of them with
 comments declaring that fact.
 
 I've attached the output of such a compilation run for those without
 clang.

Now that pg_upgrade has stabilized, I think it is time to centralize all
the pg_upgrade_support control variables in a single C include file that
can be used by the backend and by pg_upgrade_support.  This will
eliminate the compiler warnings too.

The attached patch accomplishes this.
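Judging from the externs the patch removes below, the new catalog/binary_upgrade.h would presumably just collect those declarations in one place. Here is a standalone sketch: the variable names come from the patch, but PGDLLIMPORT is dropped and Oid is stubbed so the fragment compiles outside the tree (in the tree these are extern declarations, with the definitions living in the backend, e.g. pg_type.c and toasting.c):

```c
#include <assert.h>

/* Stand-ins so this fragment is self-contained. */
typedef unsigned int Oid;
#define InvalidOid ((Oid) 0)

/* Potentially set by contrib/pg_upgrade_support functions */
Oid binary_upgrade_next_pg_type_oid = InvalidOid;
Oid binary_upgrade_next_array_pg_type_oid = InvalidOid;
Oid binary_upgrade_next_toast_pg_type_oid = InvalidOid;

Oid binary_upgrade_next_heap_pg_class_oid = InvalidOid;
Oid binary_upgrade_next_index_pg_class_oid = InvalidOid;
Oid binary_upgrade_next_toast_pg_class_oid = InvalidOid;

Oid binary_upgrade_next_pg_enum_oid = InvalidOid;
Oid binary_upgrade_next_pg_authid_oid = InvalidOid;
```

With this, both the backend files and pg_upgrade_support.c include one header instead of re-declaring each variable, which is what silences the -Wmissing-variable-declarations warnings.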

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/contrib/pg_upgrade_support/pg_upgrade_support.c b/contrib/pg_upgrade_support/pg_upgrade_support.c
new file mode 100644
index 99e64c4..6e30deb
*** a/contrib/pg_upgrade_support/pg_upgrade_support.c
--- b/contrib/pg_upgrade_support/pg_upgrade_support.c
***
*** 11,16 
--- 11,17 
  
  #include "postgres.h"
  
+ #include "catalog/binary_upgrade.h"
  #include "catalog/namespace.h"
  #include "catalog/pg_type.h"
  #include "commands/extension.h"
***
*** 24,40 
  PG_MODULE_MAGIC;
  #endif
  
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_array_pg_type_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_type_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_heap_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_index_pg_class_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_toast_pg_class_oid;
- 
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_enum_oid;
- extern PGDLLIMPORT Oid binary_upgrade_next_pg_authid_oid;
- 
  Datum		set_next_pg_type_oid(PG_FUNCTION_ARGS);
  Datum		set_next_array_pg_type_oid(PG_FUNCTION_ARGS);
  Datum		set_next_toast_pg_type_oid(PG_FUNCTION_ARGS);
--- 25,30 
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
new file mode 100644
index 6f2e142..032a20e
*** a/src/backend/catalog/heap.c
--- b/src/backend/catalog/heap.c
***
*** 34,39 
--- 34,40 
  #include "access/sysattr.h"
  #include "access/transam.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c
new file mode 100644
index aa31429..7ad9720
*** a/src/backend/catalog/index.c
--- b/src/backend/catalog/index.c
***
*** 30,35 
--- 30,36 
  #include "access/visibilitymap.h"
  #include "access/xact.h"
  #include "bootstrap/bootstrap.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
diff --git a/src/backend/catalog/pg_enum.c b/src/backend/catalog/pg_enum.c
new file mode 100644
index 35899b4..23d2a41
*** a/src/backend/catalog/pg_enum.c
--- b/src/backend/catalog/pg_enum.c
***
*** 17,22 
--- 17,23 
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/catalog.h"
  #include "catalog/indexing.h"
  #include "catalog/pg_enum.h"
diff --git a/src/backend/catalog/pg_type.c b/src/backend/catalog/pg_type.c
new file mode 100644
index 23ac3dd..634915b
*** a/src/backend/catalog/pg_type.c
--- b/src/backend/catalog/pg_type.c
***
*** 17,22 
--- 17,23 
  #include "access/heapam.h"
  #include "access/htup_details.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/indexing.h"
  #include "catalog/objectaccess.h"
diff --git a/src/backend/catalog/toasting.c b/src/backend/catalog/toasting.c
new file mode 100644
index 385d64d..f58e434
*** a/src/backend/catalog/toasting.c
--- b/src/backend/catalog/toasting.c
***
*** 16,21 
--- 16,22 
  
  #include "access/tuptoaster.h"
  #include "access/xact.h"
+ #include "catalog/binary_upgrade.h"
  #include "catalog/dependency.h"
  #include "catalog/heap.h"
  #include "catalog/index.h"
***
*** 31,38 
  #include "utils/syscache.h"
  
  /* Potentially set by contrib/pg_upgrade_support functions */
- extern Oid	binary_upgrade_next_toast_pg_class_oid;
- 
  Oid			binary_upgrade_next_toast_pg_type_oid = InvalidOid;
  
  static bool create_toast_table(Relation rel, Oid toastOid, Oid toastIndexOid,
--- 32,37 
diff --git a/src/backend/commands/typecmds.c b/src/backend/commands/typecmds.c
new file mode 100644
index d4a14ca..959b3f2
*** 

Re: [HACKERS] row security roadmap proposal

2013-12-18 Thread Craig Ringer
On 12/18/2013 11:14 PM, Robert Haas wrote:

 To be clear, I wasn't advocating for a declarative approach; I think
 predicates are fine.  There are usability issues to worry about either
 way, and my concern is that we address those.  A declarative approach
 would certainly be valuable in that, for those people for whom it is
 sufficiently flexible, it's probably quite a lot easier than writing
 predicates.  But I fear that some people will want a lot more
 generality than a label-based system can accommodate.

Having spent some time reading the HL7 spec, I now tend to agree that
labels alone are not going to be sufficient. There are security
decisions made based on:

- Security labels

- User/group/role memberships, both for accessor and target entity owner
(down to the row level)

- Black/white list ACLs, with inheritance and deny rules.

- One-off or narrow authorizations. I permit my GP to examine my mental
health record details, but only over the last year, and the
authorization stands only for today.

- Authorization assertions. "I declare that the patient told me it is OK
to access these, let me see them."

- "Break glass" access: "This is an emergency. Give me access and I'll
deal with the consequences later."


So while security labels are an important part of the picture I'm forced
to agree that they are not sufficient, and that a generalized row
security mechanism actually is necessary. We don't have the time or
resources to build all these sorts of things individually, and if we did
it'd still be too inflexible for somebody.

In the end, sometimes I guess there's no replacement for WHERE
call_some_procedure()


In particular, labels are totally orthogonal to entity identity in the
data model, and being able to make row access decisions based on
information already in the data model is important. FK relationships to
owning entities and relationships between entities must be usable for
security access policy decisions.

So: I'm withdrawing the proposal to go straight to label security; I
concede that it's insufficiently expressive to meet all possible needs,
and we don't have the time to build anything declarative and
user-friendly that would be.

I do want to make sure that whatever we include this time around does
not create problems for a future label security approach, but that's
kind of required by the desire to add SEPostgreSQL RLS down the track
anyway.

Given that: What are your specific usability concerns about the current
approach proposed in KaiGai's patch?

My main worry was that it requires the user to build everything
manually, and is potentially error prone as a result. To address that we
can build convenience features (label security, ACL types and operators,
etc) on top of the same infrastructure later - it doesn't have to go in
right away.

So putting that side, the concerns I have are:

- The current RLS design is restricted to one predicate per table with
no contextual control over when that predicate applies. That means we
can't implement anything like policy groups or overlapping sets of
predicates, anything like that has to be built by the user. We don't
need multiple predicates to start with but I want to make sure there's
room for them later, particularly so that different apps / extensions
that use row-security don't have to fight with each other.

- You can't declare a predicate then apply it to a bunch of tables with
consistent security columns. Updating/changing predicates will be a
pain. We might be able to get around that by recommending that people
use an inlineable SQL function to declare their predicates, but
"inlineable" can be hard to pin down sometimes. If not, we need
something akin to GRANT ... ALL TABLES IN SCHEMA ... for row security,
and ALTER DEFAULT PRIVILEGES ... too.

- It's too hard to see what tables have row-security and what impact it
has. Needs psql improvements.

- There's no way to control whether or not a client can see the
row-security policy definition.


The other one I want to deal with is secure session variables, but
that's mostly a performance optimisation we can add later.

What's your list?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Logging WAL when updating hintbit

2013-12-18 Thread Amit Kapila
On Wed, Dec 18, 2013 at 11:30 AM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Wed, Dec 18, 2013 at 11:22 AM, Amit Kapila amit.kapil...@gmail.com wrote:
 On Fri, Dec 13, 2013 at 7:57 PM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 Thanks, committed with some minor changes:

 Should this patch in CF app be moved to "Committed Patches" or is there
 something left for this patch?
 Nothing has been forgotten for this patch. It can be marked as committed.

Thanks for confirmation, I have marked it as Committed.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



