Re: [HACKERS] [v9.5] Custom Plan API

2014-10-26 Thread Thom Brown
On 30 September 2014 07:27, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

  On Mon, Sep 29, 2014 at 9:04 AM, Kohei KaiGai kai...@kaigai.gr.jp
 wrote:
   Do we need to match the prototype of wrapper function with callback?
 
  Yes.
 
 OK, I fixed up the part-2 patch to fit the declaration of GetSpecialCustomVar()
 with the corresponding callback.

 Also, some noise introduced into the part-3 patch by git-pull was removed.


FYI, patch v12 part 2 no longer applies cleanly.

-- 
Thom


Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT

2014-10-26 Thread Marti Raudsepp
On Wed, Oct 22, 2014 at 1:37 PM, David Rowley dgrowle...@gmail.com wrote:
 I've had a bit of a look at this and here's a couple of things:

Thanks. Sorry I didn't get back to you earlier; I almost forgot about the review.

 /*
 +* LIMIT clause can be removed if it's a positive constant or ALL, to
 +* prevent it from being an optimization barrier. It's a common meme to
 +* put LIMIT 1 within EXISTS subqueries.
 +*/

 I think this comment may be better explained along the lines of:

 A subquery which has a LIMIT clause with a positive value is effectively a
 no-op in this scenario. With EXISTS we only care about the first row anyway,
 so any positive limit value will have no behavioral change to the query, so
 we'll simply remove the LIMIT clause. If we're unable to prove that the
 LIMIT value is a positive number then we'd better not touch it.

That's a bit redundant. Here's what I wrote instead, does this sound better?

* A subquery that has a LIMIT clause with a positive value or NULL causes
* no behavioral change to the query. With EXISTS we only care about the
* first row anyway, so we'll simply remove the LIMIT clause. If the LIMIT
* value does not reduce to a constant that's positive or NULL then do not
* touch it.

 + /* Checking for negative values is done later; 0 is just silly */
 + if (!limit->constisnull && DatumGetInt64(limit->constvalue) <= 0)

 I'm a bit confused by the comment here. You're saying that we'll check for
 negatives later... but you're checking for <= 0 on the next line... Did I
 miss something or is this a mistake?

I meant that the case where a negative value raises an error is checked
later, at execution time. But you're right, this code has nothing to do
with that, so I've removed the mention of negative values. It now says:

/* If it's not NULL and not positive, keep it as a regular subquery */

 I guess here you're just testing to ensure that you've not broken this and
 that the new code does not accidentally remove the LIMIT clause. I think it
 also might be worth adding some tests to ensure that correlated EXISTS
 clauses with a positive LIMIT value get properly changed into a SEMI JOIN
 instead of remaining as a subplan. And also make sure that a LIMIT 0 or
 LIMIT -1 gets left as a subplan. I'd say such a test would supersede the
 above test, and I think it's also the case you were talking about optimising
 anyway?

Sure, this is a planner-only change so it makes sense to test EXPLAIN output.

The dilemma I always have is: wouldn't this cause failures due to plan
instability if, for example, autoanalyze gets around to this table
earlier or later and nudges it from a hash join to a merge join, etc.? Or
is this not a problem?
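For illustration, the simplification rule being settled on here can be sketched in Python (a hypothetical model of the decision, not the patch's C code):

```python
# Hypothetical model of the LIMIT-stripping rule for EXISTS subqueries:
# the clause may be dropped only when the limit expression reduces to a
# constant that is NULL or strictly positive; otherwise it must be kept.
def can_strip_limit(limit_const):
    """limit_const is None for a NULL constant (LIMIT ALL), else an int."""
    if limit_const is None:
        return True          # LIMIT NULL / LIMIT ALL is a no-op
    return limit_const > 0   # EXISTS stops at the first row anyway

assert can_strip_limit(None)    # LIMIT ALL
assert can_strip_limit(1)       # the common "LIMIT 1" meme
assert not can_strip_limit(0)   # LIMIT 0 changes the result
assert not can_strip_limit(-1)  # negative limits error out at execution
```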

Patch attached with all above changes.

Regards,
Marti
From 28543dda9febe8d8b5fc91060a4323c08f3c4a8a Mon Sep 17 00:00:00 2001
From: Marti Raudsepp ma...@juffo.org
Date: Wed, 1 Oct 2014 02:17:21 +0300
Subject: [PATCH] Simplify EXISTS subqueries containing LIMIT

---
 src/backend/optimizer/plan/subselect.c  | 42 ++-
 src/test/regress/expected/subselect.out | 51 +
 src/test/regress/sql/subselect.sql  | 14 +
 3 files changed, 100 insertions(+), 7 deletions(-)

diff --git a/src/backend/optimizer/plan/subselect.c b/src/backend/optimizer/plan/subselect.c
index 3e7dc85..66d1b90 100644
--- a/src/backend/optimizer/plan/subselect.c
+++ b/src/backend/optimizer/plan/subselect.c
@@ -70,7 +70,7 @@ static Node *convert_testexpr_mutator(Node *node,
 static bool subplan_is_hashable(Plan *plan);
 static bool testexpr_is_hashable(Node *testexpr);
 static bool hash_ok_operator(OpExpr *expr);
-static bool simplify_EXISTS_query(Query *query);
+static bool simplify_EXISTS_query(PlannerInfo *root, Query *query);
 static Query *convert_EXISTS_to_ANY(PlannerInfo *root, Query *subselect,
 	  Node **testexpr, List **paramIds);
 static Node *replace_correlation_vars_mutator(Node *node, PlannerInfo *root);
@@ -452,7 +452,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 	 * If it's an EXISTS subplan, we might be able to simplify it.
 	 */
 	if (subLinkType == EXISTS_SUBLINK)
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);
 
 	/*
 	 * For an EXISTS subplan, tell lower-level planner to expect that only the
@@ -518,7 +518,7 @@ make_subplan(PlannerInfo *root, Query *orig_subquery,
 		/* Make a second copy of the original subquery */
 		subquery = (Query *) copyObject(orig_subquery);
 		/* and re-simplify */
-		simple_exists = simplify_EXISTS_query(subquery);
+		simple_exists = simplify_EXISTS_query(root, subquery);
 		Assert(simple_exists);
 		/* See if it can be converted to an ANY query */
 		subquery = convert_EXISTS_to_ANY(root, subquery,
@@ -1359,7 +1359,7 @@ convert_EXISTS_sublink_to_join(PlannerInfo *root, SubLink *sublink,
 	 * targetlist, we have to fail, because the pullup operation leaves us
 	 * with 

Re: [HACKERS] TAP test breakage on MacOS X

2014-10-26 Thread Peter Eisentraut
On 10/9/14 3:38 PM, Robert Haas wrote:
 The problem is that running initdb --not-a-valid-option leaves $? set
 to 256, as seems entirely unsurprising.  After running the anonymous
 block passed to it, Test::Builder::subtest calls Test::Build::finalize
 which calls Test::Build::_ending, which sets $real_exit_code to $? -
 i.e. 256.  That function later throws up if $real_exit_code isn't 0,
 hence the failure.
 
 Off-hand, I'm not quite sure why this works for anyone.

Thank you for this analysis.  The problem is that the version of
Test::Simple that ships with Perl 5.12 is broken in this particular way.
I have committed a fix to skip the tests in that case.

You might want to review whether this is really the Perl installation
that you want to use.  It looks like it is coming from MacPorts, and
they have stopped updating their perl package.

I have run the PostgreSQL build against all Perl major versions between
5.8 and 5.20, and there have been no more unaddressed version-related
issues.
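As an aside on the 256 in Robert's analysis: that is the raw POSIX wait status rather than the child's exit code, a detail Python exposes the same way Perl's $? does (hypothetical illustration, unrelated to the committed fix):

```python
import os
import sys

# Spawn a child that exits with code 1. On POSIX, os.system() returns the
# raw wait status, which packs the exit code into the high byte: 1 << 8 == 256.
status = os.system(f"{sys.executable} -c 'raise SystemExit(1)'")
if os.name == "posix":
    assert status == 256
    assert os.WEXITSTATUS(status) == 1  # decode back to the real exit code
```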



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TAP test breakage on MacOS X

2014-10-26 Thread Andrew Dunstan


On 10/07/2014 01:57 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

I don't much like the idea of doing an install/initdb/start for every
directory in src/bin, though. Can't we at least manage a single
installation directory for all these?

Peter had a patch to eliminate the overhead of multiple subinstalls;
not sure where that stands, but presumably it would address your issue.


Is there any progress on this? I'm reluctant to add this to the
buildfarm client until it's solved. These tests currently take a heck of 
a lot longer than any other test suite.


cheers

andrew





Re: [HACKERS] TAP test breakage on MacOS X

2014-10-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 10/07/2014 01:57 PM, Tom Lane wrote:
 Peter had a patch to eliminate the overhead of multiple subinstalls;
 not sure where that stands, but presumably it would address your issue.

 Is there any progress on this? I'm reluctant to add this to the
 buildfarm client until it's solved. These tests currently take a heck of 
 a lot longer than any other test suite.

While I'd like to see that patch committed to cut the runtime of make
check in contrib, it's hardly the only stumbling block between us and
enabling TAP tests in the buildfarm.

The pathname length problem I noted in
http://www.postgresql.org/message-id/16477.1413831...@sss.pgh.pa.us
seems like a show-stopper as well, since undoubtedly a number of
buildfarm critters are using buildroots with paths long enough to
trigger it.

The larger issue though is that even with both the above things fixed,
the TAP tests would still be an expensive no-op on the majority of
buildfarm members.  AFAICT, I do not own a single machine on which the
current TAP tests will consent to run, and in most cases that's after
going out of my way to fetch CPAN modules that aren't in the vendor Perl
installs.  Peter's mostly been fixing the portability issues by disabling
tests, which I guess is better than no fix at all, but it leaves darn
little useful functionality.

I think we need a serious discussion about choosing a baseline Perl
version on which we need the TAP tests to work, and then some effort
to make the tests actually work (not just skip tests) on all versions
beyond that.

regards, tom lane




Re: [HACKERS] Function array_agg(array)

2014-10-26 Thread Pavel Stehule
Hi

My idea is to use a new ArrayBuilder optimized for building multidimensional
arrays, with its own state type. I think casting to ArrayBuildState is the
root of our problems, so I would rather not do that. The code in array_agg_*
is simple; the code in nodeSubplan.c is a little more complex. Some schematic
changes are in the attachments.

Regards

Pavel



2014-10-25 15:58 GMT+02:00 Ali Akbar the.ap...@gmail.com:

 Can you check it? We can test how much performance we lose. As a second
 benefit, we get numbers for introducing the new optimized array builder.


 array_agg(anyarray) with deconstruct_array, unchanged accumArrayResult and
 makeMdArrayResult:

 INSERT 0 1
 Time: 852,527 ms
 INSERT 0 1
 Time: 844,275 ms
 INSERT 0 1
 Time: 858,855 ms
 INSERT 0 1
 Time: 861,072 ms
 INSERT 0 1
 Time: 952,006 ms
 INSERT 0 1
 Time: 953,918 ms
 INSERT 0 1
 Time: 926,945 ms
 INSERT 0 1
 Time: 923,692 ms
 INSERT 0 1
 Time: 940,916 ms
 INSERT 0 1
 Time: 948,700 ms
 INSERT 0 1
 Time: 933,333 ms
 INSERT 0 1
 Time: 948,869 ms
 INSERT 0 1
 Time: 847,113 ms
 INSERT 0 1
 Time: 908,572 ms



 Total: 12776.83

 Avg: 912,63


 with last patch (v10):

 INSERT 0 1
 Time: 643,339 ms
 INSERT 0 1
 Time: 608,010 ms
 INSERT 0 1
 Time: 610,465 ms
 INSERT 0 1
 Time: 613,931 ms
 INSERT 0 1
 Time: 616,466 ms
 INSERT 0 1
 Time: 634,754 ms
 INSERT 0 1
 Time: 683,566 ms
 INSERT 0 1
 Time: 656,665 ms
 INSERT 0 1
 Time: 630,096 ms
 INSERT 0 1
 Time: 607,564 ms
 INSERT 0 1
 Time: 610,353 ms
 INSERT 0 1
 Time: 626,816 ms
 INSERT 0 1
 Time: 610,450 ms
 INSERT 0 1
 Time: 614,342 ms



 Total: 8842,7
 Avg: 631,6


 It's 30% faster (I tried a varlena element, text). I tried it several
 times and the result is consistently around 30%.


 A quick & dirty non-optimized patch and the test script are attached.

 Regards,
 --
 Ali Akbar

ExecScanSubPlan(...)

   ArrayBuildState *astate = NULL;
   MdArrayBuildState *mdastate = NULL;
   bool use_md_array_builder;

   if (subLinkType == MULTIEXPR_SUBLINK)
   {
   }

   if (subLinkType == ARRAY_SUBLINK)
   {
   Assert(subplan->firstColType == tdesc->attrs[0]->atttypid);

   /* use a fast multidimensional array builder when the input is an array */
   use_md_array_builder = OidIsValid(get_element_type(subplan->firstColType));
   }
   
   ...
   
   for (slot = ExecProcNode() ...)
   {


  if (subLinkType == ARRAY_SUBLINK)
  {
 Datum   dvalue;
 booldisnull;

 found = true;
 dvalue = slot_getattr(slot, 1, &disnull);

 if (use_md_array_builder)
mdastate = accumMdArray(...);
 else
astate = accumArrayResult(...);
  }

   } /* endfor */

   if (subLinkType == ARRAY_SUBLINK)
   {
  ..
  if (astate != NULL)
node->curArray = makeArrayResult(astate, ...);
  else if (mdastate != NULL)
node->curArray = makeMdArray(mdastate, ...);
  else
  {
  }
   }
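As a semantic reference for what the schematic code above is building, here is a hypothetical Python model (names are illustrative, not from the patch): aggregating equal-length 1-D arrays yields the rows of one 2-D array.

```python
# Toy model of accumulating 1-D array inputs into a multidimensional result,
# mirroring what accumMdArray()/makeMdArray() are sketched to do in C.
def accum_md_array(state, elem):
    if state and len(elem) != len(state[0]):
        raise ValueError("multidimensional arrays must have matching dimensions")
    state.append(list(elem))
    return state

state = []
for row in ([1, 2, 3], [4, 5, 6]):
    state = accum_md_array(state, row)
assert state == [[1, 2, 3], [4, 5, 6]]  # a 2-D (2 x 3) result
```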

   



Re: [HACKERS] TAP test breakage on MacOS X

2014-10-26 Thread Andrew Dunstan


On 10/26/2014 12:29 PM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

On 10/07/2014 01:57 PM, Tom Lane wrote:

Peter had a patch to eliminate the overhead of multiple subinstalls;
not sure where that stands, but presumably it would address your issue.

Is there any progress on this? I'm reluctant to add this to the
buildfarm client until it's solved. These tests currently take a heck of
a lot longer than any other test suite.

While I'd like to see that patch committed to cut the runtime of make
check in contrib, it's hardly the only stumbling block between us and
enabling TAP tests in the buildfarm.

The pathname length problem I noted in
http://www.postgresql.org/message-id/16477.1413831...@sss.pgh.pa.us
seems like a show-stopper as well, since undoubtedly a number of
buildfarm critters are using buildroots with paths long enough to
trigger it.



+1 for fixing that, although it seems like a problem in what's being 
tested rather than in the test suite.




The larger issue though is that even with both the above things fixed,
the TAP tests would still be an expensive no-op on the majority of
buildfarm members.  AFAICT, I do not own a single machine on which the
current TAP tests will consent to run, and in most cases that's after
going out of my way to fetch CPAN modules that aren't in the vendor Perl
installs.  Peter's mostly been fixing the portability issues by disabling
tests, which I guess is better than no fix at all, but it leaves darn
little useful functionality.

I think we need a serious discussion about choosing a baseline Perl
version on which we need the TAP tests to work, and then some effort
to make the tests actually work (not just skip tests) on all versions
beyond that.





As far as the buildfarm goes, we could make it a cheap noop by checking 
for the presence of the required modules (AFAIK that's Test::More, 
IPC::Cmd and IPC::Run).


I agree that just having it not run tests on most platforms is hardly a 
solution.
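A sketch of such a probe (hypothetical shell, not buildfarm client code; Tom's point below about also checking versions would need an extra step, e.g. comparing each module's $VERSION):

```shell
#!/bin/sh
# Probe for the Perl modules the TAP suite needs before paying for an
# install/initdb cycle; turn the run into a cheap no-op if any is absent.
missing=0
for mod in Test::More IPC::Cmd IPC::Run; do
    if perl -M"$mod" -e 1 2>/dev/null; then
        echo "$mod: present"
    else
        echo "$mod: missing"
        missing=1
    fi
done
[ "$missing" -eq 0 ] || echo "skipping TAP tests"
```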



cheers

andrew




Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Andrew Dunstan


On 10/16/2014 04:12 PM, Pavel Stehule wrote:



1. missing documentation

2. I miss more comments related to this functions. This code is 
relative simple, but some more explanation can be welcome.


3. why these functions are marked as stable?




New patch:

Docs added, functions marked immutable, more comments added.

cheers

andrew
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..352b408 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10716,6 +10716,19 @@ table2-mapping
 </programlisting>
</entry>
   </row>
+  <row>
+   <entry><para><literal>json_strip_nulls(from_json json)</literal>
+ </para><para><literal>jsonb_strip_nulls(from_json jsonb)</literal>
+   </para></entry>
+   <entry><para><type>json</type></para><para><type>jsonb</type></para></entry>
+   <entry>
+ Returns <replaceable>from_json</replaceable>
+ with all object fields that have null values omitted. Other null values
+ are untouched.
+   </entry>
+   <entry><literal>json_strip_nulls('[{"f1":1,"f2":null},2,null,3]')</literal></entry>
+   <entry><literal>[{"f1":1},2,null,3]</literal></entry>
+   </row>
  </tbody>
 </tgroup>
   </table>
@@ -10752,6 +10765,16 @@ table2-mapping
 </para>
   </note>
 
+  <note>
+<para>
+  If the argument to <literal>json_strip_nulls</> contains duplicate
+  field names in any object, the result could be semantically somewhat
+  different, depending on the order in which they occur. This is not an
+  issue for <literal>jsonb_strip_nulls</> since jsonb values never have
+  duplicate object field names.
+</para>
+  </note>
+
   <para>
 See also <xref linkend="functions-aggregate"> for the aggregate
 function <function>json_agg</function> which aggregates record
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 2d00dbe..06db3e4 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -105,6 +105,15 @@ static void populate_recordset_object_end(void *state);
 static void populate_recordset_array_start(void *state);
 static void populate_recordset_array_element_start(void *state, bool isnull);
 
+/* semantic action functions for json_strip_nulls */
+static void sn_object_start(void *state);
+static void sn_object_end(void *state);
+static void sn_array_start(void *state);
+static void sn_array_end(void *state);
+static void sn_object_field_start (void *state, char *fname, bool isnull);
+static void sn_array_element_start (void *state, bool isnull);
+static void sn_scalar(void *state, char *token, JsonTokenType tokentype);
+
 /* worker function for populate_recordset and to_recordset */
 static Datum populate_recordset_worker(FunctionCallInfo fcinfo, const char *funcname,
 		  bool have_record_arg);
@@ -225,6 +234,13 @@ typedef struct PopulateRecordsetState
 	MemoryContext fn_mcxt;		/* used to stash IO funcs */
 } PopulateRecordsetState;
 
+/* state for json_strip_nulls */
+typedef struct StripnullState{
+	JsonLexContext *lex;
+	StringInfo  strval;
+	bool skip_next_null;
+} StripnullState;
+
 /* Turn a jsonb object into a record */
 static void make_row_from_rec_and_jsonb(Jsonb *element,
 			PopulateRecordsetState *state);
@@ -2996,3 +3012,184 @@ findJsonbValueFromContainerLen(JsonbContainer *container, uint32 flags,
 
 	return findJsonbValueFromContainer(container, flags, &k);
 }
+
+/*
+ * Semantic actions for json_strip_nulls.
+ *
+ * Simply repeat the input on the output unless we encounter
+ * a null object field. State for this is set when the field
+ * is started and reset when the scalar action (which must be next)
+ * is called.
+ */
+
+static void
+sn_object_start(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '{');
+}
+
+static void
+sn_object_end(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '}');
+}
+
+static void
+sn_array_start(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '[');
+}
+
+static void
+sn_array_end(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, ']');
+}
+
+static void
+sn_object_field_start (void *state, char *fname, bool isnull)
+{
+	StripnullState *_state = (StripnullState *) state;
+
+	if (isnull)
+	{
+		/*
+		 * The next thing must be a scalar or isnull couldn't be true,
+		 * so there is no danger of this state being carried down
+		 * into a nested object or array. The flag will be reset in the
+		 * scalar action.
+		 */
+		_state->skip_next_null = true;
+		return;
+	}
+
+	if (_state->strval->data[_state->strval->len - 1] != '{')
+		appendStringInfoCharMacro(_state->strval, ',');
+
+	/*
+	 * Unfortunately we don't have the quoted and escaped string any more,
+	 * so we have to re-escape it.
+	 */
+	escape_json(_state->strval, fname);
+
+	appendStringInfoCharMacro(_state->strval, ':');
+}
+
+static 

Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Pavel Stehule
Hi

I have a question,

what is expected result of null strip of

{"a": {"b": null, "c", null} }

?

2014-10-26 19:59 GMT+01:00 Andrew Dunstan and...@dunslane.net:


 On 10/16/2014 04:12 PM, Pavel Stehule wrote:



 1. missing documentation

 2. I miss more comments related to this functions. This code is relative
 simple, but some more explanation can be welcome.

 3. why these functions are marked as stable?



 New patch:

 Docs added, functions marked immutable, more comments added.

 cheers

 andrew



Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Andrew Dunstan


On 10/26/2014 03:50 PM, Pavel Stehule wrote:

Hi

I have a question,

what is expected result of null strip of

{"a": {"b": null, "c", null} }

?




Please remember not to top-post.

The above is not legal json, so the answer would be an error.

cheers

andrew




Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Thom Brown
On 26 October 2014 20:07, Andrew Dunstan and...@dunslane.net wrote:


 On 10/26/2014 03:50 PM, Pavel Stehule wrote:

 Hi

 I have a question,

 what is expected result of null strip of

 {"a": {"b": null, "c", null} }

 ?



 Please remember not to top-post.

 The above is not legal json, so the answer would be an error.


I believe Pavel means:

{"a": {"b": null, "c": null} }

-- 
Thom


Re: [HACKERS] Performance regression: 9.2+ vs. ScalarArrayOpExpr vs. ORDER BY

2014-10-26 Thread Tom Lane
I wrote:
 Andrew Gierth and...@tao11.riddles.org.uk writes:
 Bruce == Bruce Momjian br...@momjian.us writes:
 Bruce Uh, did this ever get addressed?

 It did not.

 It dropped off the radar screen (I think I'd assumed the patch would
 appear in the next commitfest, which it didn't unless I missed something).
 I'll make a note to look at it once I've finished with the timezone
 abbreviation mess.

I've pushed this patch with some further redesign of build_index_paths'
API --- if we're going to have it reporting about what it found, we
should extend that to the other case of non-amsearcharray indexes too.

regards, tom lane




Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Pavel Stehule
2014-10-26 21:18 GMT+01:00 Andrew Dunstan and...@dunslane.net:


 On 10/26/2014 04:14 PM, Thom Brown wrote:

  On 26 October 2014 20:07, Andrew Dunstan and...@dunslane.net wrote:


 On 10/26/2014 03:50 PM, Pavel Stehule wrote:

 Hi

 I have a question,

 what is expected result of null strip of

 {"a": {"b": null, "c", null} }

 ?



 Please remember not to top-post.

 The above is not legal json, so the answer would be an error.


 I believe Pavel means:

 {"a": {"b": null, "c": null} }


 This is the expected result:

andrew=# select json_strip_nulls('{"a": {"b": null, "c": null} }');
 json_strip_nulls
------------------
 {"a":{}}
(1 row)


 It is NOT expected that we replace an empty object with NULL (and then
 strip it if it's a field value of an outer level object).


ok,

This case should be in regress test probably

Thank you

Pavel



 cheers

 andrew




Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Andrew Dunstan


On 10/26/2014 04:14 PM, Thom Brown wrote:
On 26 October 2014 20:07, Andrew Dunstan and...@dunslane.net wrote:



On 10/26/2014 03:50 PM, Pavel Stehule wrote:

Hi

I have a question,

what is expected result of null strip of

{"a": {"b": null, "c", null} }

?



Please remember not to top-post.

The above is not legal json, so the answer would be an error.


I believe Pavel means:

{"a": {"b": null, "c": null} }


This is the expected result:

   andrew=# select json_strip_nulls('{"a": {"b": null, "c": null} }');
 json_strip_nulls
   ------------------
 {"a":{}}
   (1 row)


It is NOT expected that we replace an empty object with NULL (and then 
strip it if it's a field value of an outer level object).
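The behaviour described here can be modelled in Python (a hypothetical re-implementation for illustration only, not the patch code): null-valued object fields are removed at every nesting level, while nulls that are array elements, or an object emptied by stripping, are left alone.

```python
import json

def strip_nulls(v):
    # Remove null-valued *object fields* recursively; keep nulls that are
    # array elements, and keep objects that become empty after stripping.
    if isinstance(v, dict):
        return {k: strip_nulls(x) for k, x in v.items() if x is not None}
    if isinstance(v, list):
        return [strip_nulls(x) for x in v]
    return v

assert strip_nulls(json.loads('{"a": {"b": null, "c": null} }')) == {"a": {}}
assert strip_nulls(json.loads('[{"f1":1,"f2":null},2,null,3]')) == [{"f1": 1}, 2, None, 3]
```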


cheers

andrew





Re: [HACKERS] TAP test breakage on MacOS X

2014-10-26 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 On 10/26/2014 12:29 PM, Tom Lane wrote:
 The pathname length problem I noted in
 http://www.postgresql.org/message-id/16477.1413831...@sss.pgh.pa.us
 seems like a show-stopper as well, since undoubtedly a number of
 buildfarm critters are using buildroots with paths long enough to
 trigger it.

 +1 for fixing that, although it seems like a problem in what's being 
 tested rather than in the test suite.

I agree, but nonetheless we don't want the buildfarm turning mostly
red because we enable TAP before fixing this.

 The larger issue though is that even with both the above things fixed,
 the TAP tests would still be an expensive no-op on the majority of
 buildfarm members.

 As far as the buildfarm goes, we could make it a cheap noop by checking 
 for the presence of the required modules (AFAIK that's Test::More, 
 IPC::Cmd and IPC::Run).

You'd probably have to check not just presence but version; but yeah,
that is a potential solution to the cycle-wastage problem.

 I agree that just having it not run tests on most platforms is hardly a 
 solution.

It doesn't do much to make the tests actually useful, for sure ...

regards, tom lane




Re: [HACKERS] INSERT ... ON CONFLICT {UPDATE | IGNORE}

2014-10-26 Thread Peter Geoghegan
On Sat, Oct 25, 2014 at 8:12 AM, Robert Haas robertmh...@gmail.com wrote:
 Generating index paths for the UPDATE is a waste of cycles.
 Theoretically, there could be an (a, b, c) unique index and a (c,b,a)
 unique index, and those two might have a non-equal cost to scan. But
 that almost certainly isn't going to happen in practice, since that's
 a rather questionable indexing strategy, and even when it does, you're
 going to have to insert into all the unique indexes a good proportion
 of the time anyway, making the benefits of that approach pale in
 comparison to the costs.

 I don't care whether you actually generate index-paths or not, and in
 fact I suspect it makes no sense to do so.  But I do care that you do
 a cost comparison between the available indexes and pick the one that
 looks cheapest.  If somebody's got a bloated index and builds a new
 one, they will want it to get used even before they drop the old one.

That seems extremely narrow. I am about ready to just do what you say,
by costing the index based on something like relpages, but for the
record I see no point. If I see no point, and you're sure that there
is a point, then there is a danger that I'll miss the point, which
contributes to my giving push back. I know I said that already, but I
reiterate it once more for emphasis.

 Even if that weren't an issue, though, the fact that you can't
 re-parse but you can re-plan is a killer point AFAICS. It means you
 are borked if the statement gets re-executed after the index you
 picked is dropped.

That simply isn't the case. As specifically noted in comments in the
patch, relcache invalidation works in such a way as to invalidate the
query proper, even when only an index has been dropped. For a time
during development, the semantic dependence was more directly
represented by actually having extract_query_dependencies() extract
the arbitrating unique index pg_class Oid as a dependency for the
benefit of the plan cache, but I ultimately decided against doing
anything more than noting the potential future problem (the problem
that arises in a world where relcache invalidation is more selective).

But, I digress: I'll have inference occur during planning during the
next revision since you think that's important.

 From my point of view, I spent a significant amount of time making the
 patch more or less match your proposed design for unique index
 inference. It is discouraging to hear that you think I'm not
 cooperating with community process. I'm doing my best. I think it
 would be a bad idea for me to not engage with the community for an
 extended period at this point. There were plenty of other issues
 address by V1.3 that were not the CONFLICTING()/EXCLUDING thing that
 you highlighted (or the other thing you highlighted around where to do
 unique index inference).

 I think this gets at a point that has been bugging me and, perhaps,
 other people here.  You often post a new patch with some fixes but
 without fixes for the issues that reviewers have indicated are
 top-of-mind for them.  Sometimes, but not always, you say something
 like I know this doesn't fix X but I'd like comments on other aspects
 of the patch.  Even when you do, though, it can be difficult to
 overlook something that appears to be a fundamental structural problem
 to comment on details, and sometimes it feels like that's what we're
 being asked to do.

I think that it's accurate to say that I asked the community to do
that once, last year. I was even very explicit about it at the time.
I'm surprised that you think that's the case now, though. I learned my
lesson a little later: don't do that, because fellow contributors are
unwilling to go along with it, even for something as important as
this.

As recently as a few months ago, you wanted me to go a *completely*
different direction with this (i.e. attempt an UPDATE first). I'm
surprised that you're emphasizing the EXCLUDED()/CONFLICTING() thing
so much. Should I take it that I more or less have your confidence
that the way the patch fits together at a high level is sound?
Because, that's the only way I can imagine you'd think that the
EXCLUDED()/CONFLICTING() thing is of such fundamental importance at
this point. It is after all split out as a much smaller commit on top
of the main implementation (a commit which could easily be deferred).
While it's important, it is very clearly subsidiary to the overall
structure of my design, which I think that there has been precious
little discussion of on this thread (e.g. the whole way I use
EvalPlanQual(), the general idea of a special auxiliary plan that is
executed in a special way, etc). That concerns me, because I suspect
that the reason there has been next to no discussion of those aspects
is because they're particularly hard things to have an opinion on, and
yet are the things of immediate importance. Please correct me if I
have it wrong.

I am in no position to demand that you or anyone else discuss any
particular aspect 

Re: [HACKERS] [BUGS] ltree::text not immutable?

2014-10-26 Thread Tom Lane
I wrote:
 More generally, it seems like we ought to have a test in the type_sanity
 regression script that checks that type I/O functions aren't volatile,
 ...

 Actually, the right thing to do if we want to enforce this is for
 CREATE TYPE to check the marking.  We'd still need a type_sanity
 test to check erroneous declarations of built-in types, but a complaint
 in CREATE TYPE would cover all the contrib modules as well as third-party
 code.

 I wouldn't propose back-patching such an error check, but it seems
 reasonable to add it for 9.5.  Any objections?

On the way to fixing this, I was unpleasantly surprised to discover that
we have one I/O function in contrib that *actually is* volatile, and not
just thoughtlessly marked that way.  That's chkpass_in(), which is
volatile because it goes to the trouble of using random() to produce a
new password salt value on every call.

Not entirely sure what to do about this.  It seems like for the purposes
of contrib/chkpass, it's a good thing that chkpass_in() won't reuse the
same salt.  Weak as a 12-bit salt might be nowadays, it's still better
than no salt.  Nonetheless, this behavior is breaking assumptions made
in places like array_in and record_in.

For the moment I'm tempted to mark chkpass_in as stable (with a comment
explaining that it isn't really) just so we can put in the error check
in CREATE TYPE.  But I wonder if anyone has a better idea.
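A toy model of the problem (hypothetical Python, not chkpass itself): because a fresh salt is drawn on every call, the same input text produces different stored values, which is exactly what an immutable input function must never do.

```python
import hashlib
import secrets

def toy_chkpass_in(password: str) -> str:
    # Fresh random salt per call -- this is the volatility in question.
    salt = secrets.token_hex(2)  # stand-in for chkpass's 12-bit crypt() salt
    return salt + hashlib.sha256((salt + password).encode()).hexdigest()[:13]

def toy_chkpass_eq(stored: str, password: str) -> bool:
    # Comparison re-derives the hash using the salt embedded in the stored value.
    salt = stored[:4]
    return stored == salt + hashlib.sha256((salt + password).encode()).hexdigest()[:13]

samples = {toy_chkpass_in("hunter2") for _ in range(4)}
assert len(samples) > 1                                   # same input, different output
assert all(toy_chkpass_eq(s, "hunter2") for s in samples)  # yet all verify correctly
```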

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strip nulls functions for json and jsonb

2014-10-26 Thread Andrew Dunstan


On 10/26/2014 04:22 PM, Pavel Stehule wrote:



2014-10-26 21:18 GMT+01:00 Andrew Dunstan and...@dunslane.net:



On 10/26/2014 04:14 PM, Thom Brown wrote:

On 26 October 2014 20:07, Andrew Dunstan and...@dunslane.net wrote:


On 10/26/2014 03:50 PM, Pavel Stehule wrote:

Hi

I have a question,

what is expected result of null strip of

{"a": {"b": null, "c", null} }

?



Please remember not to top-post.

The above is not legal json, so the answer would be an error.


I believe Pavel means:

{"a": {"b": null, "c": null} }


This is the expected result:

   andrew=# select json_strip_nulls('{"a": {"b": null, "c": null} }');
    json_strip_nulls
   ------------------
    {"a":{}}
   (1 row)


It is NOT expected that we replace an empty object with NULL (and
then strip it if it's a field value of an outer level object).


ok,

This case should be in regress test probably




Patch attached.

cheers

andrew
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 7e5bcd9..352b408 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10716,6 +10716,19 @@ table2-mapping
 </programlisting>
   </entry>
   </row>
+  <row>
+   <entry><para><literal>json_strip_nulls(from_json json)</literal>
+ </para><para><literal>jsonb_strip_nulls(from_json jsonb)</literal>
+   </para></entry>
+   <entry><para><type>json</type></para><para><type>jsonb</type></para></entry>
+   <entry>
+ Returns <replaceable>from_json</replaceable>
+ with all object fields that have null values omitted. Other null values
+ are untouched.
+   </entry>
+   <entry><literal>json_strip_nulls('[{"f1":1,"f2":null},2,null,3]')</literal></entry>
+   <entry><literal>[{"f1":1},2,null,3]</literal></entry>
+  </row>
  </tbody>
 </tgroup>
 </table>
@@ -10752,6 +10765,16 @@ table2-mapping
 </para>
   </note>
 
+  <note>
+    <para>
+      If the argument to <literal>json_strip_nulls</> contains duplicate
+      field names in any object, the result could be semantically somewhat
+      different, depending on the order in which they occur. This is not an
+      issue for <literal>jsonb_strip_nulls</> since jsonb values never have
+      duplicate object field names.
+    </para>
+  </note>
+
   <para>
 See also <xref linkend="functions-aggregate"> for the aggregate
 function <function>json_agg</function> which aggregates record
diff --git a/src/backend/utils/adt/jsonfuncs.c b/src/backend/utils/adt/jsonfuncs.c
index 2d00dbe..06db3e4 100644
--- a/src/backend/utils/adt/jsonfuncs.c
+++ b/src/backend/utils/adt/jsonfuncs.c
@@ -105,6 +105,15 @@ static void populate_recordset_object_end(void *state);
 static void populate_recordset_array_start(void *state);
 static void populate_recordset_array_element_start(void *state, bool isnull);
 
+/* semantic action functions for json_strip_nulls */
+static void sn_object_start(void *state);
+static void sn_object_end(void *state);
+static void sn_array_start(void *state);
+static void sn_array_end(void *state);
+static void sn_object_field_start (void *state, char *fname, bool isnull);
+static void sn_array_element_start (void *state, bool isnull);
+static void sn_scalar(void *state, char *token, JsonTokenType tokentype);
+
 /* worker function for populate_recordset and to_recordset */
 static Datum populate_recordset_worker(FunctionCallInfo fcinfo, const char *funcname,
 		  bool have_record_arg);
@@ -225,6 +234,13 @@ typedef struct PopulateRecordsetState
 	MemoryContext fn_mcxt;		/* used to stash IO funcs */
 } PopulateRecordsetState;
 
+/* state for json_strip_nulls */
+typedef struct StripnullState{
+	JsonLexContext *lex;
+	StringInfo  strval;
+	bool skip_next_null;
+} StripnullState;
+
 /* Turn a jsonb object into a record */
 static void make_row_from_rec_and_jsonb(Jsonb *element,
 			PopulateRecordsetState *state);
@@ -2996,3 +3012,184 @@ findJsonbValueFromContainerLen(JsonbContainer *container, uint32 flags,
 
	return findJsonbValueFromContainer(container, flags, &k);
 }
+
+/*
+ * Semantic actions for json_strip_nulls.
+ *
+ * Simply repeat the input on the output unless we encounter
+ * a null object field. State for this is set when the field
+ * is started and reset when the scalar action (which must be next)
+ * is called.
+ */
+
+static void
+sn_object_start(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '{');
+}
+
+static void
+sn_object_end(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '}');
+}
+
+static void
+sn_array_start(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, '[');
+}
+
+static void
+sn_array_end(void *state)
+{
+	StripnullState *_state = (StripnullState *) state;
+	appendStringInfoCharMacro(_state->strval, ']');
+}

[HACKERS] proposal: CREATE DATABASE vs. (partial) CHECKPOINT

2014-10-26 Thread Tomas Vondra
Hi all,

currently, CREATE DATABASE forces an immediate checkpoint (actually, it
forces two, but the time interval is usually rather small). For
traditional deployments this is not a big deal, because creating a
database is a rare event, and may be planned to off-peak times.

However for shared (cloud-like) deployments, this is not the case. E.g.
we're hosting hundreds (or even thousands) of customer databases on some
clusters, and creating a new database is quite common.

This turns the checkpoints into a significant pain point for us, because
it forces a write of all the dirty buffers from all the databases. No
matter how well we tune the spread checkpoints, this makes it
inefficient and causes significant I/O spikes (especially with larger
shared_buffer values). It also leads to high duration of the CREATE
DATABASE command, making it rather impractical for 'interactive' use (a
user hitting a button in a UI or something).

Based on the talks from pgconf.eu, where I've seen this mentioned in at
least two talks (and in the hallway track), we're not alone. I'd like to
address this, if possible.

The usual workaround for this is create the databases in advance but
that's not always possible (e.g. when having more than handful of
templates, or when the template evolves over time).

After eyeballing the code for an hour or two, I think CREATE DATABASE
should be fine with performing only a 'partial checkpoint' on the
template database - calling FlushDatabaseBuffers and processing unlink
requests, as suggested by the comment in createdb().

It's not exactly trivial change, but it does not seem frighteningly
difficult coding either.

The templates are usually static, so this would minimize both the CREATE
DATABASE duration and disruption to the cluster it causes.

My fear however is that this while the code will work, it will break the
recovery in some subtle way (as illustrated by the comments about 8.0
PITR bugs in createdb).

Am I missing something that makes this dead in the water?

regards
Tomas




Re: [HACKERS] narwhal and PGDLLIMPORT

2014-10-26 Thread Noah Misch
On Wed, Oct 22, 2014 at 12:12:36AM -0400, Noah Misch wrote:
 On Mon, Oct 20, 2014 at 01:03:31AM -0400, Noah Misch wrote:
  I reproduced narwhal's problem using its toolchain on another 32-bit Windows
  Server 2003 system.  The crash happens at the SHGetFolderPath() call in
  pqGetHomeDirectory().  A program can acquire that function via shfolder.dll 
  or
  via shell32.dll; we've used the former method since commit 889f038, for 
  better
  compatibility[1] with Windows NT 4.0.  On this system, shfolder.dll's 
  version
  loads and unloads shell32.dll.  In PostgreSQL built using this older 
  compiler,
  shfolder.dll:SHGetFolderPath() unloads libpq in addition to unloading 
  shell32!
  That started with commit 846e91e.  I don't expect to understand the 
  mechanism
  behind it, but I recommend we switch back to linking libpq with shell32.dll.
  The MSVC build already does that in all supported branches, and it feels 
  right
  for the MinGW build to follow suit in 9.4+.  Windows versions that lack the
  symbol in shell32.dll are now ancient history.
 
 That allowed narwhal to proceed a bit further than before, but it crashes
 later in the dblink test suite.  Will test again...

Calls to ldap_init() exhibit the same problem shfolder.dll:SHGetFolderPath()
exhibited: it loads and unloads some DLLs, and it manages to unload libpq in
passing.  There's nothing comparable to the above workaround this time, but I
found a more fundamental fix.

Each DLL header records the DLL's presumed name, viewable with objdump -x
libpq.dll | grep ^Name.  Per the GNU ld documentation, one can override it in
a .def file:

  The optional LIBRARY name command indicates the internal name of the
  output DLL. If `name' does not include a suffix, the default library
  suffix, `.DLL' is appended.

libpqdll.def uses LIBRARY LIBPQ, which yields an internal name of
LIBPQ.dll under MinGW-w64 i686-4.9.1-release-win32-dwarf-rt_v3-rev1.  Our
MSVC build process also gives LIBPQ.dll.  Under narwhal's older toolchain,
libpq.dll gets a name of just LIBPQ.  The attached patch brings the product
of narwhal's toolchain in line with the others.  I don't know by what magic
wldap32.dll:ldap_init() and shfolder.dll:SHGetFolderPath() care about finding
.dll in the names of loaded libraries, but it does fix the bug.

This erases the impetus for my recent commit 53566fc.  I'm inclined to keep
that commit in the name of consistency with the MSVC build, but one could
argue for reverting its 9.4 counterpart.  I don't feel strongly either way, so
I expect to let 2f51f42 stand.

FYI, the problem is reproducible in a 32-bit PostgreSQL build running on
32-bit or 64-bit Windows Server 2003.  It does not occur on 64-bit Windows
Server 2008, even after marking postgres.exe to run in the compatibility mode
for Windows Server 2003.  (I did not try 32-bit Windows Server 2008.)
commit 912846f (HEAD, master)
Author: Noah Misch n...@leadboat.com
AuthorDate: Sun Oct 26 18:02:55 2014 -0400
Commit: Noah Misch n...@leadboat.com
CommitDate: Sun Oct 26 18:17:34 2014 -0400

MinGW: Include .dll extension in .def file LIBRARY commands.

Newer toolchains append the extension implicitly if missing, but
buildfarm member narwhal (gcc 3.4.2, ld 2.15.91 20040904) does not.
This affects most core libraries having an exports.txt file, namely
libpq and the ECPG support libraries.  On Windows Server 2003, Windows
API functions that load and unload DLLs internally will mistakenly
unload a libpq whose DLL header reports LIBPQ instead of LIBPQ.dll.
When, subsequently, control would return to libpq, the backend crashes.
Back-patch to 9.4, like commit 846e91e0223cf9f2821c3ad4dfbb929cb027.
Before that commit, we used a different linking technique that yielded
libpq.dll in the DLL header.

Commit 53566fc0940cf557416b13252df57350a4511ce4 worked around this by
eliminating a call to a function that loads and unloads DLLs internally.
That commit is no longer necessary for correctness, but its improving
consistency with the MSVC build remains valid.

diff --git a/src/Makefile.shlib b/src/Makefile.shlib
index 012dd12..e3a06ef 100644
--- a/src/Makefile.shlib
+++ b/src/Makefile.shlib
@@ -423,13 +423,13 @@ UC_NAME = $(shell echo $(NAME) | tr 'abcdefghijklmnopqrstuvwxyz' 'ABCDEFGHIJKLMNOPQRSTUVWXYZ')
 
 lib$(NAME)dll.def: $(SHLIB_EXPORTS)
 	echo '; DEF file for MS VC++' >$@
-	echo 'LIBRARY LIB$(UC_NAME)' >>$@
+	echo 'LIBRARY LIB$(UC_NAME).dll' >>$@
 	echo 'EXPORTS' >>$@
 	sed -e '/^#/d' -e 's/^\(.*[ 	]\)\([0-9][0-9]*\)/\1@ \2/' $< >>$@
 
 lib$(NAME)ddll.def: $(SHLIB_EXPORTS)
 	echo '; DEF file for MS VC++' >$@
-	echo 'LIBRARY LIB$(UC_NAME)D' >>$@
+	echo 'LIBRARY LIB$(UC_NAME)D.dll' >>$@
 	echo 'EXPORTS' >>$@
 	sed -e '/^#/d' -e 's/^\(.*[ 	]\)\([0-9][0-9]*\)/\1@ \2/' $< >>$@
 


[HACKERS] pset_quoted_string is broken

2014-10-26 Thread David Rowley
It seems the buffer created in pset_quoted_string is just 1 char too small.

This breaks psql's \pset for me, though I've no idea why the buildfarm is
not complaining a bit more.

As it stands, if the function is given an empty string to quote, it tries
to build a string with 2 single quotes and a NUL. This needs 3 chars, not 2.

The attached simple patch fixes the problem.
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index cb94ce3..26089352 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -2711,7 +2711,7 @@ pset_bool_string(bool val)
 static char *
 pset_quoted_string(const char *str)
 {
-   char   *ret = pg_malloc(strlen(str) * 2 + 2);
+   char   *ret = pg_malloc(strlen(str) * 2 + 3);
char   *r = ret;
 
*r++ = '\'';



Re: [HACKERS] narwhal and PGDLLIMPORT

2014-10-26 Thread Tom Lane
Noah Misch n...@leadboat.com writes:
 Calls to ldap_init() exhibit the same problem shfolder.dll:SHGetFolderPath()
 exhibited: it loads and unloads some DLLs, and it manages to unload libpq in
 passing.  There's nothing comparable to the above workaround this time, but I
 found a more fundamental fix.
 ...
 libpqdll.def uses LIBRARY LIBPQ, which yields an internal name of
 LIBPQ.dll under MinGW-w64 i686-4.9.1-release-win32-dwarf-rt_v3-rev1.  Our
 MSVC build process also gives LIBPQ.dll.  Under narwhal's older toolchain,
 libpq.dll gets a name of just LIBPQ.  The attached patch brings the product
 of narwhal's toolchain in line with the others.  I don't know by what magic
 wldap32.dll:ldap_init() and shfolder.dll:SHGetFolderPath() care about finding
 .dll in the names of loaded libraries, but it does fix the bug.

Nice detective work!

 This erases the impetus for my recent commit 53566fc.  I'm inclined to keep
 that commit in the name of consistency with the MSVC build, but one could
 argue for reverting its 9.4 counterpart.  I don't feel strongly either way, so
 I expect to let 2f51f42 stand.

Yeah, I lean the same way.  The fewer differences in the results of our
assorted Windows build processes, the better.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_dump/pg_restore seem broken on hamerkop

2014-10-26 Thread Tatsuo Ishii
 Buildfarm member hamerkop has been failing in the pg_upgrade regression
 test for the last several days.  The problem looks like this:
 
 command: 
 C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore
 --port 50432 --username Administrator --exit-on-error --verbose --dbname 
 postgres pg_upgrade_dump_12145.custom >> pg_upgrade_dump_12145.log 2>&1
 pg_restore: connecting to database for restore
 pg_restore: [archiver (db)] Error while INITIALIZING:
 pg_restore: [archiver (db)] could not execute query: ERROR:  invalid byte 
 sequence for encoding UTF8: 0x93
 
 I can't help noticing that this started immediately after commit
 0eea804 pg_dump: Reduce use of global variables.  No idea why
 the issue is only showing up on this one animal.
 
 I guess one of the possibilities is there's garbage in memory which is
 related to the restore process. If so, coverity may be able to find
 something. The last coverity analysis was Oct 12. The commit 0eea804
 was made on Oct 14. So let's see what new coverity scan reports...

The latest coverity scan report did not include particular defects
related to pg_dump/pg_restore...

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] pset_quoted_string is broken

2014-10-26 Thread Tom Lane
David Rowley dgrowle...@gmail.com writes:
 It seems the buffer created in pset_quoted_string is just 1 char too small.

Yeah, that's a bug.  Fix pushed, thanks!

 This breaks psql's \pset for me, though I've no idea why the buildfarm is
 not complaining a bit more.

I think in most cases, maxalign padding of the malloc allocation would
result in there being room for another byte without trashing anything
important.  You must be using a libc that notices and complains about
even 1-byte buffer overruns.

regards, tom lane




Re: [HACKERS] pg_dump/pg_restore seem broken on hamerkop

2014-10-26 Thread Alvaro Herrera
Tatsuo Ishii wrote:
  Buildfarm member hamerkop has been failing in the pg_upgrade regression
  test for the last several days.  The problem looks like this:
  
  command: 
  C:/buildfarm/build_root/HEAD/pgsql.build/contrib/pg_upgrade/tmp_check/install/bin/pg_restore
   --port 50432 --username Administrator --exit-on-error --verbose --dbname 
  postgres pg_upgrade_dump_12145.custom >> pg_upgrade_dump_12145.log 2>&1
  pg_restore: connecting to database for restore
  pg_restore: [archiver (db)] Error while INITIALIZING:
  pg_restore: [archiver (db)] could not execute query: ERROR:  invalid byte 
  sequence for encoding UTF8: 0x93
  
  I can't help noticing that this started immediately after commit
  0eea804 pg_dump: Reduce use of global variables.  No idea why
  the issue is only showing up on this one animal.
 
 I guess one of the possibilities is there's garbage in memory which is
 related to the restore process.

That sounds most likely.  The complete error in hamerkop's log is:

pg_restore: connecting to database for restore
pg_restore: [archiver (db)] Error while INITIALIZING:
pg_restore: [archiver (db)] could not execute query: ERROR:  invalid byte 
sequence for encoding UTF8: 0x93
Command was: -- Started on 2014-10-26 03:06:00 “Œ‹ž (•W€Žž)

This "Started on" business comes from pg_backup_archiver.c, which has

if (AH->public.verbose)
    dumpTimestamp(AH, "Started on", AH->createDate);

where dumpTimestamp is

/*
 * dumpTimestamp
 */
static void
dumpTimestamp(ArchiveHandle *AH, const char *msg, time_t tim)
{
    char        buf[64];

    if (strftime(buf, sizeof(buf), "%Y-%m-%d %H:%M:%S %z", localtime(&tim)) != 0)
        ahprintf(AH, "-- %s %s\n\n", msg, buf);
}

So this seems related to the %z part of the strftime() call.  I have no
explanation for this failure ATM; maybe pg_restore is failing to set the
locale properly?  I also notice pg_restore.c previously included
pg_backup_archiver.h (which in turn includes time.h); strftime
requires time.h so maybe this is causing a problem, but since
pg_restore.c itself is not calling strftime, I don't see how this would
be related.

[Some more code and git-log reading later]  I see that the %z is a very
recent addition: it only got there as of commit ad5d46a449, of September
5th ... and now I also see that hamerkop's last green run before the
failure, on Oct 13th, did *not* include the pg_upgrade check.  So I'm
thinking this was broken much earlier than 0eea804.

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] pg_dump/pg_restore seem broken on hamerkop

2014-10-26 Thread Tom Lane
Alvaro Herrera alvhe...@2ndquadrant.com writes:
 So this seems related to the %z part of the strftime() call.  I have no
 explanation for this failure ATM; maybe pg_restore is failing to set the
 locale properly?  I also notice pg_restore.c previously included
 pg_backup_archiver.h (which in turn includes time.h); strftime
 requires time.h so maybe this is causing a problem, but since
 pg_restore.c itself is not calling strftime, I don't see how this would
 be related.

Hm.  %z ought not be locale-dependent ... however, it has a bigger
problem, which is that it's a C99-ism.  It's not there in SUSv2,
which is our normal baseline for what's portable.  I think we need
to get rid of that.  %Z should be portable.

(Is it possible that Windows' strftime() reads %z as doing something
other than what C99 says?)

 [Some more code and git-log reading later]  I see that the %z is a very
 recent addition: it only got there as of commit ad5d46a449, of September
 5th ... and now I also see that hamerkop's last green run before the
 failure, on Oct 13th, did *not* include the pg_upgrade check.  So I'm
 thinking this was broken much earlier than 0eea804.

Ooohh ... you are right, the first failing build involved not only
the pg_dump refactoring commit, but an upgrade in the buildfarm script
that hamerkop was using (from 4.4 to 4.14).  So it's entirely possible
this issue was already there and we just weren't seeing it tested.

regards, tom lane




Re: [HACKERS] pg_dump/pg_restore seem broken on hamerkop

2014-10-26 Thread Tom Lane
I wrote:
 Hm.  %z ought not be locale-dependent ... however, it has a bigger
 problem, which is that it's a C99-ism.  It's not there in SUSv2,
 which is our normal baseline for what's portable.  I think we need
 to get rid of that.  %Z should be portable.

 (Is it possible that Windows' strftime() reads %z as doing something
 other than what C99 says?)

A bit of googling leads me to Microsoft reference material saying that
their strftime treats %z and %Z alike.  So in point of fact, the
assumption underlying commit ad5d46a4494b0b48 was flat out wrong.
Switching to %z doesn't get you out of the problem noted in the
comments it removed:

/*
 * We don't print the timezone on Win32, because the names are long and
 * localized, which means they may contain characters in various random
 * encodings; this has been seen to cause encoding errors when reading the
 * dump script.
 */

I'm going to go revert most of that commit and make the code like it
was before:

    if (strftime(buf, sizeof(buf),
#ifndef WIN32
                 "%Y-%m-%d %H:%M:%S %Z",
#else
                 "%Y-%m-%d %H:%M:%S",
#endif
                 localtime(&now)) != 0)

If somebody wants to have timezones printed on Windows, they can
try again, but they're gonna have to work harder than this.

It's possible that fixing this will not fix whatever's bothering
hamerkop, but given the lack of backward-portability of %z, this
code has got to go anyway.

regards, tom lane




Re: [HACKERS] [PATCH] Simplify EXISTS subqueries containing LIMIT

2014-10-26 Thread David Rowley
On Mon, Oct 27, 2014 at 1:56 AM, Marti Raudsepp ma...@juffo.org wrote:

 On Wed, Oct 22, 2014 at 1:37 PM, David Rowley dgrowle...@gmail.com
 wrote:
  I've had a bit of a look at this and here's a couple of things:

 Thanks. Sorry I didn't get back to you earlier, I almost forgot about the
 review.

  /*
  +* LIMIT clause can be removed if it's a positive constant or
 ALL, to
  +* prevent it from being an optimization barrier. It's a common
 meme to put
  +* LIMIT 1 within EXISTS subqueries.
  +*/
 
  I think this comment may be better explained along the lines of:
 
  A subquery which has a LIMIT clause with a positive value is
 effectively a
  no-op in this scenario. With EXISTS we only care about the first row
 anyway,
  so any positive limit value will have no behavioral change to the query,
 so
  we'll simply remove the LIMIT clause. If we're unable to prove that the
  LIMIT value is a positive number then we'd better not touch it.

 That's a bit redundant. Here's what I wrote instead, does this sound
 better?

 * A subquery that has a LIMIT clause with a positive value or NULL causes
 * no behavioral change to the query. With EXISTS we only care about the
 * first row anyway, so we'll simply remove the LIMIT clause. If the LIMIT
 * value does not reduce to a constant that's positive or NULL then do not
 * touch it.


Sounds better.


  + /* Checking for negative values is done later; 0 is just silly */
  + if (!limit->constisnull && DatumGetInt64(limit->constvalue) <= 0)
 
  I'm a bit confused by the comment here. You're saying that we'll check
 for
 negatives later... but you're checking for <= 0 on the next line... Did I
  miss something or is this a mistake?

 I meant that the case when a negative value raises an error is checked
 later in the execution pass. But you're right, this code has nothing
 to do with that so I've removed the mention about negative values. It
 now says:

 /* If it's not NULL and not positive, keep it as a regular subquery */


Also better.


  I guess here you're just testing to ensure that you've not broken this
 and
  that the new code does not accidentally remove the LIMIT clause. I think
 it
  also might be worth adding some tests to ensure that co-related EXISTS
  clauses with a positive LIMIT value get properly changed into a SEMI JOIN
  instead of remaining as a subplan. And also make sure that a LIMIT 0 or
  LIMIT -1 gets left as a subplan. I'd say such a test would supersede the
  above test, and I think it's also the case you were talking about
 optimising
  anyway?

 Sure, this is a planner-only change so it makes sense to test EXPLAIN
 output.

 The dilemma I always have is: wouldn't this cause failures due to plan
 instability, if for example autoanalyze gets around to this table
 earlier or later and nudges it from a hash join to merge etc? Or is
 this not a problem?


I guess it could be a problem. You could write your tests to use tenk1
instead, it seems to be analyzed in copy.sql

I'm reasonably happy with the patch now, the only small niggles are maybe
the Assert() is probably not so much needed as transformLimitClause() seems
to coerce to int8 anyway, and recompute_limits() does not bother with the
Assert() when it does the same thing, either way, it's certainly doing no
harm. The other thing was the regression tests, I'd rather see the tests
use 2 separate tables for the EXISTS checks, but this is likely only
because I'm thinking of doing some work in the future to remove duplicate
joins from queries, so perhaps it's fine for now.

Good work. Marking as ready for committer.

Regards

David Rowley




Patch attached with all above changes.

 Regards,
 Marti



Re: [HACKERS] pset_quoted_string is broken

2014-10-26 Thread David Rowley
On Mon, Oct 27, 2014 at 12:20 PM, Tom Lane t...@sss.pgh.pa.us wrote:

 David Rowley dgrowle...@gmail.com writes:
  It seems the buffer created in pset_quoted_string is just 1 char too
 small.

 Yeah, that's a bug.  Fix pushed, thanks!


Thanks for committing.


  This breaks psql's \pset for me, though I've no idea why the buildfarm is
  not complaining a bit more.

 I think in most cases, maxalign padding of the malloc allocation would
 result in there being room for another byte without trashing anything
 important.  You must be using a libc that notices and complains about
 even 1-byte buffer overruns.



I'm using MSVC.
After a bit of reading it seems like when compiled in debug mode that
malloc() uses something called _malloc_dbg() which allocates a bit more
memory to allow for more strict checking of buffer overruns.

http://msdn.microsoft.com/en-us/library/974tc9t1.aspx

I guess all the MSVC buildfarm members must be compiled in release mode
then? I wonder if it would be worth changing one to build with debug as it
seem like none of the buildfarm animals picked this up despite there being
a regression test to ensure \pset works.

Regards

David Rowley


Re: [HACKERS] Reducing lock strength of adding foreign keys

2014-10-26 Thread Andreas Karlsson

On 10/24/2014 06:07 PM, Robert Haas wrote:

I think instead of focusing on foreign keys, we should rewind a bit
and think about the locking level required to add a trigger.


Agreed.


As far as triggers are concerned, the issue of skew between the
transaction snapshot and what the ruleutils.c snapshots do seems to be
the principal issue.  Commit e5550d5fec66aa74caad1f79b79826ec64898688
changed pg_get_constraintdef() to use the transaction snapshot rather than a
current MVCC snapshot; if that change is safe, I am not aware of any
reason why we couldn't change pg_get_triggerdef() similarly. Barring
further hazards I haven't thought of, I would expect that we could add
a trigger to a relation with only ShareRowExclusiveLock.


Thanks for the info. This is just the kind of issues I was worrying about.


Anything
less than ShareRowExclusiveLock would open up strange timing races
around the firing of triggers by transactions writing the table: they
might or might not notice that a trigger had been added before
end-of-transaction, depending on the timing of cache flushes, which
certainly seems no good.  But even ShareRowExclusiveLock seems like a large
improvement over AccessExclusiveLock.


Would not ShareLock give the same result, except for also allowing 
concurrent CREATE INDEX and concurrent other CREATE TRIGGER which does 
not look dangerous to me?


From a user point of view ShareRowExclusiveLock should be as useful as 
ShareLock.



When a constraint trigger - which is used to implement a foreign key -
is added, there are actually TWO tables involved: the table upon which
the trigger will actually fire, and some other table which is
mentioned in passing in the trigger definition.  It's possible that
the locking requirements for the secondary table are weaker since I
don't think the presence of the trigger actually affects runtime
behavior there.  However, there's probably little point in trying to
weaken the lock to less than the level required for the main table
because a foreign key involves adding referential integrity triggers
to both tables.

So I tentatively propose (and with due regard for the possibility
others may see dangers that I've missed) that a reasonable goal would
be to lower the lock strength required for both CREATE TRIGGER and ADD
FOREIGN KEY from AccessExclusiveLock to ShareRowExclusiveLock,
allowing concurrent SELECT and SELECT FOR SHARE against the tables,
but not any actual write operations.


Agreed.. But I think reducing the lock level of the secondary table is 
much more important than doing the same for the primary table due to the 
case where the secondary table is an existing table which is hit by a 
workload of long running queries and DML while the primary is a new 
table which is added now. In my dream world I could add the new table 
without any disruption at all of queries using the secondary table, no 
matter the duration of the transaction adding the table (barring 
insertion of actual data into the primary table, which would take row 
locks).


This is just a dream scenario though, and focusing on triggers is indeed 
the reasonable goal for 9.5.


--
Andreas Karlsson




Re: [HACKERS] Function array_agg(array)

2014-10-26 Thread Ali Akbar
2014-10-27 1:38 GMT+07:00 Pavel Stehule pavel.steh...@gmail.com:

 Hi

 My idea is using a new ArrayBuilder optimized for building multidimensional
 arrays with its own state type. I think casting to ArrayBuildState is the base
 of our problems, so I don't want to do that. Code in array_agg_* is simple,
 little bit more complex code is in nodeSubplan.c. Some schematic changes
 are in attachments.


Thanks! The structure looks clear, and thanks for the example on
nodeSubplan.c. I will restructure the v10 of the patch to this structure.

Regards,
--
Ali Akbar


Re: [HACKERS] [BUGS] ltree::text not immutable?

2014-10-26 Thread Jim Nasby

On 10/26/14, 3:46 PM, Tom Lane wrote:

I wrote:

More generally, it seems like we ought to have a test in the type_sanity
regression script that checks that type I/O functions aren't volatile,
...



Actually, the right thing to do if we want to enforce this is for
CREATE TYPE to check the marking.  We'd still need a type_sanity
test to check erroneous declarations of built-in types, but a complaint
in CREATE TYPE would cover all the contrib modules as well as third-party
code.



I wouldn't propose back-patching such an error check, but it seems
reasonable to add it for 9.5.  Any objections?


On the way to fixing this, I was unpleasantly surprised to discover that
we have one I/O function in contrib that *actually is* volatile, and not
just thoughtlessly marked that way.  That's chkpass_in(), which is
volatile because it goes to the trouble of using random() to produce a
new password salt value on every call.

Not entirely sure what to do about this.  It seems like for the purposes
of contrib/chkpass, it's a good thing that chkpass_in() won't reuse the
same salt.  Weak as a 12-bit salt might be nowadays, it's still better
than no salt.  Nonetheless, this behavior is breaking assumptions made
in places like array_in and record_in.

For the moment I'm tempted to mark chkpass_in as stable (with a comment
explaining that it isn't really) just so we can put in the error check
in CREATE TYPE.  But I wonder if anyone has a better idea.
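
To make the breakage concrete, here is a sketch of the problematic behavior
(the output values are illustrative only; actual salts vary per call):

```sql
CREATE EXTENSION chkpass;

-- Two parses of the same literal yield different stored values,
-- because chkpass_in() picks a fresh random salt each time:
SELECT 'secret'::chkpass::text AS first,
       'secret'::chkpass::text AS second;
-- first and second differ (different salts), even though the
-- input text was identical.

-- array_in and record_in assume that re-parsing the same input
-- produces the same datum, which this violates.
```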


The check could explicitly ignore chkpass_in. That's pretty grotty, but perhaps 
less so than marking it stable when it's not...
--
Jim Nasby, Data Architect, Blue Treble Consulting
Data in Trouble? Get it in Treble! http://BlueTreble.com




Re: [HACKERS] alter user/role CURRENT_USER

2014-10-26 Thread Rushabh Lathia
Thanks Kyotaro,

I just took a quick look at the patch, and it does cover more syntax than
the earlier patch. But it still doesn't cover all parts of the syntax where
we can use CURRENT_USER/CURRENT_ROLE/USER/SESSION_USER. For example:

-- Not working
alter default privileges for role current_user grant SELECT ON TABLES TO
current_user ;

-- Not working
grant user1 TO current_user;

There might be a few more forms like this.

I understand that the patch is getting slightly bigger and more complex
than what was originally proposed. Before going back to more review on the
latest patch, I would like to understand the requirements of this new
feature. I would also like others to comment on where/how we should
restrict this feature.
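
For reference, a few of the forms the patch is meant to accept (a sketch;
exact behavior depends on the patch under review, and the object names are
made up):

```sql
-- Resolve to the role executing the statement:
ALTER ROLE CURRENT_USER SET statement_timeout = '5s';

-- Forms covered by the revised patch:
CREATE SCHEMA IF NOT EXISTS app AUTHORIZATION CURRENT_USER;
ALTER TABLE mytab OWNER TO SESSION_USER;
ALTER USER MAPPING FOR CURRENT_USER SERVER myserver
    OPTIONS (SET "user" 'remote_name');
```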

On Fri, Oct 24, 2014 at 1:59 PM, Kyotaro HORIGUCHI 
horiguchi.kyot...@lab.ntt.co.jp wrote:

 Hi, here is the revised patch.

 Attached files are the followings

  - 0001-ALTER-ROLE-CURRENT_USER_v2.patch  - the patch.

  - testset.tar.bz2 - test set. Run by typing 'make check' as a
superuser of the running postgreSQL server. It creates testdb
and some roles.

 Documents are not edited this time.

 

 Considering your comments, I found more points to modify.

  - CREATE SCHEMA (IF NOT EXISTS) .. AUTHORIZATION role ...

  - ALTER AGGREGATE/COLLATION/etc... OWNER TO role

  - CREATE/ALTER/DROP USER MAPPING FOR role SERVER ..

 GRANT/REVOKE also take a role as an argument, but CURRENT_USER and
 the similar keywords seem to be useless for them.

 Finally, the new patch modifies the following points.

 In gram.y,

  - RoleId's are replaced with RoleId_or_curruser in more places.
It accepts CURRENT_USER/USER/CURRENT_ROLE/SESSION_USER.

  - ALTER USER ALL syntax is added. (not changed from the previous patch)

  - The non-terminal auth_ident now uses RoleId_or_curruser
instead of RoleId. This affects CREATE/ALTER/DROP USER MAPPING

 In user.c, a new function ResolveRoleId() is added and used for all
 role ID resolutions that correspond to the syntax changes in the
 parser, e.g. in AlterRole() in user.c.

 In foreigncmds.c, GetUserOidFromMapping() is removed and
 ResolveRoleId is used instead.

 In alter.c and tablecmds.c, all calls to get_role_oid()
 corresponding to the grammar change were replaced with
 ResolveRoleId().

 The modifications are a bit complicated so I provided a
 comprehensive test set.


 regards,

 --
 Kyotaro Horiguchi
 NTT Open Source Software Center




-- 
Rushabh Lathia