Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Christian Kruse
Hi,

On 28/01/14 22:35, Tom Lane wrote:
  Absolutely.  Probably best to save errno into a local just before the
  ereport.
 
  I think just resetting to edata->saved_errno is better and sufficient?
 
 -1 --- that field is nobody's business except elog.c's.

Ok, so I propose the attached patch as a fix.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 8705586..f40215a 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -715,6 +715,7 @@ errcode_for_socket_access(void)
 	{ \
 		char		   *fmtbuf; \
 		StringInfoData	buf; \
+		int saved_errno = errno; \
 		/* Internationalize the error format string */ \
 		if (translateit && !in_error_recursion_trouble()) \
 			fmt = dgettext((domain), fmt);  \
@@ -744,6 +745,7 @@ errcode_for_socket_access(void)
 			pfree(edata->targetfield); \
 		edata->targetfield = pstrdup(buf.data); \
 		pfree(buf.data); \
+		errno = saved_errno; \
 	}
 
 /*
@@ -756,6 +758,7 @@ errcode_for_socket_access(void)
 		const char	   *fmt; \
 		char		   *fmtbuf; \
 		StringInfoData	buf; \
+		int saved_errno = errno; \
 		/* Internationalize the error format string */ \
 		if (!in_error_recursion_trouble()) \
 			fmt = dngettext((domain), fmt_singular, fmt_plural, n); \
@@ -787,6 +790,7 @@ errcode_for_socket_access(void)
 			pfree(edata->targetfield); \
 		edata->targetfield = pstrdup(buf.data); \
 		pfree(buf.data); \
+		errno = saved_errno; \
 	}
 
 


pgpnd3GqyaztU.pgp
Description: PGP signature


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Rajeev rastogi
On 28th January, Mitsumasa KONDO wrote:
  2014-01-26 Simon Riggs si...@2ndquadrant.com
  mailto:si...@2ndquadrant.com
 
   On 21 January 2014 19:48, Simon Riggs si...@2ndquadrant.com
   mailto:si...@2ndquadrant.com wrote:
 On 21 January 2014 12:54, KONDO Mitsumasa
  kondo.mitsum...@lab.ntt.co.jp
   mailto:kondo.mitsum...@lab.ntt.co.jp wrote:
 Rebased patch is attached.

 Does this fix the Windows bug reported by Kumar on
 20/11/2013 ?
  Sorry, I was misunderstanding. First name of Mr. Rajeev Rastogi is
  Kumar! I searched only e-mail address and title by his name...
 
  I don't have a Windows compiler environment, but the attached patch might
 be
  fixed.
  Could I ask Mr. Rajeev Rastogi to test my patch again?
 
  I tried to test this but I could not apply the patch on latest git
 HEAD.
  This may be because of recent patch (related to pg_stat_statement
 only
  pg_stat_statements external query text storage ), which got
  committed on 27th January.
 Thank you for trying to test my patch. As you say, the recent commit
 changes pg_stat_statements.c a lot. So I have to revise my patch.
 Please wait for a while.
 
 By the way, latest pg_stat_statement might affect performance in
 Windows system.
 Because it uses fflush() system call every creating new entry in
 pg_stat_statements, and it calls many fread() to warm file cache. It
 works well in Linux system, but I'm not sure in Windows system. If you
 have time, could you test it on your Windows system? If it affects
  performance a lot, we can still change it.


No issue, you can share the test cases with me, and I will take the performance report.

Thanks and Regards,
Kumar Rajeev Rastogi


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: regexp_matches variant returning an array of matching positions

2014-01-29 Thread Björn Harrtell
I'll elaborate on the use case. I have OCR-scanned text for a large number
of images, corresponding to one row per image. I want to match against
words in another table. I need two result sets, one with all matched words
and one with only the first matched word within the first 50 chars of the
OCR-scanned text. Having the matched position in the first result set makes
it easy to produce the second.

I cannot find the position using the substring because I use word
boundaries in my regexp.

Returning a SETOF named composite makes sense, so I could try to make such
a function instead if there is interest. Perhaps a good name for such a
function would be simply regexp_match or regexp_search (as in Python).
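
To sketch what I mean (the function name, the output columns match text[]
and position int[], and the ocr_pages table are all just placeholders, not
a concrete proposal):

  -- all matched words, with their starting positions
  SELECT p.id, m.match, m.position
  FROM ocr_pages p, regexp_match(p.body, '\m(foo|bar)\M', 'g') AS m;

  -- only the first match that starts within the first 50 chars of each text
  SELECT DISTINCT ON (p.id) p.id, m.match
  FROM ocr_pages p, regexp_match(p.body, '\m(foo|bar)\M', 'g') AS m
  WHERE m.position[1] <= 50
  ORDER BY p.id, m.position[1];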

/Björn


2014-01-29 David Johnston pol...@yahoo.com

 Alvaro Herrera-9 wrote
  Björn Harrtell wrote:
  I've written a variant of regexp_matches called regexp_matches_positions
  which instead of returning matching substrings will return matching
  positions. I found use of this when processing OCR scanned text and
  wanted
  to prioritize matches based on their position.
 
  Interesting.  I didn't read the patch but I wonder if it would be of
  more general applicability to return more info in one fell swoop: a
  function returning a set (position, length, text of match), rather than
  an array.  So instead of first calling one function to get the match and
  then their positions, do it all in one pass.
 
  (See pg_event_trigger_dropped_objects for a simple example of a function
  that returns in that fashion.  There are several others but AFAIR that's
  the simplest one.)

 Confused as to your thinking. Like regexp_matches this returns SETOF
 type[].  In this case integer but text for the matches.  I could see
 adding
 a generic function that returns a SETOF named composite (match varchar[],
 position int[], length int[]) and the corresponding type.  I'm not
 imagining
 a situation where you'd want the position but not the text and so having to
 evaluate the regexp twice seems wasteful.  The length is probably a waste
 though since it can readily be gotten from the text and is less often
 needed.  But if it's pre-calculated anyway...

 My question is what position is returned in a multiple-match situation? The
 supplied test only covers the simple, non-global, situation.  It needs to
 exercise empty sub-matches and global searches.  One theory is that the
 first array slot should cover the global position of match zero (i.e., the
 entire pattern) within the larger document while sub-matches would be
 relative offsets within that single match.  This conflicts, though, with
 the
 fact that _matches only returns array elements for () items and never for
 the full match - the goal in this function being parallel un-nesting. But
 as nesting is allowed it is still possible for this to occur.

 How does this resolve in the patch?

 SELECT regexp_matches('abcabc','((a)(b)(c))','g');

 David J.







 --
 View this message in context:
 http://postgresql.1045698.n5.nabble.com/Patch-regexp-matches-variant-returning-an-array-of-matching-positions-tp5789321p5789414.html
 Sent from the PostgreSQL - hackers mailing list archive at Nabble.com.




Re: [HACKERS] proposal: hide application_name from other users

2014-01-29 Thread Dave Page
On Tue, Jan 28, 2014 at 8:17 PM, Stephen Frost sfr...@snowman.net wrote:
 Greg,

 * Greg Stark (st...@mit.edu) wrote:
 On Tue, Jan 28, 2014 at 11:56 AM, Josh Berkus j...@agliodbs.com wrote:
  For example, I would really like to GRANT an unpriv user access to the
  WAL columns in pg_stat_replication so that I can monitor replication
  delay without granting superuser permissions.

 So you can do this now by defining a security definer function that
 extracts precisely the information you need and grant execute access
 to precisely the users you want. There was some concern upthread about
 defining security definer functions being tricky but I'm not sure what
 conclusion to draw from that argument.

 Yeah, but that sucks if you want to build a generic monitoring system
 like check_postgres.pl.  Telling users to grant certain privileges may
 work out, telling them to install these pl/pgsql things you write as
 security-definer-to-superuser isn't going to be nearly as easy when
 these users are (understandably, imv) uncomfortable having a monitor
  role have superuser privs.

I couldn't agree more. Whatever we do here we need a standard
mechanism that tool developers can expect to be present and the same
on all servers. Otherwise, we make it extremely difficult to build
tools like pgAdmin, check_postgres.pl and so on.
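
For reference, the security definer approach Greg describes amounts to
something like this sketch (the function and role names are made up, and it
has to be created by a superuser so it can see the WAL columns that are
hidden from unprivileged users):

  CREATE FUNCTION replication_progress()
  RETURNS TABLE (client_addr inet, state text,
                 sent_location text, replay_location text)
  LANGUAGE sql SECURITY DEFINER AS $$
      SELECT client_addr, state, sent_location, replay_location
      FROM pg_stat_replication;
  $$;
  REVOKE ALL ON FUNCTION replication_progress() FROM PUBLIC;
  GRANT EXECUTE ON FUNCTION replication_progress() TO monitoring;

That works for a single site, but as soon as a tool has to rely on it, every
installation needs to agree on the exact name and signature - which is
exactly the standardisation problem above.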

-- 
Dave Page
Blog: http://pgsnake.blogspot.com
Twitter: @pgsnake

EnterpriseDB UK: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-29 Thread Dean Rasheed
On 28 January 2014 20:16, Florian Pflug f...@phlo.org wrote:
 On Jan27, 2014, at 23:28 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 strict transfn vs non-strict inv_transfn
 

 This case is explicitly forbidden, both in CREATE AGGREGATE and in the
 executor. To me, that seems overly restrictive --- if transfn is
 strict, then you know for sure that no NULL values are added to the
 aggregate state, so surely it doesn't matter whether or not
 inv_transfn is strict. It will never be asked to remove NULL values.

 I think there are definite use-cases where a user might want to use a
 pre-existing function as the inverse transition function, so it seems
 harsh to force them to wrap it in a strict function in this case.

 AFAICS, the code in advance/retreat_windowaggregate should just work
 if those checks are removed.

 True. It didn't in earlier versions of the patch, because the advance
 and retreat functions looked at the strict settings directly, but now that
 there's an ignore_nulls flag in the executor node, the only requirement
 should be that ignore_nulls is set if either of the transition functions
 is strict.

 I'm not sure that the likelihood of someone wanting to combine a strict
  forward with a non-strict inverse function is very high, but neither


Me neither, but the checks to forbid it aren't adding anything, and I
think it's best to make it as flexible as possible.


 non-strict transfn vs strict inv_transfn
 

 At first this seems as though it must be an error --- the forwards
 transition function allows NULL values into the aggregate state, and
 the inverse transition function is strict, so it cannot remove them.

 But actually what the patch is doing in this case is treating the
 forwards transition function as partially strict --- it won't be
 passed NULL values (at least in a window context), but it may be
 passed a NULL state in order to build the initial state when the first
 non-NULL value arrives.

 Exactly.

 It looks like this behaviour is intended to support aggregates that
 ignore NULL values, but cannot actually be made to have a strict
 transition function. I think typically this is because the aggregate's
  initial state is NULL and its state type differs from the type of the
 values being aggregated, and so can't be automatically created from
 the first non-NULL value.

  Yes. I added this because the alternative would have been to count
 non-NULL values in most of the existing SUM() aggregates.

 That all seems quite ugly though, because now you have a transition
 function that is not strict, which is passed NULL values when used in
 a normal aggregate context, and not passed NULL values when used in a
 window context (whether sliding or not), except for the NULL state for
 the first non-NULL value.

 Ugh, true. Clearly, nodeAgg.c needs to follow what nodeWindowAgg.c does
 and skip NULL inputs if the aggregate has a strict inverse transition
  function. The fact that we never actually *call* the inverse doesn't
  matter. Will fix that once we've decided on the various strictness issues.


Yuk!


 I'm not sure if there is a better way to do it though. If we disallow
 this case, these aggregates would have to use non-strict functions for
 both the forward and inverse transition functions, which means they
 would have to implement their own non-NULL value counting.
 Alternatively, allowing strict transition functions for these
 aggregates would require that we provide some other way to initialise
 the state from the first non-NULL input, such as a new initfn.

 I'm not sure an initfn would really help. It would allow us to return
 to the initial requirement that the strict settings match - but you
 already deem the check that the forward transition function can only
 be strict if the inverse is also overly harsh, so would that really be
  an improvement? It'd also cost us an additional pg_proc entry per aggregate.

 Another idea would be to have an explicit nulls=ignore|process option
 for CREATE AGGREGATE. If nulls=process and either of the transition
 functions are strict, we could either error out, or simply do what
  normal function calls do and pretend they return NULL for NULL inputs.
 Not sure how the rule that forward transition functions may not return
 NULL if there's an inverse transition function would fit in if we do
 the latter, though.

 The question is - is it worth it the effort to add that flag?


Yeah, I thought about a flag too. I think that could work quite nicely.

Basically where I'm coming from is trying to make this as flexible as
possible, without pre-judging the kinds of aggregates that users may
want to add.

One use-case I had in mind upthread was suppose you wanted to write a
custom version of array_agg that only collected non-NULL values. With
such a flag, that would be trivial, but with the current patch you'd
have to (counter-intuitively) wrap the inverse transition 

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Magnus Hagander
On Wed, Jan 29, 2014 at 9:03 AM, KONDO Mitsumasa 
kondo.mitsum...@lab.ntt.co.jp wrote:

 (2014/01/29 15:51), Tom Lane wrote:
  KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp writes:
  By the way, latest pg_stat_statement might affect performance in
 Windows system.
  Because it uses fflush() system call every creating new entry in
  pg_stat_statements, and it calls many fread() to warm file cache.
  This statement doesn't seem to have much to do with the patch as
  committed.  There are no fflush calls, and no notion of warming the
  file cache either.
 Oh, all right.

  We do assume that the OS is smart enough to keep
  a frequently-read file in cache ... is Windows too stupid for that?
 I don't know about it. But I think Windows cache feature is stupid.
  It seems to always write/read data to/from disk, despite having large
 memory...
 I'd like to know test result on Windows, if we can...


I am quite certain the Windows cache is *not* that stupid, and hasn't been
since the Windows 3.1 days.

If you want to make certain, set  FILE_ATTRIBUTE_TEMPORARY when the file is
opened, then it will work really hard not to write it to disk - harder than
most Linux fs'es do AFAIK. This should of course only be done if we don't
mind it going away :)

As per port/open.c, pg will set this when O_SHORT_LIVED is specified. But
AFAICT, we don't actually use this *anywhere* in the backend? Perhaps we
should for this - and also for the pgstat files?

(I may have missed a codepath, only had a minute to do some grepping)


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Triggers on foreign tables

2014-01-29 Thread Kouhei Kaigai
Hello,

 I initially tried to keep track of them by allocating read pointers on the
 tuple store, but it turned out to be so expensive that I had to find another
  way (24 bytes per stored tuple, which are not reclaimable until the end of
 the transaction).
 
 What do you think about this approach ? Is there something I missed which
 would make it not sustainable ?

It seems to me a reasonable approach to track them.
Just a corner case: it may cause an unexpected problem if someone tried to
update a foreign table with 2^31 tuples, because of the int index.
It may make sense to put a check that fdw_nextwrite is less than INT_MAX. :-)


I have a few other minor comments:

+static HeapTuple
+ExtractOldTuple(TupleTableSlot *planSlot,
+   ResultRelInfo *relinfo)
+{
+   bool        isNull;
+   JunkFilter *junkfilter = relinfo->ri_junkFilter;
+   HeapTuple   oldtuple = palloc0(sizeof(HeapTupleData));
+   HeapTupleHeader td;
+   Datum       datum = ExecGetJunkAttribute(planSlot,
+                                            junkfilter->jf_junkAttNo,
+                                            &isNull);
+
+   /* shouldn't ever get a null result... */
+   if (isNull)
+       elog(ERROR, "wholerow is NULL");
+   td = DatumGetHeapTupleHeader(datum);
+   (*oldtuple).t_len = HeapTupleHeaderGetDatumLength(td);
+   (*oldtuple).t_data = td;
+   return oldtuple;
+}
+

Why not the usual coding manner, as in:
  oldtuple->t_len = HeapTupleHeaderGetDatumLength(td);
  oldtuple->t_data = td;

Also, it doesn't put tableOid on the tuple:
  oldtuple->t_tableOid = RelationGetRelid(relinfo->ri_RelationDesc);


@@ -730,6 +738,45 @@ rewriteTargetListIU(Query *parsetree, Relation target_relation,
+   /*
+* For foreign tables, build a similar array for returning tlist.
+*/
+   if (need_full_returning_tlist)
+   {
+   returning_tles = (TargetEntry **) palloc0(numattrs * sizeof(TargetEntry *));
+   foreach(temp, parsetree->returningList)
+   {
+   TargetEntry *old_rtle = (TargetEntry *) lfirst(temp);
+
+   if (IsA(old_rtle->expr, Var))
+   {
+   Var    *var = (Var *) old_rtle->expr;
+
+   if (var->varno == parsetree->resultRelation)
+   {
+   attrno = var->varattno;
+   if (attrno < 1 || attrno > numattrs)
+   elog(ERROR, "bogus resno %d in targetlist", attrno);

This check caused an error when the returning list contains a reference to
a system column, which has a negative attribute number.
Probably, it should be "continue;" instead of elog().

BTW, isn't it sufficient to inhibit this optimization by putting a whole-row
reference here, rather than individual column references? postgres_fdw extracts
a whole-row reference into individual column references.

+   att_tup = target_relation->rd_att->attrs[attrno - 1];
+   /* We can (and must) ignore deleted attributes */
+   if (att_tup->attisdropped)
+   continue;
+   returning_tles[attrno - 1] = old_rtle;
+   }
+   }
+   }
+   }
+

Thanks,
--
NEC OSS Promotion Center / PG-Strom Project
KaiGai Kohei kai...@ak.jp.nec.com


 -Original Message-
 From: pgsql-hackers-ow...@postgresql.org
 [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Ronan Dunklau
 Sent: Thursday, January 23, 2014 11:18 PM
 To: Noah Misch
 Cc: pgsql-hackers@postgresql.org
 Subject: Re: [HACKERS] Triggers on foreign tables
 
 Thank you very much for this review.
 
  The need here is awfully similar to a need of INSTEAD OF triggers on views.
  For them, we add a single wholerow resjunk TLE.  Is there a good
  reason to do it differently for triggers on foreign tables?
 
 I wasn't aware of that, it makes for some much cleaner code IMO. Thanks.
 
   - for after triggers, the whole queuing mechanism is bypassed for
   foreign tables. This is IMO acceptable, since foreign tables cannot
   have constraints or constraints triggers, and thus have not need for
   deferrable execution. This design avoids the need for storing and
   retrieving/identifying remote tuples until the query or transaction
 end.
 
  Whether an AFTER ROW trigger is deferred determines whether it runs at
  the end of the firing query or at the end of the firing query's transaction.
  In all cases, every BEFORE ROW trigger of a given query fires before
  any AFTER ROW trigger of the same query.  SQL requires that.  This
  proposal would give foreign table AFTER ROW triggers a novel firing
  time; let's not do that.
 
  I think the options going forward are either (a) design a way to queue
  foreign table AFTER ROW triggers such that we can get the old and/or
  new rows at the end of the query or (b) not support AFTER ROW triggers
  on foreign tables for the time being.
 
 
 I did not know this was mandated by the standard.
 
 The attached patch tries to solve this problem by allocating a tuplestore
 in the global afterTriggers structure. This tuplestore is used for 

Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Magnus Hagander
On Wed, Jan 29, 2014 at 8:58 AM, Peter Geoghegan p...@heroku.com wrote:

 I am not opposed in principle to adding new things to the counters
 struct in pg_stat_statements. I just think that the fact that the
 overhead of installing the module on a busy production system is
 currently so low is of *major* value, and therefore any person that
 proposes to expand that struct should be required to very conclusively
 demonstrate that there is no appreciably increase in overhead. Having
 a standard deviation column would be nice, but it's still not that
 important. Maybe when we have portable atomic addition we can afford
 to be much more inclusive of that kind of thing.


Not (at this point) commenting on the details, but a big +1 on the fact
that the overhead is low is a *very* important property. If the overhead
starts growing it will be uninstalled - and that will make things even
harder to diagnose.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Changeset Extraction v7.3

2014-01-29 Thread Albe Laurenz
Andreas Karlsson wrote:
 On 01/28/2014 10:56 PM, Andres Freund wrote:
 On 2014-01-28 21:48:09 +, Thom Brown wrote:
 On 28 January 2014 21:37, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 11:53 AM, Robert Haas robertmh...@gmail.com 
 wrote:
 The point of Andres's patch set is to introduce a new technology
 called logical decoding; that is, the ability to get a replication
 stream that is based on changes to tuples rather than changes to
 blocks.  It could also be called logical replication.  In these
 patches, our existing replication is referred to as physical
 replication, which sounds kind of funny to me.  Anyone have another
 suggestion?

 Logical and Binary replication?

 Unfortunately changeset extraction output can be binary data...
 
 I think physical and logical are fine and they seem to be well known
 terminology. Oracle uses those words and I have also seen many places
 use physical backup and logical backup, for example on Barman's
 homepage.

+1

I think it also fits well with the well-known terms "physical [database]
design" and "logical design".  Not that it is the same thing, but I
believe that every database person, when faced with the distinction
"physical" versus "logical", will conclude that the former refers to
data placement or block structure, while the latter refers to the
semantics of the data being stored.

Yours,
Laurenz Albe

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Craig Ringer
On 01/29/2014 03:34 PM, Kouhei Kaigai wrote:
 Kouhei Kaigai kai...@ak.jp.nec.com writes:
 Let me ask an elementary question. What is the reason why inheritance
 expansion logic should be located in the planner stage, not at the tail
 of the rewriter?

 I think it's mostly historical.  You would however have to think of a way
 to preserve the inheritance relationships in the parsetree representation.
 In the current code, expand_inherited_tables() adds AppendRelInfo nodes
 to the planner's data structures as it does the expansion; but I don't think
 AppendRelInfo is a suitable structure for the rewriter to work with, and
 in any case there's no place to put it in the Query representation.

 Actually though, isn't this issue mostly about inheritance of a query
 *target* table?  Moving that expansion to the rewriter is a totally
 different and perhaps more tractable change.  It's certainly horribly ugly
 as it is; hard to see how doing it at the rewriter could be worse.

 It's just an idea, so might not be a deep consideration.
 
  Isn't it possible to describe a parse tree as if some UPDATE/DELETE statements
  were combined with UNION ALL? Of course, even if it is only an internal form.
 
   UPDATE parent SET x = 2*x, y = y || '_update' WHERE x % 10 = 5
 UNION ALL
   UPDATE child_1 SET x = 2*x, y = y || '_update' WHERE x % 10 = 5
  :
 
  Right now, only SELECT statements are allowed to be placed under set-operations.
  If the rewriter can expand inherited relations into multiple individual SELECTs
  with UNION ALL, it may be a reasonable internal representation.
 
  In a similar way, both UPDATE and DELETE take a scan on the relation once, then
  modify the target relation. Probably there is no significant difference in
  the earlier half; it performs as if multiple SELECTs with UNION ALL were
  running, except that it fetches the ctid system column as a junk attribute and
  acquires row-level locks.
 
 On the other hand, it may need to adjust planner code to construct
 ModifyTable node running on multiple relations. Existing code pulls
 resultRelations from AppendRelInfo, doesn't it? It needs to take the list
 of result relations using recursion of set operations, if not flatten.
 Once we can construct ModifyTable with multiple relations on behalf of
 multiple relation's scan, here is no difference from what we're doing now.
 
 How about your opinion?


My worry here is that a fair bit of work gets done before inheritance
expansion. Partitioning already performs pretty poorly for anything but
small numbers of partitions.

If we're expanding inheritance in the rewriter, won't that further
increase the already large amount of duplicate work involved in planning
a query that involves inheritance?

Or am I misunderstanding you?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add force option to dropdb

2014-01-29 Thread salah jubeh
 
Hello Robert,

I'm not particularly in favor of implementing this as client-side
functionality, because then it's only available to people who use that
particular client.  Simon Riggs proposed a server-side option to the
DROP DATABASE command some time ago, and I think that might be the way
to go.

Could you please direct me - if possible - to the thread? I think implementing
it on the client side gives the user some visibility and control.
Initially, I wanted to force-drop the database, then I changed it to kill
connections. I think the change in the client tool is simple and covers the
main reason for not being able to drop a database. I think killing client
connections is one of the FAQs.
 


OTOH, having an option like DROP DATABASE [IF EXISTS, FORCE] database is
more crisp. However, what does force mean? Many options exist, such as
killing the connections gently, waiting for connections to terminate, or
killing connections immediately. Also, as Alvaro Herrera mentioned,
DROP OWNED BY and/or REASSIGN OWNED BY might hinder the force option - an
example here would be nice. So, for a quick win, I prefer the kill option on
the client side; but for a mature solution, the back-end is certainly the
way to proceed.
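
For completeness, the client-side kill option boils down to roughly the
following (the database name is only an example):

  SELECT pg_terminate_backend(pid)
  FROM pg_stat_activity
  WHERE datname = 'mydb' AND pid <> pg_backend_pid();
  DROP DATABASE mydb;

which is also what people typically do by hand today.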


Regards


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-01-29 Thread Heikki Linnakangas

On 01/28/2014 07:01 PM, Heikki Linnakangas wrote:

On 01/27/2014 07:03 PM, Amit Kapila wrote:

I have tried to improve algorithm in another way so that we can get
benefit of same chunks during find match (something similar to lz).
The main change is to consider chunks at fixed boundary (4 byte)
and after finding match, try to find if there is a longer match than
current chunk. While finding longer match, it still takes care that
next bigger match should be at chunk boundary. I am not
completely sure about the chunk boundary may be 8 or 16 can give
better results.


Since you're only putting a value in the history every 4 bytes, you
wouldn't need to calculate the hash in a rolling fashion. You could just
take next four bytes, calculate hash, put it in history table. Then next
four bytes, calculate hash, and so on. Might save some cycles when
building the history table...


On a closer look, you're putting a chunk in the history table only every 
four bytes, but you're *also* checking the history table for a match 
only every four bytes. That completely destroys the shift-resistance of 
the algorithm. For example, if the new tuple is an exact copy of the old 
tuple, except for one additional byte in the beginning, the algorithm 
would fail to recognize that. It would be good to add a test case like 
that in the test suite.


You can skip bytes when building the history table, or when finding 
matches, but not both. Or you could skip N bytes, and then check for 
matches for the next four bytes, then skip again and so forth, as long 
as you always check four consecutive bytes (because the hash function is 
calculated from four bytes).


I couldn't resist the challenge, and started hacking this. First, some 
observations from your latest patch (pgrb_delta_encoding_v5.patch):


1. There are a lot of comments and code that refers to chunks, which 
seem obsolete. For example, ck_size field in PGRB_HistEntry is always 
set to a constant, 4, except maybe at the very end of the history 
string. The algorithm has nothing to do with Rabin-Karp anymore.


2. The 'hindex' field in PGRB_HistEntry is unused. Also, ck_start_pos is 
redundant with the index of the entry in the array, because the array is 
filled in order. That only leaves us just the 'next' field, and that can 
be represented as a int16 rather than a pointer. So, we really only need 
a simple int16 array as the history entries array.


3. You're not gaining much by calculating the hash in a rolling fashion. 
A single rolling step requires two multiplications and two sums, plus 
shifting the variables around. Calculating the hash from scratch 
requires three multiplications and three sums.


4. Given that we're not doing the Rabin-Karp variable-length chunk 
thing, we could use a cheaper hash function to begin with. Like, the one 
used in pglz. The multiply-by-prime method probably gives fewer 
collisions than pglz's shift/xor method, but I don't think that matters 
as much as computation speed. No-one has mentioned or measured the 
effect of collisions in this thread; that either means that it's a 
non-issue or that no-one's just realized how big a problem it is yet. 
I'm guessing that it's not a problem, and if it is, it's mitigated by 
only trying to find matches every N bytes; collisions would make finding 
matches slower, and that's exactly what skipping helps with.


After addressing the above, we're pretty much back to PGLZ approach. I 
kept the change to only find matches every four bytes, that does make 
some difference. And I like having this new encoding code in a separate 
file, not mingled with pglz stuff, it's sufficiently different that 
that's better. I haven't done all that much testing with this, so take it 
with a grain of salt.


I don't know if this is better or worse than the other patches that have 
been floated around, but I thought I might as well share it...


- Heikki
diff --git a/doc/src/sgml/ref/create_table.sgml b/doc/src/sgml/ref/create_table.sgml
index e0b8a4e..c4ac2bd 100644
--- a/doc/src/sgml/ref/create_table.sgml
+++ b/doc/src/sgml/ref/create_table.sgml
@@ -1014,6 +1014,22 @@ CREATE [ [ GLOBAL | LOCAL ] { TEMPORARY | TEMP } | UNLOGGED ] TABLE [ IF NOT EXI
 /listitem
/varlistentry
 
+   varlistentry
+termliteralwal_compress_update/ (typeboolean/)/term
+listitem
+ para
+  Enables or disables the WAL tuple compression for commandUPDATE/
+  on this table.  Default value of this option is false to maintain
+  backward compatability for the command. If true, all the update
+  operations on this table which will place the new tuple on same page
+  as it's original tuple will compress the WAL for new tuple and
+  subsequently reduce the WAL volume.  It is recommended to enable
+  this option for tables where commandUPDATE/ changes less than
+  50 percent of tuple data.
+ /para
+ /listitem
+/varlistentry
+
/variablelist
 
   /refsect2
diff --git 

Re: [HACKERS] patch: option --if-exists for pg_dump

2014-01-29 Thread Jeevan Chalke
Hi Pavel,

Now the patch looks good to me. However when I try to restore your own sql
file's dump, I get the following errors:

pg_restore: [archiver (db)] could not execute query: ERROR:  relation
"public.emp" does not exist
Command was: DROP TRIGGER IF EXISTS emp_insert_trigger ON public.emp;

pg_restore: [archiver (db)] could not execute query: ERROR:  schema
"myschema" does not exist
Command was: DROP FUNCTION IF EXISTS myschema.int_to_date(integer);

Is that expected after your patch ?

Also, I didn't quite understand these lines of comments:

                        /*
                         * Descriptor string (te->desc) should not be same as object
                         * specifier for DROP STATEMENT. The DROP DEFAULT has not
                         * IF EXISTS clause - has not sense.
                         */

Will you please rephrase ?

Thanks
-- 
Jeevan B Chalke
Principal Software Engineer, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] patch: option --if-exists for pg_dump

2014-01-29 Thread Pavel Stehule
2014-01-29 Jeevan Chalke jeevan.cha...@enterprisedb.com

 Hi Pavel,

 Now the patch looks good to me. However when I try to restore your own sql
 file's dump, I get following errors:

 pg_restore: [archiver (db)] could not execute query: ERROR:  relation
 "public.emp" does not exist
 Command was: DROP TRIGGER IF EXISTS emp_insert_trigger ON public.emp;

 pg_restore: [archiver (db)] could not execute query: ERROR:  schema
 "myschema" does not exist
 Command was: DROP FUNCTION IF EXISTS myschema.int_to_date(integer);

 Is that expected after your patch ?


it should be fixed by commit
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b152c6cd0de1827ba58756e24e18110cf902182a



 Also, I didn't quite understand these lines of comments:

  /*
   * Descriptor string (te->desc) should not be same as object
   * specifier for DROP STATEMENT. The DROP DEFAULT has not
   * IF EXISTS clause - has not sense.
   */

 Will you please rephrase ?


I can try it - .

The content of te->desc is usually a substring of the DROP statement, with one
related exception - CONSTRAINT.
Independent of the previous sentence - ALTER TABLE ALTER COLUMN DROP DEFAULT
doesn't support IF EXISTS, and therefore it should not be injected.
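
To illustrate (the column name is only an example):

  -- most object types are fine, IF EXISTS can simply be appended:
  DROP TRIGGER IF EXISTS emp_insert_trigger ON public.emp;

  -- but there is no IF EXISTS variant of this statement, so --if-exists
  -- must not inject it here:
  ALTER TABLE public.emp ALTER COLUMN empno DROP DEFAULT;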

Regards

Pavel



 Thanks
 --
 Jeevan B Chalke
 Principal Software Engineer, Product Development
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company




Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Simon Riggs
On 28 January 2014 21:28, Robert Haas robertmh...@gmail.com wrote:
 On Tue, Jan 28, 2014 at 5:02 AM, Simon Riggs si...@2ndquadrant.com wrote:
 I agree that this is being seen the wrong way around. The planner
 contains things it should not do, and as a result we are now
 discussing enhancing the code that is in the wrong place, which of
 course brings objections.

 I think we would be best served by stopping inheritance in its tracks
 and killing it off. It keeps getting in the way. What we need is real
 partitioning. Other uses are pretty obscure and we should re-examine
 things.

 I actually think that inheritance is a pretty good foundation for real
 partitioning.  If we were to get rid of it, we'd likely end up needing
 to add most of the same functionality back when we tried to do some
 kind of real partitioning later, and that doesn't sound fun.  I don't
 see any reason why some kind of partitioning syntax couldn't be added
 that leverages the existing inheritance mechanism but stores
 additional metadata allowing for better optimization.

 Well... I'm lying, a little bit.  If our chosen implementation of
 real partitioning involved some kind of partition objects that could
 be created and dropped but never directly accessed via DML commands,
 then we might not need anything that looks like the current planner
 support for partitioned tables.  But I think that would be a
 surprising choice for this community.  IMV, the problem with the
 planner and inheritance isn't that there's too much cruft in there
 already, but that there are still key optimizations that are missing.
 Still, I'd rather try to fix that than start over.

 In the absence of that, releasing this updateable-security views
 without suppport for inheritance is a good move. It gives us most of
 what we want now and continuing to have some form of restriction is
 better than having a much greater restriction of it not working at
 all.

 -1.  Inheritance may be a crappy substitute for real partitioning, but
 there are plenty of people using it that way.

When I talk about removing inheritance, of course I see some need for
partitioning.

Given the way generalised inheritance works, it is possible to come up
with some monstrous designs that are then hard to rewrite and plan.

What I propose is that we remove the user-visible generalised
inheritance feature and only allow a more structured version which we
call partitioning. If all target/result relations are identical it
will be much easier to handle things because there'll be no special
purpose code to juggle.

Yes, key optimizations are missing. Overburdening ourselves with
complications that slow us down from delivering more useful features
is not sensible. Think of it as feature-level refactoring, rather than the
normal code-level refactoring we frequently discuss.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] GSoC 2014 - mentors, students and admins

2014-01-29 Thread Heikki Linnakangas

On 01/28/2014 07:34 PM, Thom Brown wrote:

And I'd be fine with being admin again this year, unless there's
anyone else who would like to take up the mantle?


Please do, thanks!


Who would be up for mentoring this year?


I can mentor.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Simon Riggs
On 29 January 2014 06:43, Tom Lane t...@sss.pgh.pa.us wrote:

 Actually though, isn't this issue mostly about inheritance of a query
 *target* table?

Exactly. IMHO updateable views on inheritance sets will have multiple
other performance problems, so trying to support them here will not
make their usage seamless.

We allowed updateable views to work with restrictions in earlier
releases, so I can't see why continuing with a much reduced
restriction would be a problem in this release. We don't need to
remove the remaining restriction all in one release, do we?
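
For concreteness, the case covered by the remaining restriction is roughly
this (a minimal sketch with made-up names, assuming the patch is applied):

  CREATE TABLE parent (x int, y text);
  CREATE TABLE child_1 () INHERITS (parent);
  CREATE VIEW parent_sb WITH (security_barrier) AS
      SELECT x, y FROM parent WHERE x % 10 = 5;
  -- updating through the view needs target-table inheritance expansion
  UPDATE parent_sb SET y = y || '_update';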

 Moving that expansion to the rewriter is a totally
 different and perhaps more tractable change.  It's certainly horribly ugly
 as it is; hard to see how doing it at the rewriter could be worse.

I see the patch adding some changes to inheritance_planner that might
well get moved to rewriter.
As long as the ugliness all stays in one place, does it matter where
that is -- for this patch? Just asking whether moving it will
reduce the net ugliness of our inheritance support.

@Craig: I don't think this would have much effect on partitioning
performance. The main cost of that is constraint exclusion, which we
wuld still perform only once. All other copying and re-jigging still
gets performed even for excluded relations, so doing it earlier
wouldn't hurt as much as you might think.

@Dean: If we moved that to rewriter, would expand_security_quals() and
preprocess_rowmarks() also move?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Craig Ringer
On 01/23/2014 06:06 PM, Dean Rasheed wrote:
 On 21 January 2014 09:18, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Yes, please review the patch from 09-Jan
 (http://www.postgresql.org/message-id/CAEZATCUiKxOg=voovja2s6g-sixzzxg18totggp8zobq6qn...@mail.gmail.com).

 
 After further testing I found a bug --- it involves having a security
 barrier view on top of a base relation that has a rule that rewrites
 the query to have a different result relation, and possibly also a
 different command type, so that the securityQuals are no longer on the
 result relation, which is a code path not previously tested and the
 rowmark handling was wrong. That's probably a pretty obscure case in
 the context of security barrier views, but that code path would be
 used much more commonly if RLS were built on top of this. Fortunately
 the fix is trivial --- updated patch attached.

This is the most recent patch I see, and the one I've been working on
top of.

Are there any known tests that this patch fails?

Can we construct any tests that this patch fails? If so, can we make it
pass them, or error out cleanly?

The discussion has gone a bit off track - partly my fault, I
think - I mentioned inheritance. Let's try to refocus on the immediate
patch at hand, and whether it's good to go.

Right now, I'm not personally aware of tests cases that cause this code
to fail.

There's a good-taste complaint about handling of inheritance, but
frankly, there's not much about inheritance that _is_ good taste. I
don't see that this patch makes it worse, and it's functional.

It might be interesting to revisit some of these broader questions in
9.5, but what can we do to get this functionality in place for 9.4?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Heikki Linnakangas

On 01/28/2014 06:11 PM, Christian Kruse wrote:

Hi,

attached you will find a new version of the patch, ported to HEAD,
fixed the mentioned bug and - hopefully - dealing the the remaining
issues.


Thanks, I have committed this now.

The documentation is still lacking. We should explain somewhere how to 
set nr.hugepages, for example. The Managing Kernel Resources section 
ought to mention setting it. Could I ask you to work on that, please?


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Dean Rasheed
On 29 January 2014 11:27, Simon Riggs si...@2ndquadrant.com wrote:
 On 29 January 2014 06:43, Tom Lane t...@sss.pgh.pa.us wrote:

 Actually though, isn't this issue mostly about inheritance of a query
 *target* table?

 Exactly. IMHO updateable views on inheritance sets will have multiple
 other performance problems, so trying to support them here will not
 make their usage seamless.

 We allowed updateable views to work with restrictions in earlier
 releases, so I can't see why continuing with a much reduced
 restriction would be a problem in this release. We don't need to
 remove the remaining restriction all in one release, do we?

 Moving that expansion to the rewriter is a totally
 different and perhaps more tractable change.  It's certainly horribly ugly
 as it is; hard to see how doing it at the rewriter could be worse.

 I see the patch adding some changes to inheritance_planner that might
 well get moved to rewriter.
 As long as the ugliness all stays in one place, does it matter where
 that is -- for this patch? Just asking whether moving it will
 reduce the net ugliness of our inheritance support.

 @Craig: I don't think this would have much effect on partitioning
 performance. The main cost of that is constraint exclusion, which we
 would still perform only once. All other copying and re-jigging still
 gets performed even for excluded relations, so doing it earlier
 wouldn't hurt as much as you might think.

 @Dean: If we moved that to rewriter, would expand_security_quals() and
 preprocess_rowmarks() also move?


Actually I tend to think that expand_security_quals() should remain
where it is, regardless of where inheritance expansion happens. One of
the things that simplifies the job that expand_security_quals() has to
do is that it takes place after preprocess_targetlist(), so the
targetlist for the security barrier subqueries that it constructs is
known. Calling expand_security_quals() earlier would require
additional surgery on preprocess_targetlist() and expand_targetlist(),
and would probably also mean that the Query would need to record the
sourceRelation subquery RTE, as discussed upthread.

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-01-29 Thread Amit Kapila
On Wed, Jan 29, 2014 at 3:41 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:
 On 01/28/2014 07:01 PM, Heikki Linnakangas wrote:

 On 01/27/2014 07:03 PM, Amit Kapila wrote:

 I have tried to improve algorithm in another way so that we can get
 benefit of same chunks during find match (something similar to lz).
 The main change is to consider chunks at fixed boundary (4 byte)
 and after finding match, try to find if there is a longer match than
 current chunk. While finding longer match, it still takes care that
 next bigger match should be at chunk boundary. I am not
 completely sure about the chunk boundary may be 8 or 16 can give
 better results.


 Since you're only putting a value in the history every 4 bytes, you
 wouldn't need to calculate the hash in a rolling fashion. You could just
 take next four bytes, calculate hash, put it in history table. Then next
 four bytes, calculate hash, and so on. Might save some cycles when
 building the history table...

First of all thanks for looking into patch.

Yes, this is right, we can save cycles by not doing rolling during hash
calculation and I was working to improve the patch on those lines. Earlier
it was there because of Rabin's delta encoding, where we need to check
for a special match after each byte.


 On a closer look, you're putting a chunk in the history table only every
 four bytes, but you're *also* checking the history table for a match only
  every four bytes. That completely destroys the shift-resistance of the
 algorithm.

You are right that it will lose the shift-resistance, and Robert has also
pointed this out to me; that's why he wants to maintain the property of
special bytes at chunk boundaries as mentioned in Rabin encoding. The only
real reason to shift to fixed size was to improve CPU usage, and I
thought most UPDATE cases would update fixed-length columns,
but that might not be true.

 For example, if the new tuple is an exact copy of the old tuple,
 except for one additional byte in the beginning, the algorithm would fail to
 recognize that. It would be good to add a test case like that in the test
 suite.

 You can skip bytes when building the history table, or when finding matches,
 but not both. Or you could skip N bytes, and then check for matches for the
 next four bytes, then skip again and so forth, as long as you always check
 four consecutive bytes (because the hash function is calculated from four
 bytes).

Can we do something like:
Build Phase
a. Calculate the hash and add the entry in history table at every 4 bytes.

Match Phase
a. Calculate the hash in rolling fashion and try to find match at every byte.
b. When match is found then skip only in chunks, something like I was
doing in find match function
+
+ /* consider only complete chunk matches. */
+ if (history_chunk_size == 0)
+ thislen += PGRB_MIN_CHUNK_SIZE;
+ }

Will this address the concern?

The main reason to process in chunks as much as possible is to save
cpu cycles. For example, if we build the hash table byte-by-byte, then even
for the best case where most of the tuple has a match, it will have reasonable
overhead due to formation of the hash table.


 I couldn't resist the challenge, and started hacking this. First, some
 observations from your latest patch (pgrb_delta_encoding_v5.patch):

 1. There are a lot of comments and code that refers to chunks, which seem
 obsolete. For example, ck_size field in PGRB_HistEntry is always set to a
 constant, 4, except maybe at the very end of the history string. The
 algorithm has nothing to do with Rabin-Karp anymore.

 2. The 'hindex' field in PGRB_HistEntry is unused. Also, ck_start_pos is
 redundant with the index of the entry in the array, because the array is
 filled in order. That only leaves us just the 'next' field, and that can be
 represented as a int16 rather than a pointer. So, we really only need a
 simple int16 array as the history entries array.

 3. You're not gaining much by calculating the hash in a rolling fashion. A
 single rolling step requires two multiplications and two sums, plus shifting
 the variables around. Calculating the hash from scratch requires three
 multiplications and three sums.

 4. Given that we're not doing the Rabin-Karp variable-length chunk thing, we
 could use a cheaper hash function to begin with. Like, the one used in pglz.
 The multiply-by-prime method probably gives fewer collisions than pglz's
 shift/xor method, but I don't think that matters as much as computation
 speed. No-one has mentioned or measured the effect of collisions in this
 thread; that either means that it's a non-issue or that no-one's just
 realized how big a problem it is yet. I'm guessing that it's not a problem,
 and if it is, it's mitigated by only trying to find matches every N bytes;
 collisions would make finding matches slower, and that's exactly what
 skipping helps with.

 After addressing the above, we're pretty much back to PGLZ approach.

Here during match phase, I think we can avoid copying literal 

Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Dean Rasheed
On 29 January 2014 11:34, Craig Ringer cr...@2ndquadrant.com wrote:
 On 01/23/2014 06:06 PM, Dean Rasheed wrote:
 On 21 January 2014 09:18, Dean Rasheed dean.a.rash...@gmail.com wrote:
 Yes, please review the patch from 09-Jan
 (http://www.postgresql.org/message-id/CAEZATCUiKxOg=voovja2s6g-sixzzxg18totggp8zobq6qn...@mail.gmail.com).


 After further testing I found a bug --- it involves having a security
 barrier view on top of a base relation that has a rule that rewrites
 the query to have a different result relation, and possibly also a
 different command type, so that the securityQuals are no longer on the
 result relation, which is a code path not previously tested and the
 rowmark handling was wrong. That's probably a pretty obscure case in
 the context of security barrier views, but that code path would be
 used much more commonly if RLS were built on top of this. Fortunately
 the fix is trivial --- updated patch attached.

 This is the most recent patch I see, and the one I've been working on
 top of.

 Are there any known tests that this patch fails?


None that I've been able to come up with.


 Can we construct any tests that this patch fails? If so, can we make it
 pass them, or error out cleanly?


Sounds sensible. Feel free to add any test cases you think up to the
wiki page. Even if we don't like this design, any alternative must at
least pass all the tests listed there.

https://wiki.postgresql.org/wiki/Making_security_barrier_views_automatically_updatable

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LIKE INCLUDING CONSTRAINTS is broken

2014-01-29 Thread Ashutosh Bapat
While transforming the LIKE clause in transformTableLikeClause(), the
function does remap the varattnos of the constraint expression.

 838│
 839│     ccbin_node = map_variable_attnos(stringToNode(ccbin),
 840│                                      1, 0,
 841│                                      attmap, tupleDesc->natts,
 842│                                      &found_whole_row);

So, up to this point, the attribute numbers in the constraint expression
string are in sync with the table schema definition.

But when it comes to treating inheritance in
DefineRelation->MergeAttributes(), the inherited attributes are added
before the table-specific attributes (including the attributes included
because of the LIKE clause). At this point the attribute numbers in constraint
expressions get out of sync with the table's schema and are never corrected
later. In AddRelationNewConstraints() we have the following code and comment:

2231 else
2232 {
2233 Assert(cdef->cooked_expr != NULL);
2234
2235 /*
2236  * Here, we assume the parser will only pass us valid CHECK
2237  * expressions, so we do no particular checking.
2238  */
2239 expr = stringToNode(cdef->cooked_expr);
2240 }

So, either in MergeAttributes or in AddRelationNewConstraints, we need to
restamp the attribute numbers in the constraints, so that they are in sync
with the table schema after adding the inherited columns.

The other possibility is to add the inherited columns after the table-specific
columns (including those included because of the LIKE clause), but
that would break a lot of other things (including backward compatibility), I
guess.


On Sat, Jan 25, 2014 at 1:36 AM, Alvaro Herrera alvhe...@2ndquadrant.comwrote:

 It seems CREATE TABLE ... (LIKE INCLUDING CONSTRAINTS)  doesn't work
 cleanly when there's also regular inheritance; my guess is that attnums
 get messed up at some point after the constraints are generated.

 Here's a trivial test case:

 create table b (b1 int unique check (b1 > 100));
 CREATE TABLE c (c1 int not null references b (b1));
 create table d (d1 int, d2 point not null);
 create table a (a1 int not null,
 a2 text primary key,
 a3 timestamptz(6),
 like b including constraints,
 like c)
 inherits (d);

 You can see the broken state:

 alvherre=# \d [ab]
Tabla «public.a»
  Columna |Tipo | Modificadores
 -+-+---
  d1  | integer |
  d2  | point   | not null
  a1  | integer | not null
  a2  | text| not null
  a3  | timestamp(6) with time zone |
  b1  | integer |
  c1  | integer | not null
 Índices:
 a_pkey PRIMARY KEY, btree (a2)
 Restricciones CHECK:
 b_b1_check CHECK (a2 > 100)
 Hereda: d

  Tabla «public.b»
  Columna |  Tipo   | Modificadores
 -+-+---
  b1  | integer |
 Índices:
 b_b1_key UNIQUE CONSTRAINT, btree (b1)
 Restricciones CHECK:
 b_b1_check CHECK (b1 > 100)
 Referenciada por:
 TABLE c CONSTRAINT c_c1_fkey FOREIGN KEY (c1) REFERENCES b(b1)


 Notice how the CHECK constraint in table b points to column b1, but in
 table a it is mentioning column a2, even though that one is not even of
 the correct datatype.  In fact if you try an insert, you get a weird
 error message:

 alvherre=# insert into a (d2, a2, a1, c1) values ('(1, 0)', '1', 1, 1);
 ERROR:  attribute 4 has wrong type
 DETALLE:  Table has type text, but query expects integer.

 If I take out the INHERITS clause in table a, the error disappears.

 --
 Álvaro Herrera                http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers




-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-29 Thread Christian Convey
On Mon, Jan 27, 2014 at 7:14 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 FDW's join pushing down is one of the valuable use-cases of this interface,
 but not all. As you might know, my motivation is to implement GPU
 acceleration
 feature on top of this interface, that offers alternative way to scan or
 join
 relations or potentially sort or aggregate.


I'm curious how this relates to the pluggable storage idea discussed here
https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting and here
http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html


I haven't seen a specific proposal about how much functionality should be
encapsulated by a pluggable storage system.  But I wonder if that would be
the best place for specialized table-scan code to end up?


Re: [HACKERS] [PATCH] Negative Transition Aggregate Functions (WIP)

2014-01-29 Thread Florian Pflug
On Jan29, 2014, at 09:59 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 On 28 January 2014 20:16, Florian Pflug f...@phlo.org wrote:
 On Jan27, 2014, at 23:28 , Dean Rasheed dean.a.rash...@gmail.com wrote:
 This case is explicitly forbidden, both in CREATE AGGREGATE and in the
 executor. To me, that seems overly restrictive --- if transfn is
 strict, then you know for sure that no NULL values are added to the
 aggregate state, so surely it doesn't matter whether or not
 inv_transfn is strict. It will never be asked to remove NULL values.
 
 I think there are definite use-cases where a user might want to use a
 pre-existing function as the inverse transition function, so it seems
 harsh to force them to wrap it in a strict function in this case.
 
 I'm not sure that the likelihood of someone wanting to combine a strict
 forward with a non-strict inverse function is very high, but neither
 
 
 Me neither, but the checks to forbid it aren't adding anything, and I
 think it's best to make it as flexible as possible.

Ok.

 Another idea would be to have an explicit nulls=ignore|process option
 for CREATE AGGREGATE. If nulls=process and either of the transition
 functions are strict, we could either error out, or simply do what
 normal functions calls do and pretend they return NULL for NULL inputs.
 Not sure how the rule that forward transition functions may not return
 NULL if there's an inverse transition function would fit in if we do
 the latter, though.
 
 The question is - is it worth it the effort to add that flag?
 
 One use-case I had in mind upthread was suppose you wanted to write a
 custom version of array_agg that only collected non-NULL values. With
 such a flag, that would be trivial, but with the current patch you'd
 have to (counter-intuitively) wrap the inverse transition function in a
 strict function.

I'd be more convinced by that if doing so was actually possible for
non-superusers. But it isn't, because the aggregate uses "internal" as
its state type, and it's thus entirely up to the user not to crash the
database by mixing transfer and final functions with incompatible
state data. Plus, instead of creating a custom aggregate, one can just
use a FILTER clause to get rid of the NULL values.
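
For illustration, the FILTER route (new 9.4 syntax) already handles the
collect-only-non-NULL-values case without any custom aggregate; the values
here are just a stand-in:

    SELECT array_agg(x) FILTER (WHERE x IS NOT NULL) AS non_null_values
    FROM (VALUES (1), (NULL), (2)) AS v(x);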

 Another use-case I can imagine is suppose you wanted a custom version
 of sum() that returned NULL if any of the input values were NULL. If
 you knew that most of your data was non-NULL, you might still wish to
 benefit from an inverse transition function. Right now the patch won't
 allow that because of the error in advance_windowaggregate(), but
 possibly that could be relaxed by forcing a restart in that case.

That's not really true - the patch only forbids that if you insist on
representing the state "I have seen a NULL input" with a NULL state value.
But if you instead just count the number of NULLs in your transition
functions, all you need to do is have your final function return NULL
if that count is not zero.
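
As a minimal sketch of that shape (forward and final functions only, no
inverse transition function yet, and every name here is made up):

    CREATE TYPE nullsum_state AS (total bigint, nulls bigint);

    -- not STRICT, so it is called for NULL inputs and can count them
    CREATE FUNCTION nullsum_trans(nullsum_state, int) RETURNS nullsum_state
    LANGUAGE sql AS $$
        SELECT CASE WHEN $2 IS NULL
                    THEN ROW($1.total, $1.nulls + 1)::nullsum_state
                    ELSE ROW($1.total + $2, $1.nulls)::nullsum_state
               END
    $$;

    -- return NULL as soon as any NULL input has been seen
    CREATE FUNCTION nullsum_final(nullsum_state) RETURNS bigint
    LANGUAGE sql AS $$
        SELECT CASE WHEN $1.nulls > 0 THEN NULL ELSE $1.total END
    $$;

    CREATE AGGREGATE nullsum(int) (
        sfunc     = nullsum_trans,
        stype     = nullsum_state,
        finalfunc = nullsum_final,
        initcond  = '(0,0)'
    );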

 If I've understood correctly that should give similar to current
 performance if NULLs are present, and enhanced performance as the
 window moved over non-NULL data.

Exactly - and this makes defining a NULL-sensitive SUM() this way
rather silly - a simple counter has very nearly zero overhead, and avoids
all rescans.

 In that second case, it would also be nice if you could simply re-use
 the existing sum forward and inverse transition functions, with a
 different null-handling flag.

Even if we had a nulls=process|ignore flag, SUM's transition functions
would still need to take that use-case into account explicitly to make
this work - at the very least, the forward transition function would
need to return NULL if the input is NULL instead of just skipping it as
it does now. But that would lead to completely unnecessary rescans, so
what we actually ought to do then is to make it track whether there have
been NULL inputs and make the finalfunc return NULL in this case. Which
would border on ironic, since the whole raison d'etre for this is to
*avoid* spreading NULL-counting logic around...

Plus, to make this actually usable, we'd have to document this, and tell
people how to define such a SUM aggregate. But I don't want to go there -
we really mustn't encourage people to mix-and-match built-in aggregates
with state type internal, since whether they work or crash depends
on factors outside the control of the user.

And finally, you can get that behaviour quite easily by doing

  CASE WHEN bool_and(input IS NOT NULL) THEN whatever_agg(input) ELSE NULL END
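
For example, over a sliding window (table and column names are placeholders):

  SELECT ts,
         CASE WHEN bool_and(val IS NOT NULL) OVER w
              THEN sum(val) OVER w
         END AS null_sensitive_sum
  FROM measurements
  WINDOW w AS (ORDER BY ts ROWS BETWEEN 9 PRECEDING AND CURRENT ROW);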

 So I think an ignore-nulls flag would have real benefits, as well as
 being a cleaner design than relying on a strict inverse transition
 function.

The more I think about this, the less convinced I am. In fact, I'm
currently leaning towards just forbidding non-strict forward transition
functions with strict inverses, and adding non-NULL counters to the
aggregates that then require them. It's really only the SUM() aggregates
that are 

Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Dean Rasheed
On 29 January 2014 06:43, Tom Lane t...@sss.pgh.pa.us wrote:
 Kouhei Kaigai kai...@ak.jp.nec.com writes:
 Let me ask an elementary question. What is the reason why the inheritance
 expansion logic should be located in the planner stage, rather than at the
 tail of the rewriter?

 I think it's mostly historical.  You would however have to think of a
 way to preserve the inheritance relationships in the parsetree
 representation.  In the current code, expand_inherited_tables() adds
 AppendRelInfo nodes to the planner's data structures as it does the
 expansion; but I don't think AppendRelInfo is a suitable structure
 for the rewriter to work with, and in any case there's no place to
 put it in the Query representation.

 Actually though, isn't this issue mostly about inheritance of a query
 *target* table?  Moving that expansion to the rewriter is a totally
 different and perhaps more tractable change.  It's certainly horribly ugly
 as it is; hard to see how doing it at the rewriter could be worse.


That's interesting. Presumably then we could make rules work properly
on inheritance children. I'm not sure if anyone has actually
complained that that currently doesn't work.

Thinking about that though, it does potentially open up a whole other
can of worms --- a single update query could be turned into multiple
other queries of different command types. Perhaps that's not so
different from what currently happens in the rewriter, except that
you'd need a way to track which of those queries counts towards the
statement's final row count. And how many ModifyTable nodes would the
resulting plan have?

Regards,
Dean


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Simon Riggs
On 29 January 2014 09:16, Magnus Hagander mag...@hagander.net wrote:
 On Wed, Jan 29, 2014 at 8:58 AM, Peter Geoghegan p...@heroku.com wrote:

 I am not opposed in principle to adding new things to the counters
 struct in pg_stat_statements. I just think that the fact that the
 overhead of installing the module on a busy production system is
 currently so low is of *major* value, and therefore any person that
 proposes to expand that struct should be required to very conclusively
 demonstrate that there is no appreciable increase in overhead. Having
 a standard deviation column would be nice, but it's still not that
 important. Maybe when we have portable atomic addition we can afford
 to be much more inclusive of that kind of thing.


 Not (at this point) commenting on the details, but a big +1 on the fact that
 the overhead is low is a *very* important property. If the overhead starts
 growing it will be uninstalled - and that will make things even harder to
 diagnose.

+1 also. I think almost everybody agrees.

Having said that, I'm not certain that moves us forwards with how to
handle the patch at hand. Here is my attempt at an objective view of
whether or not to apply the patch:

The purpose of the patch is to provide additional help to diagnose
performance issues. Everybody seems to want this as an additional
feature, given the above caveat. The author has addressed the original
performance concerns there and provided some evidence to show that it
appears effective.

It is possible that adding this extra straw breaks the camel's back,
but it seems unlikely to be that simple. A new feature to help
performance problems will be of great use to many people; complaints
about the performance of pg_stat_statements are unlikely to increase
greatly from this change, since such changes would only affect those
right on the threshold of impact. The needs of the many...

Further changes to improve scalability of pg_stat_statements seem
warranted in the future, but I see no reason to block the patch for
that reason.

If somebody has some specific evidence that the impact outweighs the
benefit, please produce it or I am inclined to proceed to commit.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Removing xloginsert_slots?

2014-01-29 Thread Michael Paquier
Hi all,

The undocumented GUC called xloginsert_slots was introduced by
commit 9a20a9b. The commit message mentions that this parameter
should be removed before the release. Wouldn't it be a good time to
remove this parameter soon? I imagine that removing it just before the beta
would make sense, so now is perhaps too early... Either way, attached
is a patch doing so...
Regards,
-- 
Michael
*** a/src/backend/access/transam/xlog.c
--- b/src/backend/access/transam/xlog.c
***
*** 69,74  extern uint32 bootstrap_data_checksum_version;
--- 69,76 
  #define PROMOTE_SIGNAL_FILE		promote
  #define FALLBACK_PROMOTE_SIGNAL_FILE fallback_promote
  
+ /* Number of slots for concurrent xlog insertions */
+ #define XLOG_INSERT_SLOTS		8
  
  /* User-settable parameters */
  int			CheckPointSegments = 3;
***
*** 85,91  int			sync_method = DEFAULT_SYNC_METHOD;
  int			wal_level = WAL_LEVEL_MINIMAL;
  int			CommitDelay = 0;	/* precommit delay in microseconds */
  int			CommitSiblings = 5; /* # concurrent xacts needed to sleep */
- int			num_xloginsert_slots = 8;
  
  #ifdef WAL_DEBUG
  bool		XLOG_DEBUG = false;
--- 87,92 
*** a/src/backend/utils/misc/guc.c
--- b/src/backend/utils/misc/guc.c
***
*** 2078,2094  static struct config_int ConfigureNamesInt[] =
  	},
  
  	{
- 		{xloginsert_slots, PGC_POSTMASTER, WAL_SETTINGS,
- 			gettext_noop(Sets the number of slots for concurrent xlog insertions.),
- 			NULL,
- 			GUC_NOT_IN_SAMPLE
- 		},
- 		num_xloginsert_slots,
- 		8, 1, 1000,
- 		NULL, NULL, NULL
- 	},
- 
- 	{
  		/* see max_connections */
  		{max_wal_senders, PGC_POSTMASTER, REPLICATION_SENDING,
  			gettext_noop(Sets the maximum number of simultaneously running WAL sender processes.),
--- 2078,2083 
*** a/src/include/access/xlog.h
--- b/src/include/access/xlog.h
***
*** 192,198  extern bool EnableHotStandby;
  extern bool fullPageWrites;
  extern bool wal_log_hints;
  extern bool log_checkpoints;
- extern int	num_xloginsert_slots;
  
  /* WAL levels */
  typedef enum WalLevel
--- 192,197 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing xloginsert_slots?

2014-01-29 Thread Andres Freund
Hi,

On 2014-01-29 21:59:05 +0900, Michael Paquier wrote:
 The undocumented GUC called xloginsert_slots was introduced by
 commit 9a20a9b. The commit message mentions that this parameter
 should be removed before the release. Wouldn't it be a good time to
 remove this parameter soon? I imagine that removing it just before the beta
 would make sense, so now is perhaps too early... Either way, attached
 is a patch doing so...

I'd rather wait till somebody actually has done some benchmarks. I don't
think we're more clueful about it now than back when the patch went
in. And such benchmarking is more likely during beta, so...

Greetings,

Andres Freund

-- 
 Andres Freund http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: Custom Scan APIs (Re: [HACKERS] Custom Plan node)

2014-01-29 Thread Kohei KaiGai
2014-01-29 Christian Convey christian.con...@gmail.com:

 On Mon, Jan 27, 2014 at 7:14 PM, Kouhei Kaigai kai...@ak.jp.nec.com wrote:

 FDW's join push-down is one of the valuable use-cases of this
 interface, but not the only one. As you might know, my motivation is to
 implement a GPU acceleration feature on top of this interface, which
 offers an alternative way to scan or join relations, or potentially to
 sort or aggregate.


 I'm curious how this relates to the pluggable storage idea discussed here
 https://wiki.postgresql.org/wiki/PgCon_2013_Developer_Meeting and here
 http://www.databasesoup.com/2013/05/postgresql-new-development-priorities-2.html

 I haven't seen a specific proposal about how much functionality should be
 encapsulated by a pluggable storage system.  But I wonder if that would be
 the best place for specialized table-scan code to end up?

If you are interested in designing your own storage layer (which then needs
its own scan/writer implementation), FDW is the option currently available.
It defines a set of interfaces that allow extensions to generate the visible
rows on the fly. It does not force the extension to act as a client of a
remote database, even though it was originally designed for dblink-style use.
In other words, FDW is a feature that translates a particular data source into
something visible according to the table definition. As long as the driver can
mediate between the table definition and the data format of your own storage
layer, it should work.

On the other hand, the custom-scan interface basically allows extensions to
implement an alternative way to access the data. If an extension has a smarter
way to scan or join relations than the built-in logic (for example, scanning
the result set of a remote join is smarter than a local join over a couple of
remote scan results), this interface lets it tell the backend "I have such a
strategy", and the planner then chooses among the candidates - either built-in
or additional - according to cost.

Let's get back to your question. This interface is, right now, not designed to
implement a pluggable storage layer. FDW is an option today, and pluggable
storage for regular tables may become a development item in the 9.5 cycle.
Because I'm motivated to make my GPU acceleration feature work on regular
relations, I put my priority on an interface that allows an extension to
suggest how to scan them.

Thanks,
-- 
KaiGai Kohei kai...@kaigai.gr.jp


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for pg_stat_archiver view

2014-01-29 Thread Michael Paquier
On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao masao.fu...@gmail.com wrote:
 I think that it's time to rename all the variables related to 
 pg_stat_bgwriter.
 For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter.
 I think that it's okay to make this change as separate patch, though.
Please find attached a simple patch renaming PgStat_GlobalStats to
PgStat_BgWriterStats. The related variables and functions are renamed
as well, using the term bgwriter instead of global. Comments are
updated too. This patch also removes stats_timestamp from
PgStat_GlobalStats and instead uses a static variable in pgstat.c,
making the structures dedicated to each component clearer.
Regards,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Support for pg_stat_archiver view

2014-01-29 Thread Michael Paquier
On Wed, Jan 29, 2014 at 10:38 PM, Michael Paquier
michael.paqu...@gmail.com wrote:
 On Sat, Jan 25, 2014 at 5:41 AM, Fujii Masao masao.fu...@gmail.com wrote:
 I think that it's time to rename all the variables related to 
 pg_stat_bgwriter.
 For example, it's better to change PgStat_GlobalStats to PgStat_Bgwriter.
 I think that it's okay to make this change as separate patch, though.
 Please find attached a simple patch renaming PgStat_GlobalStats to
 PgStat_BgWriterStats.
And of course I forgot the patch... Now attached.
-- 
Michael
*** a/src/backend/postmaster/pgstat.c
--- b/src/backend/postmaster/pgstat.c
***
*** 221,228  static int	localNumBackends = 0;
   * Contains statistics that are not collected per database
   * or per table.
   */
  static PgStat_ArchiverStats archiverStats;
! static PgStat_GlobalStats globalStats;
  
  /* Write request info for each database */
  typedef struct DBWriteRequest
--- 221,229 
   * Contains statistics that are not collected per database
   * or per table.
   */
+ static TimestampTz global_stat_timestamp;
  static PgStat_ArchiverStats archiverStats;
! static PgStat_BgWriterStats bgwriterStats;
  
  /* Write request info for each database */
  typedef struct DBWriteRequest
***
*** 2344,2361  pgstat_fetch_stat_archiver(void)
  
  /*
   * -
!  * pgstat_fetch_global() -
   *
   *	Support function for the SQL-callable pgstat* functions. Returns
!  *	a pointer to the global statistics struct.
   * -
   */
! PgStat_GlobalStats *
! pgstat_fetch_global(void)
  {
  	backend_read_statsfile();
  
! 	return globalStats;
  }
  
  
--- 2345,2362 
  
  /*
   * -
!  * pgstat_fetch_stat_bgwriter() -
   *
   *	Support function for the SQL-callable pgstat* functions. Returns
!  *	a pointer to the bgwriter statistics struct.
   * -
   */
! PgStat_BgWriterStats *
! pgstat_fetch_stat_bgwriter(void)
  {
  	backend_read_statsfile();
  
! 	return bgwriterStats;
  }
  
  
***
*** 3594,3600  pgstat_write_statsfiles(bool permanent, bool allDbs)
  	/*
  	 * Set the timestamp of the stats file.
  	 */
! 	globalStats.stats_timestamp = GetCurrentTimestamp();
  
  	/*
  	 * Write the file header --- currently just a format ID.
--- 3595,3601 
  	/*
  	 * Set the timestamp of the stats file.
  	 */
! 	global_stat_timestamp = GetCurrentTimestamp();
  
  	/*
  	 * Write the file header --- currently just a format ID.
***
*** 3604,3612  pgstat_write_statsfiles(bool permanent, bool allDbs)
  	(void) rc;	/* we'll check for error with ferror */
  
  	/*
! 	 * Write global stats struct
  	 */
! 	rc = fwrite(globalStats, sizeof(globalStats), 1, fpout);
  	(void) rc;	/* we'll check for error with ferror */
  
  	/*
--- 3605,3613 
  	(void) rc;	/* we'll check for error with ferror */
  
  	/*
! 	 * Write bgwriter stats struct
  	 */
! 	rc = fwrite(bgwriterStats, sizeof(bgwriterStats), 1, fpout);
  	(void) rc;	/* we'll check for error with ferror */
  
  	/*
***
*** 3630,3636  pgstat_write_statsfiles(bool permanent, bool allDbs)
  		 */
  		if (allDbs || pgstat_db_requested(dbentry-databaseid))
  		{
! 			dbentry-stats_timestamp = globalStats.stats_timestamp;
  			pgstat_write_db_statsfile(dbentry, permanent);
  		}
  
--- 3631,3637 
  		 */
  		if (allDbs || pgstat_db_requested(dbentry-databaseid))
  		{
! 			dbentry-stats_timestamp = global_stat_timestamp;
  			pgstat_write_db_statsfile(dbentry, permanent);
  		}
  
***
*** 3881,3898  pgstat_read_statsfiles(Oid onlydb, bool permanent, bool deep)
  		 HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
  
  	/*
! 	 * Clear out global and archiver statistics so they start from zero
  	 * in case we can't load an existing statsfile.
  	 */
! 	memset(globalStats, 0, sizeof(globalStats));
  	memset(archiverStats, 0, sizeof(archiverStats));
  
  	/*
  	 * Set the current timestamp (will be kept only in case we can't load an
  	 * existing statsfile).
  	 */
! 	globalStats.stat_reset_timestamp = GetCurrentTimestamp();
! 	archiverStats.stat_reset_timestamp = globalStats.stat_reset_timestamp;
  
  	/*
  	 * Try to open the stats file. If it doesn't exist, the backends simply
--- 3882,3899 
  		 HASH_ELEM | HASH_FUNCTION | HASH_CONTEXT);
  
  	/*
! 	 * Clear out bgwriter and archiver statistics so they start from zero
  	 * in case we can't load an existing statsfile.
  	 */
! 	memset(bgwriterStats, 0, sizeof(bgwriterStats));
  	memset(archiverStats, 0, sizeof(archiverStats));
  
  	/*
  	 * Set the current timestamp (will be kept only in case we can't load an
  	 * existing statsfile).
  	 */
! 	bgwriterStats.stat_reset_timestamp = GetCurrentTimestamp();
! 	archiverStats.stat_reset_timestamp = bgwriterStats.stat_reset_timestamp;
  
  	/*
  	 * Try to open the stats file. If it doesn't exist, the backends simply
***
*** 3925,3933  pgstat_read_statsfiles(Oid onlydb, bool permanent, bool 

Re: [HACKERS] Infinite recursion in row-security based on updatable s.b. views

2014-01-29 Thread Craig Ringer
On 01/28/2014 02:11 PM, Craig Ringer wrote:
  My first thought is to add a boolean flag to RangeTblEntry (similar to
  the inh flag) to say whether RLS expansion is requested for that
  RTE. Then set it to false on each RTE as you add new RLS quals to it's
  securityQuals.
 That's what I was getting at with adding flags to RangeTblEntry, yes.
 
 Given the number of flags we're growing I wonder if they should be
 consolidated into a bitmask, but I'll leave that problem for later.
 
 I'll do this part first, since it seems you agree that a RangeTblEntry
 flag is the appropriate path. That'll make row-security checking work
 and make the patch testable.
 
 It won't deal with recursive rules, they'll still crash the backend.
 I'll deal with that as a further step.
 

I've put together a working RLS patch on top of updatable security
barrier views.

It has some known issues remaining; it doesn't do recursion checking
yet, and it fails some of the regression tests in exciting ways. I'm
looking into them step by step.

There are some differences in test behaviour due to the inheritance rules
changing; many appear to be oversights or bugs yet to
be chased down.

You can find it here;

https://github.com/ringerc/postgres/compare/rls-9.4-upd-sb-views

i.e. https://github.com/ringerc/postgres.git ,
 branch rls-9.4-upd-sb-views

(subject to rebasing) or the non-rebased tag rls-9.4-upd-sb-views-v2

The guts of the patch appear as a diff, attached, but it's not
standalone so I suggest using git.

I'll be looking into recursion issues and the test failures tomorrow.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From 03c02389c7b475d06864f9a7f38fd583c6e891e9 Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Tue, 28 Jan 2014 14:23:23 +0800
Subject: [PATCH 1/4] RLS: Add rowsec_done flag to RangeTblEntry

To be used to track completion status of row security expansion on a
range table entry. Rels with this flag set must not be row-security
expanded.
---
 src/backend/nodes/copyfuncs.c  | 1 +
 src/backend/nodes/outfuncs.c   | 1 +
 src/backend/nodes/readfuncs.c  | 1 +
 src/include/nodes/parsenodes.h | 1 +
 4 files changed, 4 insertions(+)

diff --git a/src/backend/nodes/copyfuncs.c b/src/backend/nodes/copyfuncs.c
index 3e2acf0..9607911 100644
--- a/src/backend/nodes/copyfuncs.c
+++ b/src/backend/nodes/copyfuncs.c
@@ -1975,6 +1975,7 @@ _copyRangeTblEntry(const RangeTblEntry *from)
 	COPY_SCALAR_FIELD(rtekind);
 	COPY_SCALAR_FIELD(relid);
 	COPY_SCALAR_FIELD(relkind);
+	COPY_SCALAR_FIELD(rowsec_done);
 	COPY_NODE_FIELD(subquery);
 	COPY_SCALAR_FIELD(security_barrier);
 	COPY_SCALAR_FIELD(rowsec_relid);
diff --git a/src/backend/nodes/outfuncs.c b/src/backend/nodes/outfuncs.c
index dd447ed..974d169 100644
--- a/src/backend/nodes/outfuncs.c
+++ b/src/backend/nodes/outfuncs.c
@@ -2372,6 +2372,7 @@ _outRangeTblEntry(StringInfo str, const RangeTblEntry *node)
 		case RTE_RELATION:
 			WRITE_OID_FIELD(relid);
 			WRITE_CHAR_FIELD(relkind);
+			WRITE_BOOL_FIELD(rowsec_done);
 			break;
 		case RTE_SUBQUERY:
 			WRITE_NODE_FIELD(subquery);
diff --git a/src/backend/nodes/readfuncs.c b/src/backend/nodes/readfuncs.c
index a0cb369..7a43b59 100644
--- a/src/backend/nodes/readfuncs.c
+++ b/src/backend/nodes/readfuncs.c
@@ -1214,6 +1214,7 @@ _readRangeTblEntry(void)
 		case RTE_RELATION:
 			READ_OID_FIELD(relid);
 			READ_CHAR_FIELD(relkind);
+			READ_BOOL_FIELD(rowsec_done);
 			break;
 		case RTE_SUBQUERY:
 			READ_NODE_FIELD(subquery);
diff --git a/src/include/nodes/parsenodes.h b/src/include/nodes/parsenodes.h
index b3e718a..b3ae722 100644
--- a/src/include/nodes/parsenodes.h
+++ b/src/include/nodes/parsenodes.h
@@ -736,6 +736,7 @@ typedef struct RangeTblEntry
 	 */
 	Oid			relid;			/* OID of the relation */
 	char		relkind;		/* relation kind (see pg_class.relkind) */
+	bool		rowsec_done;	/* True if row-security quals checked for and applied already */
 
 	/*
 	 * Fields valid for a subquery RTE (else NULL):
-- 
1.8.3.1

From b0f5797d81a4b856f26818e14c93a0ec453a35de Mon Sep 17 00:00:00 2001
From: Craig Ringer cr...@2ndquadrant.com
Date: Fri, 24 Jan 2014 16:18:40 +0800
Subject: [PATCH 2/4] RLS: Enforce row-security constraints

Row-security constraints are enforced here by having the securityQual expansion
code check for a row-security constraint before expanding quals.
---
 src/backend/optimizer/prep/prepsecurity.c |  17 +++-
 src/backend/optimizer/prep/prepunion.c|   5 ++
 src/backend/optimizer/util/Makefile   |   3 +-
 src/backend/optimizer/util/rowsecurity.c  | 144 ++
 src/include/optimizer/rowsecurity.h   |  25 ++
 5 files changed, 192 insertions(+), 2 deletions(-)
 create mode 100644 src/backend/optimizer/util/rowsecurity.c
 create mode 100644 src/include/optimizer/rowsecurity.h

diff --git a/src/backend/optimizer/prep/prepsecurity.c b/src/backend/optimizer/prep/prepsecurity.c

Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Vik Fearing
On 01/29/2014 01:12 PM, Heikki Linnakangas wrote:
 On 01/28/2014 06:11 PM, Christian Kruse wrote:
 Hi,

 attached you will find a new version of the patch, ported to HEAD,
 fixed the mentioned bug and - hopefully - dealing the the remaining
 issues.

 Thanks, I have committed this now.

 The documentation is still lacking.


The documentation is indeed lacking since it breaks the build.

doc/src/sgml/config.sgml contains the line

normal allocation if that fails. With literalon/literal, failure

which doesn't correctly terminate the closing /literal tag.

Trivial patch attached.

-- 
Vik

*** a/doc/src/sgml/config.sgml
--- b/doc/src/sgml/config.sgml
***
*** 1137,1143  include 'filename'
 para
  With varnamehuge_tlb_pages/varname set to literaltry/literal,
  the server will try to use huge pages, but fall back to using
! normal allocation if that fails. With literalon/literal, failure
  to use huge pages will prevent the server from starting up. With
  literaloff/literal, huge pages will not be used.
 /para
--- 1137,1143 
 para
  With varnamehuge_tlb_pages/varname set to literaltry/literal,
  the server will try to use huge pages, but fall back to using
! normal allocation if that fails. With literalon/literal, failure
  to use huge pages will prevent the server from starting up. With
  literaloff/literal, huge pages will not be used.
 /para

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Andrew Dunstan


On 01/29/2014 02:58 AM, Peter Geoghegan wrote:



I am not opposed in principle to adding new things to the counters
struct in pg_stat_statements. I just think that the fact that the
overhead of installing the module on a busy production system is
currently so low is of *major* value, and therefore any person that
proposes to expand that struct should be required to very conclusively
demonstrate that there is no appreciable increase in overhead. Having
a standard deviation column would be nice, but it's still not that
important. Maybe when we have portable atomic addition we can afford
to be much more inclusive of that kind of thing.




Importance is in the eye of the beholder.  As far as I'm concerned, min 
and max are of FAR less value than stddev. If stddev gets left out I'm 
going to be pretty darned annoyed, especially since the benchmarks seem 
to show the marginal cost as being virtually unmeasurable. If those 
aren't enough for you, perhaps you'd like to state what sort of tests 
would satisfy you.


cheers

andrew





--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Heikki Linnakangas

On 01/29/2014 04:01 PM, Vik Fearing wrote:

On 01/29/2014 01:12 PM, Heikki Linnakangas wrote:

The documentation is still lacking.


The documentation is indeed lacking since it breaks the build.

doc/src/sgml/config.sgml contains the line

 normal allocation if that fails. With literalon/literal, failure

which doesn't correctly terminate the closing /literal tag.

Trivial patch attached.


Thanks, applied!

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add force option to dropdb

2014-01-29 Thread Sawada Masahiko
On Tue, Jan 28, 2014 at 11:01 PM, salah jubeh s_ju...@yahoo.com wrote:
 Hello Sawada,

- This patch is not patched to master branch
 Sorry, My mistake
//new connections are not allowed
 Corrected.

 I hope the patch is in better state now; if something is left, I will be glad
 to fix it.
 Regards


 On Tuesday, January 28, 2014 4:17 AM, Sawada Masahiko
 sawada.m...@gmail.com wrote:
 On 17 January 2014 0:56, salah jubeh s_ju...@yahoo.com wrote:

If the user owns objects, that will prevent this from working also.  I
have the feeling that adding DROP OWNED BY and/or REASSIGNED OWNED BY
calls to this utility would be a bit excessive, but who knows.

 Please find attached the first attempt to drop drop the client
 connections.
 I have added an option -k, --kill instead of force since killing client
 connection does not guarantee -drop force-.
 Regards


 On Tuesday, January 14, 2014 8:06 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 salah jubeh wrote:

 For the sake of completeness:
 1. I think also, I need also to temporary disallow conecting to the
 database, is that right?
 2. Is there other factors can hinder dropping database?

 If the user owns objects, that will prevent this from working also.  I
 have the feeling that adding DROP OWNED BY and/or REASSIGNED OWNED BY
 calls to this utility would be a bit excessive, but who knows.


 3. Should I write two patches one for pg_version=9.2 and one for
 pg_version9.2


 No point -- nothing gets applied to branches older than current
 development anyway.


 Thank you for the patch.
 And sorry for delay in reviewing.

 I started to look this patch, So the following is first review comment.

 - This patch is not patched to master branch
 I tried to apply this patch file to the master branch, but I got the
 following error.
 $ cd postgresql
 $ patch -d. -p1  ../dropdb.patch
 can't find file to patch at input line 3
 Perhaps you used the wrong -p or --strip option?
 the text leading up to this was:
 --
 |--- dropdb_org.c 2014-01-16
 |+++ dropdb.c 2014-01-16
 --

 There is no dropdb_org.c. I think that you made a mistake when the
 patch was created.

 - This patch does not follow the coding rules
 For example, line 71 of the patch:
 //new connections are not allowed
 It should be:
 /* new connections are not allowed */
 (Comment blocks that need specific line breaks should be formatted as
 block comments, where the comment starts as /*--.)
 Please refer to the coding rules.
 http://wiki.postgresql.org/wiki/Developer_FAQ#What.27s_the_formatting_style_used_in_PostgreSQL_source_code.3F



Thank you for updating patch!

It did not work with the following case.

---
$ createdb -U postgres hoge
$ psql -d hoge -U postgres
hoge=# create table test (col text);
hoge=# insert into test select repeat(chr(code),1) from
generate_series(1,10) code;

Execute dropdb -k while the client is inserting many tuples into database
$ dropdb -k hoge
2014-01-29 23:10:49 JST FATAL:  terminating connection due to
administrator command
2014-01-29 23:10:49 JST STATEMENT:  insert into test select
repeat(chr(code),1) from generate_series(1,200) code;
2014-01-29 23:10:54 JST ERROR:  database hoge is being accessed by other users
2014-01-29 23:10:54 JST DETAIL:  There is 1 other session using the database.
2014-01-29 23:10:54 JST STATEMENT:  DROP DATABASE hoge;

2014-01-29 23:10:54 JST ERROR:  syntax error at or near hoge at character 41
2014-01-29 23:10:54 JST STATEMENT:  UPDATE pg_database SET
datconnlimit = e hoge is being accessed by other users WHERE
datname= 'hoge';
dropdb: database removal failed: ERROR:  syntax error at or near hoge
LINE 1: UPDATE pg_database SET datconnlimit = e hoge is being acce...
^
hoge=# \l
 List of databases
   Name|  Owner   | Encoding | Collate | Ctype |   Access privileges
---+--+--+-+---+---
 hoge  | postgres | UTF8 | C   | C |
 postgres  | postgres | UTF8 | C   | C |
 template0 | postgres | UTF8 | C   | C | =c/postgres  +
   |  |  | |   | postgres=CTc/postgres
 template1 | postgres | UTF8 | C   | C | =c/postgres  +
   |  |  | |   | postgres=CTc/postgres

The hoge database is not dropped yet.
Is this a bug, or not?


Regards,

---
Sawada Masahiko


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Merlin Moncure
On Tue, Jan 28, 2014 at 5:58 AM, Christian Kruse
christ...@2ndquadrant.com wrote:
 Hi,

 On 28/01/14 13:51, Heikki Linnakangas wrote:
 Oh darn, I remembered we had already committed this, but clearly not. I'd
 love to still get this into 9.4. The latest patch (hugepages-v5.patch) was
 pretty much ready for commit, except for documentation.

 I'm working on it. I ported it to HEAD and currently doing some
 benchmarks. Next will be documentation.

you mentioned benchmarks -- do you happen to have the results handy? (curious)

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Performance Improvement by reducing WAL for Update Operation

2014-01-29 Thread Heikki Linnakangas

On 01/29/2014 02:21 PM, Amit Kapila wrote:

On Wed, Jan 29, 2014 at 3:41 PM, Heikki Linnakangas
hlinnakan...@vmware.com wrote:

For example, if the new tuple is an exact copy of the old tuple,
except for one additional byte in the beginning, the algorithm would fail to
recognize that. It would be good to add a test case like that in the test
suite.

You can skip bytes when building the history table, or when finding matches,
but not both. Or you could skip N bytes, and then check for matches for the
next four bytes, then skip again and so forth, as long as you always check
four consecutive bytes (because the hash function is calculated from four
bytes).


Can we do something like:
Build Phase
a. Calculate the hash and add an entry to the history table every 4 bytes.

Match Phase
a. Calculate the hash in a rolling fashion and try to find a match at every byte.


Sure, that'll work. However, I believe it's cheaper to add entries to 
the history table at every byte, and check for a match every 4 bytes. I 
think you'll get more or less the same level of compression either way, 
but adding to the history table is cheaper than checking for matches, 
and we'd rather do the cheap thing more often than the expensive thing.



b. When a match is found, then skip only in chunks, something like I was
 doing in the find-match function
+
+ /* consider only complete chunk matches. */
+ if (history_chunk_size == 0)
+ thislen += PGRB_MIN_CHUNK_SIZE;
+ }

Will this address the concern?


Hmm, so when checking if a match is truly a match, you compare the 
strings four bytes at a time rather than byte-by-byte? That might work, 
but I don't think that's a hot spot currently. In the profiling I did, 
with a nothing matches test case, all the cycles were spent in the 
history building, and finding matches. Finding out how long a match is 
was negligible. Of course, it might be a different story with input 
where the encoding helps and you have matches, but I think we were doing 
pretty well in those cases already.



The main reason to process in chunks as much as possible is to save
CPU cycles. For example, if we build the hash table byte-by-byte, then even
in the best case where most of the tuple has a match, there will still be
appreciable overhead due to the formation of the hash table.


Hmm. One very simple optimization we could do is to just compare the two 
strings byte by byte, before doing anything else, to find any common 
prefix they might have. Then output a tag for the common prefix, and run 
the normal algorithm on the rest of the strings. In many real-world 
tables, the 1-2 first columns are a key that never changes, so that 
might work pretty well in practice. Maybe it would also be worthwhile to 
do the same for any common suffix the tuples might have.


That would fail to find matches where you e.g. update the last column to 
have the same value as the first column, and change nothing else, but 
that's ok. We're not aiming for the best possible compression, just 
trying to avoid WAL-logging data that wasn't changed.



Here, during the match phase, I think we can avoid copying literal bytes until
a match is found; that can save cycles in cases where the old and new
tuples are mostly different.


I think the extra if's in the loop will actually cost you more cycles 
than you save. You could perhaps have two copies of the main 
match-finding loop though. First, loop without outputting anything, 
until you find the first match. Then, output anything up to that point 
as literals. Then fall into the second loop, which outputs any 
non-matches byte by byte.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP patch (v2) for updatable security barrier views

2014-01-29 Thread Simon Riggs
On 29 January 2014 12:20, Dean Rasheed dean.a.rash...@gmail.com wrote:

 @Dean: If we moved that to rewriter, would expand_security_quals() and
 preprocess_rowmarks() also move?


 Actually I tend to think that expand_security_quals() should remain
 where it is, regardless of where inheritance expansion happens. One of
 the things that simplifies the job that expand_security_quals() has to
 do is that it takes place after preprocess_targetlist(), so the
 targetlist for the security barrier subqueries that it constructs is
 known. Calling expand_security_quals() earlier would require
 additional surgery on preprocess_targetlist() and expand_targetlist(),
 and would probably also mean that the Query would need to record the
 sourceRelation subquery RTE, as discussed upthread.

It looks to me like preprocess_targetlist() could also be done earlier,
if necessary, since it appears to contain checks and additional
columns, not anything related to optimization.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] jsonb generation functions

2014-01-29 Thread Andrew Dunstan
In the jsonb patch I have been working on, I have replicated all of what 
I call the json processing functions, and I will shortly add analogs for 
the new functions in that category json_to_record and json_to_recordset.


However I have not replicated what I call the json generation functions, 
array_to_json, row_to_json, to_json, and the new functions 
json_build_array, json_build_object, and json_object, nor the aggregate 
functions json_agg and the new json_object_agg. The reason for that is 
that I have always used those for constructing json given to the client, 
rather than json stored in the database, and for such a use there would 
be no point in turning it into jsonb rather than generating the json 
string directly.


However, I could be persuaded that we should have a jsonb analog of 
every json function. If we decide that, the next question is whether we 
have to have it now, or if it can wait.


(The other notable thing that's missing, and I think can't wait, is 
casts from json to jsonb and vice versa. I'm going to work on that 
immediately.)


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb generation functions

2014-01-29 Thread Merlin Moncure
On Wed, Jan 29, 2014 at 9:08 AM, Andrew Dunstan and...@dunslane.net wrote:
 In the jsonb patch I have been working on, I have replicated all of what I
 call the json processing functions, and I will shortly add analogs for the
 new functions in that category json_to_record and json_to_recordset.

 However I have not replicated what I call the json generation functions,
 array_to_json, row_to_json, to_json, and the new functions json_build_array,
 json_build_object, and json_object, nor the aggregate functions json_agg and
 the new json_object_agg. The reason for that is that I have always used
 those for constructing json given to the client, rather than json stored in
 the database, and for such a use there would be no point in turning it into
 jsonb rather than generating the json string directly.

 However, I could be persuaded that we should have a jsonb analog of every
 json function. If we decide that, the next question is whether we have to
 have it now, or if it can wait.

my 0.02$: it can wait -- possibly forever.  Assuming the casts work I
see absolutely no issue whatsoever with asking users to do:

select xx_to_json(something complex)::jsonb;
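
For instance, once the json-to-jsonb cast is in place, something like this
(a sketch using the existing row_to_json):

select row_to_json(t)::jsonb
from (select 1 as a, 'foo'::text as b) as t;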

If you examine all the use cases, json and jsonb, while they certainly
have some overlap, are going to be used in different patterns.  In
hindsight the type bifurcation was a good thing, ISTM.

I don't think there should be any expectation for 100% match of the
API especially since you can slide things around with casts.  In fact,
for heavy serialization at the end of the day it might be better to
defer the jsonb creation to the final stage of serialization anyways
(avoiding iterative insertion) even if we did match the API.

(can't hurt to manage scope either considering the timelines for 9.4)

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb generation functions

2014-01-29 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 In the jsonb patch I have been working on, I have replicated all of what 
 I call the json processing functions, and I will shortly add analogs for 
 the new functions in that category json_to_record and json_to_recordset.

 However I have not replicated what I call the json generation functions, 
 array_to_json, row_to_json, to_json, and the new functions 
 json_build_array, json_build_object, and json_object, nor the aggregate 
 functions json_agg and the new json_object_agg. The reason for that is 
 that I have always used those for constructing json given to the client, 
 rather than json stored in the database, and for such a use there would 
 be no point in turning it into jsonb rather than generating the json 
 string directly.

 However, I could be persuaded that we should have a jsonb analog of 
 every json function. If we decide that, the next question is whether we 
 have to have it now, or if it can wait.

 (The other notable thing that's missing, and I think can't wait, is 
 casts from json to jsonb and vice versa. I'm going to work on that 
 immediately.)

It disturbs me that two weeks into CF4, we appear to still be in
full-speed-ahead development mode for jsonb.  Every other feature
that's hoping to get into 9.4 is supposed to have a completed patch
under review by the CF process.

If jsonb is an exception, why?  It seems to have already gotten a
pass on the matter of documentation quality.  I'm reluctant to write
a blank check for more code.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb generation functions

2014-01-29 Thread Andrew Dunstan


On 01/29/2014 10:21 AM, Tom Lane wrote:

Andrew Dunstan and...@dunslane.net writes:

In the jsonb patch I have been working on, I have replicated all of what
I call the json processing functions, and I will shortly add analogs for
the new functions in that category json_to_record and json_to_recordset.
However I have not replicated what I call the json generation functions,
array_to_json, row_to_json, to_json, and the new functions
json_build_array, json_build_object, and json_object, nor the aggregate
functions json_agg and the new json_object_agg. The reason for that is
that I have always used those for constructing json given to the client,
rather than json stored in the database, and for such a use there would
be no point in turning it into jsonb rather than generating the json
string directly.
However, I could be persuaded that we should have a jsonb analog of
every json function. If we decide that, the next question is whether we
have to have it now, or if it can wait.
(The other notable thing that's missing, and I think can't wait, is
casts from json to jsonb and vice versa. I'm going to work on that
immediately.)

It disturbs me that two weeks into CF4, we appear to still be in
full-speed-ahead development mode for jsonb.  Every other feature
that's hoping to get into 9.4 is supposed to have a completed patch
under review by the CF process.

If jsonb is an exception, why?  It seems to have already gotten a
pass on the matter of documentation quality.  I'm reluctant to write
a blank check for more code.




In that case I will add the casts, which are trivial but essential, and 
be done. Apart from that there is no essential development work.


FYI, the principal causes of delay have been a) some ill health on my 
part and b) Russian vacation season. I've been very conscious that we've 
been stretching the deadline a bit.


I am going to change the documentation stuff you griped about, in the 
way I previously suggested.


cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Review: Patch FORCE_NULL option for copy COPY in CSV mode

2014-01-29 Thread Ian Lawrence Barwick
2014/1/29 Ian Lawrence Barwick barw...@gmail.com:
 2014-01-29 Andrew Dunstan and...@dunslane.net:

 On 01/28/2014 05:55 AM, Ian Lawrence Barwick wrote:


 Hi Payal

 Many thanks for the review, and my apologies for not getting back to
 you earlier.

 Updated version of the patch attached with suggested corrections.

 On a very quick glance, I see that you have still not made adjustments to
 contrib/file_fdw to accommodate this new option. I don't see why this COPY
 option should be different in that respect.

 Hmm, that idea seems to have escaped me completely. I'll get onto it 
 forthwith.

Striking while the keyboard is hot... version with contrib/file_fdw
modifications
attached.
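
For reference, a sketch of how the new per-column option is meant to be used
(server name, table names and file paths are placeholders; the option
spelling force_null is taken from the patch below):

CREATE FOREIGN TABLE text_csv (
    word1 text,
    word2 text OPTIONS (force_null 'true')
) SERVER file_server
OPTIONS (format 'csv', filename '/path/to/text.csv');

-- and the corresponding COPY form:
COPY some_table FROM '/path/to/data.csv'
    WITH (FORMAT csv, FORCE_NULL (word2));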


Regards

Ian Barwick
diff --git a/contrib/file_fdw/data/text.csv b/contrib/file_fdw/data/text.csv
new file mode 100644
index ed348a9..c7e243c
*** a/contrib/file_fdw/data/text.csv
--- b/contrib/file_fdw/data/text.csv
***
*** 1,4 
! AAA,aaa
! XYZ,xyz
! NULL,NULL
! ABC,abc
--- 1,4 
! AAA,aaa,123,
! XYZ,xyz,,321
! NULL,NULL,NULL,NULL
! ABC,abc,,
diff --git a/contrib/file_fdw/file_fdw.c b/contrib/file_fdw/file_fdw.c
new file mode 100644
index 5639f4d..5877512
*** a/contrib/file_fdw/file_fdw.c
--- b/contrib/file_fdw/file_fdw.c
*** struct FileFdwOption
*** 48,56 
  
  /*
   * Valid options for file_fdw.
!  * These options are based on the options for COPY FROM command.
!  * But note that force_not_null is handled as a boolean option attached to
!  * each column, not as a table option.
   *
   * Note: If you are adding new option for user mapping, you need to modify
   * fileGetOptions(), which currently doesn't bother to look at user mappings.
--- 48,56 
  
  /*
   * Valid options for file_fdw.
!  * These options are based on the options for the COPY FROM command.
!  * But note that force_not_null and force_null are handled as boolean options
!  * attached to a column, not as a table option.
   *
   * Note: If you are adding new option for user mapping, you need to modify
   * fileGetOptions(), which currently doesn't bother to look at user mappings.
*** static const struct FileFdwOption valid_
*** 69,75 
  	{null, ForeignTableRelationId},
  	{encoding, ForeignTableRelationId},
  	{force_not_null, AttributeRelationId},
! 
  	/*
  	 * force_quote is not supported by file_fdw because it's for COPY TO.
  	 */
--- 69,75 
  	{null, ForeignTableRelationId},
  	{encoding, ForeignTableRelationId},
  	{force_not_null, AttributeRelationId},
! 	{force_null, AttributeRelationId},
  	/*
  	 * force_quote is not supported by file_fdw because it's for COPY TO.
  	 */
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 187,192 
--- 187,193 
  	Oid			catalog = PG_GETARG_OID(1);
  	char	   *filename = NULL;
  	DefElem*force_not_null = NULL;
+ 	DefElem*force_null = NULL;
  	List	   *other_options = NIL;
  	ListCell   *cell;
  
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 243,252 
  		}
  
  		/*
! 		 * Separate out filename and force_not_null, since ProcessCopyOptions
! 		 * won't accept them.  (force_not_null only comes in a boolean
! 		 * per-column flavor here.)
  		 */
  		if (strcmp(def-defname, filename) == 0)
  		{
  			if (filename)
--- 244,253 
  		}
  
  		/*
! 		 * Separate out filename and column-specific options, since
! 		 * ProcessCopyOptions won't accept them.
  		 */
+ 
  		if (strcmp(def-defname, filename) == 0)
  		{
  			if (filename)
*** file_fdw_validator(PG_FUNCTION_ARGS)
*** 255,270 
  		 errmsg(conflicting or redundant options)));
  			filename = defGetString(def);
  		}
  		else if (strcmp(def-defname, force_not_null) == 0)
  		{
  			if (force_not_null)
  ereport(ERROR,
  		(errcode(ERRCODE_SYNTAX_ERROR),
! 		 errmsg(conflicting or redundant options)));
  			force_not_null = def;
  			/* Don't care what the value is, as long as it's a legal boolean */
  			(void) defGetBoolean(def);
  		}
  		else
  			other_options = lappend(other_options, def);
  	}
--- 256,297 
  		 errmsg(conflicting or redundant options)));
  			filename = defGetString(def);
  		}
+ 		/*
+ 		 * force_not_null is a boolean option; after validation we can discard
+ 		 * it - it will be retrieved later in get_file_fdw_attribute_options()
+ 		 */
  		else if (strcmp(def-defname, force_not_null) == 0)
  		{
  			if (force_not_null)
  ereport(ERROR,
  		(errcode(ERRCODE_SYNTAX_ERROR),
! 		 errmsg(conflicting or redundant options),
! 		 errhint(option \force_not_null\ supplied more than once for a column)));
! 			if(force_null)
! ereport(ERROR,
! 		(errcode(ERRCODE_SYNTAX_ERROR),
! 		 errmsg(conflicting or redundant options),
! 		 errhint(option \force_not_null\ cannot be used together with \force_null\)));
  			force_not_null = def;
  			/* Don't care what the value is, as long as it's a legal boolean */
  			(void) defGetBoolean(def);
  		}
+ 		/* See comments for force_not_null above */

Re: [HACKERS] [PATCH] Relocation of tablespaces in pg_basebackup

2014-01-29 Thread Steeve Lennmark
New patch attached with the following changes:

On Thu, Jan 23, 2014 at 11:01 AM, Steeve Lennmark stee...@handeldsbanken.se
 wrote:

 On Thu, Jan 23, 2014 at 2:06 AM, Peter Eisentraut pete...@gmx.net wrote:

 I'm not totally happy with the choice of : as the mapping separator,
 because that would always require escaping on Windows.  We could make it
 analogous to the path handling and make ; the separator on Windows.
 Then again, this is not a path, so maybe it should look like one.  We
 pick something else altogether, like =.

 The option name --tablespace isn't very clear.  It ought to be
 something like --tablespace-mapping.


 This design choice I made about -T (--tablespace) and colon as
 separator was copied from pg_barman, which is the way I back up my
 clusters at the moment. Renaming the option to --tablespace-mapping and
 changing the syntax to something like = is totally fine by me.


I changed the directory separator from : to = but made it
configurable in the code.


  I don't think we should require the new directory to be an absolute
 path.  It should be relative to the current directory, just like the -D
 option does it.


 Accepting a relative path should be fine, I made a failed attempt using
 realpath(3) initially but I guess checking for [0] != '/' and
 prepending getcwd(3) would suffice.


Relative paths are now accepted. This code will probably not work on
windows though. I tried setting up Windows 8 with the git version of
postgres but was unsuccessful, so I can't really test any of this.
Help on this subject (Windows) would be much appreciated.


  Somehow, I had the subconscious assumption that this feature would do
 prefix matching on the directories, not only complete string matching.
 So if I had tablespaces in /mnt/data1, /mnt/data2, /mnt/data3, I could
 map them all with -T /mnt:mnt and be done.  Not sure if that is useful
 to many, but it's worth a thought.


 I like that a lot, but I'm afraid the code would have to get a bit more
 complex to add that functionality. It would be an easier rewrite if we
 added a hint character, something like -T '/mnt/*:mnt'.


This is now implemented as suggested by Peter in his previous comment.
-T /mnt:mnt now relocates all tablespaces under /mnt to the relative
path mnt.

 Review style guide for error messages:
 http://www.postgresql.org/docs/devel/static/error-style-guide.html


I've updated error messages according to the style guide.

 We need to think about how to handle this on platforms without symlinks.
 I don't like just printing an error message and moving on.  It should be
 either pass or fail or an option to choose between them.


 I hope someone with experience with those kinds of systems can come up
 with suggestions on how to solve that. I only run Postgres on Linux.


I would still love some input on this.

 Please put the options in the getopt handling in alphabetical order.
 It's not very alphabetical now, but between D and F is probably not the
 best place. ;-)


 Done.


//Steeve


0006-pg_basebackup-relocate-tablespace.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] patch: option --if-exists for pg_dump

2014-01-29 Thread Pavel Stehule
2014-01-29 Pavel Stehule pavel.steh...@gmail.com




 2014-01-29 Jeevan Chalke jeevan.cha...@enterprisedb.com

 Hi Pavel,

  Now the patch looks good to me. However, when I try to restore your own
  sql file's dump, I get the following errors:

 pg_restore: [archiver (db)] could not execute query: ERROR:  relation
 public.emp does not exist
 Command was: DROP TRIGGER IF EXISTS emp_insert_trigger ON public.emp;

 pg_restore: [archiver (db)] could not execute query: ERROR:  schema
 myschema does not exist
 Command was: DROP FUNCTION IF EXISTS myschema.int_to_date(integer);

 Is that expected after your patch ?


 it should be fixed by
 http://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=b152c6cd0de1827ba58756e24e18110cf902182acommit



 Also, I didn't quite understand these lines of comments:

 /*
  * Descriptor string (te-desc) should not be same
 as object
  * specifier for DROP STATEMENT. The DROP DEFAULT
 has not
  * IF EXISTS clause - has not sense.
  */

 Will you please rephrase ?


 I can try it - .

 The content of te-desc is usually a substring of the DROP statement, with
 one related exception - CONSTRAINT.
 Independently of the previous sentence - ALTER TABLE ... ALTER COLUMN ...
 DROP DEFAULT doesn't support IF EXISTS, and therefore it should not be
 injected.


is it ok?



 Regards

 Pavel



 Thanks
 --
 Jeevan B Chalke
 Principal Software Engineer, Product Development
 EnterpriseDB Corporation
 The Enterprise PostgreSQL Company





Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Jeff Janes
On Wed, Jan 29, 2014 at 4:12 AM, Heikki Linnakangas hlinnakan...@vmware.com
 wrote:

 On 01/28/2014 06:11 PM, Christian Kruse wrote:

 Hi,

 attached you will find a new version of the patch, ported to HEAD,
 fixed the mentioned bug and - hopefully - dealing the the remaining
 issues.


 Thanks, I have committed this now.


I'm getting this warning now with gcc (GCC) 4.4.7:

pg_shmem.c: In function 'PGSharedMemoryCreate':
pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
function
pg_shmem.c:332: note: 'allocsize' was declared here


Cheers,

Jeff


Re: [HACKERS] proposal: hide application_name from other users

2014-01-29 Thread Simon Riggs
On 28 January 2014 20:14, Josh Berkus j...@agliodbs.com wrote:
 On 01/28/2014 12:10 PM, Tom Lane wrote:
 Josh Berkus j...@agliodbs.com writes:
 For example, I would really like to GRANT an unpriv user access to the
 WAL columns in pg_stat_replication so that I can monitor replication
 delay without granting superuser permissions.

 Just out of curiosity, why is that superuser-only at all?  AFAICS the
 hidden columns are just some LSNs ... what is the security argument
 for hiding them in the first place?

 Beats me, I can't find any discussion on it at all.

No specific reason that I can recall but replication is heavily
protected by layers of security.

pg_stat_replication is a join with pg_stat_activity, so some of the
info is open, some closed. It seems possible to relax that.


Presumably the current patch is returned with feedback? Or can we fix
these problems by inventing a new user aspect called MONITOR (similar
to REPLICATION)? We can grant application_name and replication details
to that.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Tom Lane
Christian Kruse christ...@2ndquadrant.com writes:
 Ok, so I propose the attached patch as a fix.

No, what I meant is that the ereport caller needs to save errno, rather
than assuming that (some subset of) ereport-related subroutines will
preserve it.

In general, it's unsafe to assume that any nontrivial subroutine preserves
errno, and I don't particularly want to promise that the ereport functions
are an exception.  Even if we did that, this type of coding would still
be risky.  Here are some examples:

   ereport(...,
   foo() ? errdetail(...) : 0,
   (errno == something) ? errhint(...) : 0);

If foo() clobbers errno and returns false, there is nothing that elog.c
can do to make this coding work.

   ereport(...,
   errmsg(%s belongs to %s,
  foo(), (errno == something) ? this : that));

Again, even if every single elog.c entry point saved and restored errno,
this coding wouldn't be safe.

I don't think we should try to make the world safe for some uses of errno
within ereport logic, when there are other very similar-looking uses that
we cannot make safe.  What we need is a coding rule that you don't do
that.
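
In other words, the safe pattern is for the call site to capture errno up
front and test only the saved copy, along these lines (placeholder message
and hint text, not a real ereport from the tree):

	int			saved_errno = errno;	/* capture before anything can clobber it */

	ereport(FATAL,
			(errmsg("could not do the thing: %m"),
			 (saved_errno == ENOMEM) ?
			 errhint("Placeholder hint for the out-of-memory case.") : 0));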

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] FOREIGN KEY ... CONCURRENTLY

2014-01-29 Thread David Fetter
Esteemed hackers,

I can't be the only person to have encountered a situation where
adding a new foreign key pointing at a busy table essentially never
happens because the way things work now, creating the constraint
trigger on that busy table requires an AccessExclusive lock, or a
unicorn, whichever you can acquire first.

So I'd like to propose, per a conversation with Andrew Gierth, that we
make an option to create foreign keys concurrently, which would mean
in essence that the referencing table would:

1) need to be empty, at least in the first version, and

2) needs to stay in a non-writeable state until all possible
conflicting transactions had ended.

Now, the less-fun part.  Per Andres Freund, the current support for
CONCURRENTLY in other operations is complex and poorly understood, and
there's no reason to believe this new CONCURRENTLY would be simpler or
easier to understand.

A couple of questions:

1) Would people like to have FOREIGN KEY ... CONCURRENTLY as
described above?

2) Is there another way to solve the problem of adding a foreign
key constraint that points at a busy table?

Cheers,
David.
-- 
David Fetter da...@fetter.org http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david.fet...@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 14:12, Heikki Linnakangas wrote:
 The documentation is still lacking. We should explain somewhere how to set
 nr.hugepages, for example. The Managing Kernel Resources section ought to
 mention this setting. Could I ask you to work on that, please?

Of course! Attached you will find a patch for better documentation.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
old mode 100644
new mode 100755
index 1b5f831..68b38f7
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1128,10 +1128,7 @@ include 'filename'
 The use of huge TLB pages results in smaller page tables and
 less CPU time spent on memory management, increasing performance. For
 more details, see
-ulink url=https://wiki.debian.org/Hugepages;the Debian wiki/ulink.
-Remember that you will need at least shared_buffers / huge page size +
-1 huge TLB pages. So for example for a system with 6GB shared buffers
-and a hugepage size of 2kb of you will need at least 3156 huge pages.
+link linkend=linux-huge-tlb-pagesLinux huge TLB pages/link.
/para
 
para
diff --git a/doc/src/sgml/runtime.sgml b/doc/src/sgml/runtime.sgml
old mode 100644
new mode 100755
index bbb808f..2288c7b
--- a/doc/src/sgml/runtime.sgml
+++ b/doc/src/sgml/runtime.sgml
@@ -1307,6 +1307,83 @@ echo -1000  /proc/self/oom_score_adj
/para
/note
   /sect2
+
+  sect2 id=linux-huge-tlb-pages
+   titleLinux huge TLB pages/title
+
+   para
+Nowadays memory address spaces for processes are virtual. This means, when
+a process reserves memory, it gets a virtual memory address which has to
+be translated to a physical memory address by the OS or the CPU. This can
+be done via calculations, but since memory is accessed very often there is
+a cache for that, called Translation Lookaside Buffer,
+short emphasisTLB/emphasis.
+   /para
+
+   para
+When a process reserves memory, this is done in chunks (often
+of literal4kb/literal) named pages. This means if a process requires
+1GB of RAM, it has literal262144/literal (literal1GB/literal
+/ literal4KB/literal) pages and therefore literal262144/literal
+entries for the translation table. Since the TLB has a limited number of
+entries it is obvious that they can't all be cached, which
+will lead to a loss of performance.
+   /para
+
+   para
+One way to tune this is to increase the page size. Most platforms allow
+larger pages, e.g. x86 allows pages of literal2MB/literal. This would
+reduce the number of pages to literal512/literal
+(literal1GB/literal / literal2MB/literal). This reduces the number
+of necessary lookups drastically.
+   /para
+
+   para
+To enable this feature in productnamePostgreSQL/productname you need a
+kernel with varnameCONFIG_HUGETLBFS=y/varname and
+varnameCONFIG_HUGETLB_PAGE=y/varname. You also have to tune the system
+setting varnamevm.nr_hugepages/varname. To calculate the number of
+necessary huge pages start productnamePostgreSQL/productname without
+huge pages enabled and check the varnameVmPeak/varname value from the
+proc filesystem:
+
+programlisting
+$ userinputhead -1 /path/to/data/directory/postmaster.pid/userinput
+4170
+$ userinputgrep ^VmPeak /proc/4170/status/userinput
+VmPeak:	 6490428 kB
+/programlisting
+ literal6490428/literal / literal2048/literal
+ (varnamePAGE_SIZE/varname literal2MB/literal) are
+ roughly literal3169.154/literal huge pages, so you will need at
+ least literal3170/literal huge pages:
+programlisting
+sysctl -w vm.nr_hugepages=3170
+/programlisting
+Sometimes the kernel is not able to allocate the desired number of huge
+pages, so it might be necessary to repeat that command or to reboot. Don't
+forget to add an entry to filename/etc/sysctl.conf/filename to persist
+this setting through reboots.
+   /para
+
+   para
+The default behavior for huge pages
+in productnamePostgreSQL/productname is to use them when possible and
+to fall back to normal pages when that fails. To enforce the use of huge
+pages, you can
+set link linkend=guc-huge-tlb-pagesvarnamehuge_tlb_pages/varname/link
+to literalon/literal. Note that in this
+case productnamePostgreSQL/productname will fail to start if not
+enough huge pages are available.
+   /para
+
+   para
+For a detailed description of the productnameLinux/productname huge
+pages feature have a look
+at ulink url=https://www.kernel.org/doc/Documentation/vm/hugetlbpage.txt/ulink.
+   /para
+
+  /sect2
  /sect1
 
 


pgp2cXAeJk4PE.pgp
Description: PGP signature


[HACKERS] pg_sleep_enhancements.patch

2014-01-29 Thread Pavel Stehule
Hello

I am looking on this patch

http://www.postgresql.org/message-id/525fe206.6000...@dalibo.com

a) pg_sleep_for - no objection - it is simple and secure

b) pg_sleep_until

I am not sure - maybe this implementation is too simple. I see two possible
risks where it may not work as users expect

a) what is the expected behavior when the time is changed - CET/CEST?

b) what is the expected behavior when the board clock is not accurate and is
periodically corrected (by NTP) - isn't it better to sleep only a few seconds
and recalculate the sleeping interval?
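
Just to illustrate option b), a rough standalone sketch of a recalculating
sleep (an illustration only, not a patch proposal):

#include <stdio.h>
#include <time.h>

/*
 * Sleep until "target", but in short chunks, re-reading the wall clock
 * each iteration so that NTP corrections are picked up along the way.
 */
static void
sleep_until(time_t target)
{
	for (;;)
	{
		time_t		now = time(NULL);
		struct timespec ts;

		if (now >= target)
			break;

		/* sleep at most one second before re-checking the clock */
		ts.tv_sec = (target - now > 1) ? 1 : (time_t) (target - now);
		ts.tv_nsec = 0;
		nanosleep(&ts, NULL);
	}
}

int
main(void)
{
	sleep_until(time(NULL) + 3);	/* returns roughly three seconds later */
	puts("done");
	return 0;
}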

Regards

Pavel


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Josh Berkus
On 01/29/2014 04:55 AM, Simon Riggs wrote:
 It is possible that adding this extra straw breaks the camel's back,
 but it seems unlikely to be that simple. A new feature to help
 performance problems will be of great use to many people; complaints
 about the performance of pg_stat_statements are unlikely to increase
 greatly from this change, since such changes would only affect those
 right on the threshold of impact. The needs of the many...
 
 Further changes to improve scalability of pg_stat_statements seem
 warranted in the future, but I see no reason to block the patch for
 that reason

If we're talking here about min, max and stddev, then +1.  The stats
we've seen so far seem to indicate that any effect on query response
time is within the margin of error for pgbench.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sleep_enhancements.patch

2014-01-29 Thread Vik Fearing
On 01/29/2014 08:04 PM, Pavel Stehule wrote:
 Hello

 I am looking on this patch

Thank you for looking at it.

 http://www.postgresql.org/message-id/525fe206.6000...@dalibo.com

 a) pg_sleep_for - no objection - it is simple and secure

Okay.

 b) pg_sleep_until

 I am not sure - maybe this implementation is too simple. I see two
 possible risks where it may not work as users expect

 a) what is the expected behavior when the time is changed - CET/CEST?

There is no risk there, the wake-up time is specified with a time zone.

 b) what is the expected behavior when the board clock is not accurate
 and is periodically corrected (by NTP) - isn't it better to sleep only a
 few seconds and recalculate the sleeping interval?

We could do that, but it seems like overkill.  It would mean writing a
new C function whereas this is just a simple helper for the existing
pg_sleep() function.  So my vote is to keep the patch as-is.

-- 
Vik



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 10:11, Jeff Janes wrote:
 I'm getting this warning now with gcc (GCC) 4.4.7:

Interesting. I don't get that warning. But the compiler is (formally)
right.

 pg_shmem.c: In function 'PGSharedMemoryCreate':
 pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
 function
 pg_shmem.c:332: note: 'allocsize' was declared here

Attached patch should fix that.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index f7596bf..c39dfb6 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -329,7 +329,7 @@ PGSharedMemoryIsInUse(unsigned long id1, unsigned long id2)
 static void *
 CreateAnonymousSegment(Size *size)
 {
-	Size		allocsize;
+	Size		allocsize = *size;
 	void	   *ptr = MAP_FAILED;
 
 #ifndef MAP_HUGETLB
@@ -358,7 +358,6 @@ CreateAnonymousSegment(Size *size)
 		 */
 		int			hugepagesize = 2 * 1024 * 1024;
 
-		allocsize = *size;
 		if (allocsize % hugepagesize != 0)
 			allocsize += hugepagesize - (allocsize % hugepagesize);
 
@@ -372,7 +371,6 @@ CreateAnonymousSegment(Size *size)
 	if (huge_tlb_pages == HUGE_TLB_OFF ||
 		(huge_tlb_pages == HUGE_TLB_TRY  ptr == MAP_FAILED))
 	{
-		allocsize = *size;
 		ptr = mmap(NULL, *size, PROT_READ | PROT_WRITE, PG_MMAP_FLAGS, -1, 0);
 	}
 


pgpzNJrOrDmZa.pgp
Description: PGP signature


Re: [HACKERS] Planning time in explain/explain analyze

2014-01-29 Thread Andreas Karlsson

On 01/28/2014 09:39 PM, Tom Lane wrote:

I'm for doing the measurement in ExplainOneQuery() and not printing
anything in code paths that don't go through there.


Reading the discussion here and realizing that my last patch was wrong I 
am now in agreement with this. I have attached a patch which no longer 
tries to measure planning time for prepared statements.


/Andreas

diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
new file mode 100644
index 2af1738..482490b
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** EXPLAIN SELECT * FROM tenk1;
*** 89,94 
--- 89,95 
   QUERY PLAN
  -
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
  /screen
 /para
  
*** EXPLAIN SELECT * FROM tenk1;
*** 162,167 
--- 163,174 
 /para
  
 para
+ The literalPlanning time/literal shown is the time it took to generate
+ the query plan from the parsed query and optimize it. It does not include
+ rewriting and parsing.
+/para
+ 
+para
  Returning to our example:
  
  screen
*** EXPLAIN SELECT * FROM tenk1;
*** 170,175 
--- 177,183 
   QUERY PLAN
  -
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
  /screen
 /para
  
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 198,203 
--- 206,212 
  
   Seq Scan on tenk1  (cost=0.00..483.00 rows=7001 width=244)
 Filter: (unique1 lt; 7000)
+  Planning time: 0.104 ms
  /screen
  
  Notice that the commandEXPLAIN/ output shows the literalWHERE/
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 234,239 
--- 243,249 
 Recheck Cond: (unique1 lt; 100)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.093 ms
  /screen
  
  Here the planner has decided to use a two-step plan: the child plan
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 262,267 
--- 272,278 
 Filter: (stringu1 = 'xxx'::name)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.089 ms
  /screen
  
  The added condition literalstringu1 = 'xxx'/literal reduces the
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 283,288 
--- 294,300 
  -
   Index Scan using tenk1_unique1 on tenk1  (cost=0.29..8.30 rows=1 width=244)
 Index Cond: (unique1 = 42)
+  Planning time: 0.076 ms
  /screen
  
  In this type of plan the table rows are fetched in index order, which
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 311,316 
--- 323,329 
 Index Cond: (unique1 lt; 100)
   -gt;  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0)
 Index Cond: (unique2 gt; 9000)
+  Planning time: 0.094 ms
  /screen
  
  But this requires visiting both indexes, so it's not necessarily a win
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 331,336 
--- 344,350 
 -gt;  Index Scan using tenk1_unique2 on tenk1  (cost=0.29..71.27 rows=10 width=244)
   Index Cond: (unique2 gt; 9000)
   Filter: (unique1 lt; 100)
+  Planning time: 0.087 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 10 AND t1.unique2
*** 364,369 
--- 378,384 
 Index Cond: (unique1 lt; 10)
 -gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244)
   Index Cond: (unique2 = t1.unique2)
+  Planning time: 0.117 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 10 AND t2.unique2
*** 415,420 
--- 430,436 
 -gt;  Materialize  (cost=0.29..8.51 rows=10 width=244)
   -gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
 Index Cond: (unique2 lt; 10)
+  Planning time: 0.119 ms
  /screen
  
  The condition literalt1.hundred lt; t2.hundred/literal can't be
*** WHERE t1.unique1 lt; 100 AND t1.unique2
*** 462,467 
--- 478,484 
 Recheck Cond: (unique1 lt; 100)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.182 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 100 AND t1.unique2
*** 492,497 
--- 509,515 
 -gt;  Sort  (cost=197.83..200.33 rows=1000 width=244)
   Sort Key: t2.unique2
   -gt;  Seq Scan on onek t2  (cost=0.00..148.00 rows=1000 

Re: [HACKERS] pg_sleep_enhancements.patch

2014-01-29 Thread Pavel Stehule
2014-01-29 Vik Fearing vik.fear...@dalibo.com

 On 01/29/2014 08:04 PM, Pavel Stehule wrote:
  Hello
 
  I am looking on this patch

 Thank you for looking at it.

  http://www.postgresql.org/message-id/525fe206.6000...@dalibo.com
 
  a) pg_sleep_for - no objection - it is simple and secure

 Okay.

  b) pg_sleep_until
 
  I am not sure - maybe this implementation is too simple. I see two
  possible risks where it may not work as users expect
 
  a) what is the expected behavior when the time is changed - CET/CEST?

 There is no risk there, the wake-up time is specified with a time zone.

  b) what is the expected behavior when the board clock is not accurate
  and is periodically corrected (by NTP) - isn't it better to sleep only
  a few seconds and recalculate the sleeping interval?

 We could do that, but it seems like overkill.  It would mean writing a
 new C function whereas this is just a simple helper for the existing
 pg_sleep() function.  So my vote is to keep the patch as-is.


Ok

second question - isn't this functionality too dangerous? If somebody uses
it as a scheduler, then

a) it can hold a connection, session data, and locks for too long
b) it can stop on a query timeout probably much earlier than the user expects

What is the expected use case?

Regards

Pavel




 --
 Vik




Re: [HACKERS] proposal: hide application_name from other users

2014-01-29 Thread Josh Berkus
On 01/29/2014 10:19 AM, Simon Riggs wrote:
 No specific reason that I can recall but replication is heavily
 protected by layers of security.
 
 pg_stat_replication is a join with pg_stat_activity, so some of the
 info is open, some closed. It seems possible to relax that.

I'm all for the idea of restrict, then open up.  That is, it made
sense to start with data restricted, but then unrestrict it as we know
it's OK.  Going the other way generally isn't possible, as this patch
demonstrates.

 Presumably the current patch is returned with feedback? Or can we fix
 these problems by inventing a new user aspect called MONITOR (similar
 to REPLICATION)? We can grant application_name and replication details
 to that.

Yeah, except I don't see doing the MONITOR thing for 9.4.  We'd need a
spec for it first.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sleep_enhancements.patch

2014-01-29 Thread Vik Fearing
On 01/29/2014 08:21 PM, Pavel Stehule wrote:
 second question - isn't this functionality too dangerous? If somebody
 uses it as a scheduler, then

 a) it can hold a connection, session data, and locks for too long
 b) it can stop on a query timeout probably much earlier than the user expects

 What is the expected use case?

It is no more dangerous than plain pg_sleep().  The use case is
convenience and clarity of code.

I don't think people will be using it as a scheduler any more than they
do with pg_sleep() because it can't cross transaction boundaries.

-- 
Vik



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Heikki Linnakangas

On 01/29/2014 09:18 PM, Christian Kruse wrote:

Hi,

On 29/01/14 10:11, Jeff Janes wrote:

I'm getting this warning now with gcc (GCC) 4.4.7:


Interesting. I don't get that warning. But the compiler is (formally)
right.


pg_shmem.c: In function 'PGSharedMemoryCreate':
pg_shmem.c:332: warning: 'allocsize' may be used uninitialized in this
function
pg_shmem.c:332: note: 'allocsize' was declared here


Hmm, I didn't get that warning either.


Attached patch should fix that.


That's not quite right. If the first mmap() fails, allocsize is set to 
the rounded-up size, but the second mmap() uses the original size for 
the allocation. So it returns a too high value to the caller.


Ugh, it's actually broken anyway :-(. The first allocation also passes 
*size to mmap(), so the calculated rounded-up allocsize value is not 
used for anything.


Fix pushed.

- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_sleep_enhancements.patch

2014-01-29 Thread Pavel Stehule
2014-01-29 Vik Fearing vik.fear...@dalibo.com

 On 01/29/2014 08:21 PM, Pavel Stehule wrote:
  second question - isn't this functionality too dangerous? If somebody
  uses it as a scheduler, then
 
  a) it can hold a connection, session data, and locks for too long
  b) it can stop on a query timeout probably much earlier than the user expects
 
  What is the expected use case?

 It is no more dangerous than plain pg_sleep().  The use case is
 convenience and clarity of code.

 I don't think people will be using it as a scheduler any more than they
 do with pg_sleep() because it can't cross transaction boundaries.


I am sure that experienced users won't use it that way. But beginners can be
confused due to its similarity with schedulers.

I suggest a) documenting these risks b) getting the opinion of other hackers

When it is used as a single statement in a transaction, it can probably only
hold back vacuum

Pavel



 --
 Vik




Re: [HACKERS] Add force option to dropdb

2014-01-29 Thread Robert Haas
On Wed, Jan 29, 2014 at 4:56 AM, salah jubeh s_ju...@yahoo.com wrote:
I'm not particularly in favor of implementing this as client-side
functionality, because then it's only available to people who use that
particular client.  Simon Riggs proposed a server-side option to the
DROP DATABASE command some time ago, and I think that might be the way
to go.

 Could you please direct me -if possible- to the thread. I think implementing
 it on the client side gives the user some visibility and control.
 Initially, I wanted to force-drop the database, then I changed it to
 killing connections. I think the change in the client tool is simple and
 covers the main reason for not being able to drop a database. I think
 killing client connections is one of the FAQs.

 OTOH, having an option like DROP DATABASE [IF EXISTS, FORCE] database is
 crisper. However, what does force mean?  Many options exist, such as
 killing the connections gently, waiting for connections to terminate, or
 killing connections immediately. Also, as Alvaro Herrera mentioned, DROP
 OWNED BY and/or REASSIGN OWNED BY might hinder the force option -an
 example here would be nice-. So, for quick wins, I prefer the kill option on
 the client side; but for a mature solution, the back-end is certainly the
 way to proceed.

http://www.postgresql.org/message-id/1296552979.1779.8622.camel@ebony

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing xloginsert_slots?

2014-01-29 Thread Robert Haas
On Wed, Jan 29, 2014 at 8:22 AM, Andres Freund and...@2ndquadrant.com wrote:
 Hi,

 On 2014-01-29 21:59:05 +0900, Michael Paquier wrote:
 The undocumented GUC called xloginsert_slots has been introduced by
 commit 9a20a9b. It is mentioned by the commit that this parameter
 should be removed before the release. Wouldn't it be a good time to
 remove this parameter soon? I imagine that removing it before the beta
 would make sense so now is perhaps too early... Either way, attached
 is a patch doing so...

 I'd rather wait till somebody actually has done some benchmarks. I don't
 think we're more clueful about it now than back when the patch went
 in. And such benchmarking is more likely during beta, so...

Well, it's either got to go away, or get documented, IMHO.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Robert Haas
On Wed, Jan 29, 2014 at 9:06 AM, Andrew Dunstan and...@dunslane.net wrote:
 I am not opposed in principle to adding new things to the counters
 struct in pg_stat_statements. I just think that the fact that the
 overhead of installing the module on a busy production system is
 currently so low is of *major* value, and therefore any person that
 proposes to expand that struct should be required to very conclusively
 demonstrate that there is no appreciably increase in overhead. Having
 a standard deviation column would be nice, but it's still not that
 important. Maybe when we have portable atomic addition we can afford
 to be much more inclusive of that kind of thing.


 Importance is in the eye of the beholder.  As far as I'm concerned, min and
 max are of FAR less value than stddev. If stddev gets left out I'm going to
 be pretty darned annoyed, especially since the benchmarks seem to show the
 marginal cost as being virtually unmeasurable. If those aren't enough for
 you, perhaps you'd like to state what sort of tests would satisfy you.

I agree.  I find it somewhat unlikely that pg_stat_statements is
fragile enough that these few extra counters are going to make much of
a difference.  At the same time, I find min and max a dubious value
proposition.  It seems highly likely to me that stddev will pay its
way, but I'm much less certain about the others.
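
For what it's worth, stddev can be tracked without keeping individual
samples: a running mean plus a sum of squared deviations per entry is enough
(Welford's online method).  A rough sketch with made-up field names, not the
actual pg_stat_statements counters struct:

#include <math.h>
#include <stdio.h>

typedef struct DemoCounters
{
	long		calls;
	double		mean_time;
	double		sum_var_time;	/* sum of squared deviations from the mean */
} DemoCounters;

/* update the running mean and squared-deviation sum for one new sample */
static void
record_time(DemoCounters *c, double elapsed_ms)
{
	double		old_mean = c->mean_time;

	c->calls++;
	c->mean_time += (elapsed_ms - old_mean) / c->calls;
	c->sum_var_time += (elapsed_ms - old_mean) * (elapsed_ms - c->mean_time);
}

static double
stddev_time(const DemoCounters *c)
{
	if (c->calls < 2)
		return 0.0;
	return sqrt(c->sum_var_time / c->calls);	/* population stddev */
}

int
main(void)
{
	DemoCounters c = {0, 0.0, 0.0};

	record_time(&c, 1.0);
	record_time(&c, 2.0);
	record_time(&c, 4.0);
	printf("stddev = %.3f ms\n", stddev_time(&c));	/* about 1.247 */
	return 0;
}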

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ALTER TABLE lock strength reduction patch is unsafe

2014-01-29 Thread Simon Riggs
On 29 January 2014 05:43, Tom Lane t...@sss.pgh.pa.us wrote:
 Robert Haas robertmh...@gmail.com writes:
 On Tue, Jan 28, 2014 at 12:36 PM, Alvaro Herrera
 alvhe...@2ndquadrant.com wrote:
 Honestly, I would prefer that we push a patch that has been thoroughly
 reviewed and in which we have more confidence, so that we can push
 without a GUC.  This means mark in CF as needs-review, then some other
 developer reviews it and marks it as ready-for-committer.

 I also believe that would be the correct procedure.  Personally, I
 think it would be great if Noah can review this, as he's very good at
 finding the kind of problems that are likely to crop up here, and has
 examined previous versions.  I also have some interest in looking at
 it myself.  But I doubt that either of us (or any other senior hacker)
 can do that by Thursday.  I think all such people are hip-deep in
 other patches at the moment; I certainly am.

 Yeah.  I actually have little doubt that the patch is sane on its own
 terms of discussion, that is that Simon has chosen locking levels that
 are mutually consistent in an abstract sense.  What sank the previous
 iteration was that he'd failed to consider an implementation detail,
 namely possible inconsistencies in SnapshotNow-based catalog fetches.
 I'm afraid that there may be more such problems lurking under the
 surface.

Agreed

 Noah's pretty good at finding such things, but really two
 or three of us need to sit down and think about it for awhile before
 I'd have much confidence in such a fundamental change.  And I sure don't
 have time for this patch right now myself.

I've reviewed Noah's earlier comments, fixed things and also further
reviewed for any similar relcache related issues.

I've also reviewed Hot Standby to see if any locking issues arise,
since the ALTER TABLE now won't generate an AccessExclusiveLock as it
used to do for certain operations. I can't see any problems there.

While doing those reviews, I'd remind everybody that this only affects
roughly half of ALTER TABLE subcommands and definitely nothing that
affects SELECTs. So the risk profile is much less than it sounds at
first glance.

If anybody else has any hints or clues about where to look, please
mention them and I will investigate. Thanks.

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 21:36, Heikki Linnakangas wrote:
 […]
 Fix pushed.

You are right. Thanks. But there is another bug, see

20140128154307.gc24...@defunct.ch

ff. Attached you will find a patch fixing that.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index ac3a9fe..cf590a0 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -380,9 +380,12 @@ CreateAnonymousSegment(Size *size)
 	}
 
 	if (ptr == MAP_FAILED)
+	{
+		int			saved_errno = errno;
+
 		ereport(FATAL,
 (errmsg(could not map anonymous shared memory: %m),
- (errno == ENOMEM) ?
+ (saved_errno == ENOMEM) ?
  errhint(This error usually means that PostgreSQL's request 
 	for a shared memory segment exceeded available memory, 
 	  swap space or huge pages. To reduce the request size 
@@ -390,6 +393,7 @@ CreateAnonymousSegment(Size *size)
 	   memory usage, perhaps by reducing shared_buffers or 
 		 max_connections.,
 		 *size) : 0));
+	}
 
 	*size = allocsize;
 	return ptr;


pgphzz7yu9Gp0.pgp
Description: PGP signature


Re: [HACKERS] Planning time in explain/explain analyze

2014-01-29 Thread Robert Haas
On Wed, Jan 29, 2014 at 2:21 PM, Andreas Karlsson andr...@proxel.se wrote:
 On 01/28/2014 09:39 PM, Tom Lane wrote:

 I'm for doing the measurement in ExplainOneQuery() and not printing
 anything in code paths that don't go through there.

 Reading the discussion here and realizing that my last patch was wrong I am
 now in agreement with this. I have attached a patch which no longer tries to
 measure planning time for prepared statements.

Cool.  I propose adding one parameter rather than two to
ExplainOnePlan() and making it of type instr_time * rather than
passing an instr_time and a bool.  If you don't want to include the
planning time, pass NULL; if you do, pass a pointer to the instr_time
you want to print.  I think that would come out slightly neater than
what you have here.
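
To spell out the suggested calling convention, a standalone toy illustration
follows; the names and types are simplified and do not match the real
ExplainOnePlan() signature:

#include <stdio.h>
#include <sys/time.h>

typedef struct timeval instr_time_demo;

/*
 * One optional pointer argument instead of an instr_time plus a bool;
 * NULL means "no planning time to print".  Names are made up for the
 * example only.
 */
static void
explain_one_plan_demo(const char *plan, const instr_time_demo *planduration)
{
	printf("%s\n", plan);

	if (planduration != NULL)
	{
		double		ms = planduration->tv_sec * 1000.0 +
						 planduration->tv_usec / 1000.0;

		printf("Planning time: %.3f ms\n", ms);
	}
}

int
main(void)
{
	instr_time_demo duration = {0, 113};

	explain_one_plan_demo("Seq Scan on tenk1", &duration);	/* with planning time */
	explain_one_plan_demo("Seq Scan on tenk1", NULL);	/* without */
	return 0;
}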

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Josh Berkus
On 01/29/2014 11:54 AM, Robert Haas wrote:
 I agree.  I find it somewhat unlikely that pg_stat_statements is
 fragile enough that these few extra counters are going to make much of
 a difference.  At the same time, I find min and max a dubious value
 proposition.  It seems highly likely to me that stddev will pay its
 way, but I'm much less certain about the others.

What I really want is percentiles, but I'm pretty sure we already shot
that down. ;-)

I could use min/max -- think of performance test runs.  However, I agree
that they're less valuable than stddev.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: show xid and xmin in pg_stat_activity and pg_stat_replication

2014-01-29 Thread Simon Riggs
On 14 January 2014 08:38, Christian Kruse christ...@2ndquadrant.com wrote:
 Hi,

 On 13/01/14 20:06, Heikki Linnakangas wrote:
 On 12/17/2013 04:58 PM, Christian Kruse wrote:
 attached you will find a patch for showing the current transaction id
 (xid) and the xmin of a backend in pg_stat_activty and the xmin in
 pg_stat_replication.

 Docs.

 Thanks, update with updated docs is attached.

Looks simple enough and useful for working out which people are
holding up CONCURRENT activities.

I've not been involved with this patch, so any objections to me doing
final review and commit?

-- 
 Simon Riggs   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2014-01-29 Thread Andreas Karlsson

On 01/29/2014 09:01 PM, Robert Haas wrote:

Cool.  I propose adding one parameter rather than two to
ExplainOnePlan() and making it of type instr_time * rather than
passing an instr_time and a bool.  If you don't want to include the
planning time, pass NULL; if you do, pass a pointer to the instr_time
you want to print.  I think that would come out slightly neater than
what you have here.


Excellent suggestion, thanks! New patch attached.

/Andreas
diff --git a/doc/src/sgml/perform.sgml b/doc/src/sgml/perform.sgml
new file mode 100644
index 2af1738..482490b
*** a/doc/src/sgml/perform.sgml
--- b/doc/src/sgml/perform.sgml
*** EXPLAIN SELECT * FROM tenk1;
*** 89,94 
--- 89,95 
   QUERY PLAN
  -
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
  /screen
 /para
  
*** EXPLAIN SELECT * FROM tenk1;
*** 162,167 
--- 163,174 
 /para
  
 para
+ The literalPlanning time/literal shown is the time it took to generate
+ the query plan from the parsed query and optimize it. It does not include
+ rewriting and parsing.
+/para
+ 
+para
  Returning to our example:
  
  screen
*** EXPLAIN SELECT * FROM tenk1;
*** 170,175 
--- 177,183 
   QUERY PLAN
  -
   Seq Scan on tenk1  (cost=0.00..458.00 rows=1 width=244)
+  Planning time: 0.113 ms
  /screen
 /para
  
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 198,203 
--- 206,212 
  
   Seq Scan on tenk1  (cost=0.00..483.00 rows=7001 width=244)
 Filter: (unique1 lt; 7000)
+  Planning time: 0.104 ms
  /screen
  
  Notice that the commandEXPLAIN/ output shows the literalWHERE/
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 234,239 
--- 243,249 
 Recheck Cond: (unique1 lt; 100)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.093 ms
  /screen
  
  Here the planner has decided to use a two-step plan: the child plan
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 262,267 
--- 272,278 
 Filter: (stringu1 = 'xxx'::name)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.089 ms
  /screen
  
  The added condition literalstringu1 = 'xxx'/literal reduces the
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 283,288 
--- 294,300 
  -
   Index Scan using tenk1_unique1 on tenk1  (cost=0.29..8.30 rows=1 width=244)
 Index Cond: (unique1 = 42)
+  Planning time: 0.076 ms
  /screen
  
  In this type of plan the table rows are fetched in index order, which
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 311,316 
--- 323,329 
 Index Cond: (unique1 lt; 100)
   -gt;  Bitmap Index Scan on tenk1_unique2  (cost=0.00..19.78 rows=999 width=0)
 Index Cond: (unique2 gt; 9000)
+  Planning time: 0.094 ms
  /screen
  
  But this requires visiting both indexes, so it's not necessarily a win
*** EXPLAIN SELECT * FROM tenk1 WHERE unique
*** 331,336 
--- 344,350 
 -gt;  Index Scan using tenk1_unique2 on tenk1  (cost=0.29..71.27 rows=10 width=244)
   Index Cond: (unique2 gt; 9000)
   Filter: (unique1 lt; 100)
+  Planning time: 0.087 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 10 AND t1.unique2
*** 364,369 
--- 378,384 
 Index Cond: (unique1 lt; 10)
 -gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..7.91 rows=1 width=244)
   Index Cond: (unique2 = t1.unique2)
+  Planning time: 0.117 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 10 AND t2.unique2
*** 415,420 
--- 430,436 
 -gt;  Materialize  (cost=0.29..8.51 rows=10 width=244)
   -gt;  Index Scan using tenk2_unique2 on tenk2 t2  (cost=0.29..8.46 rows=10 width=244)
 Index Cond: (unique2 lt; 10)
+  Planning time: 0.119 ms
  /screen
  
  The condition literalt1.hundred lt; t2.hundred/literal can't be
*** WHERE t1.unique1 lt; 100 AND t1.unique2
*** 462,467 
--- 478,484 
 Recheck Cond: (unique1 lt; 100)
 -gt;  Bitmap Index Scan on tenk1_unique1  (cost=0.00..5.04 rows=101 width=0)
   Index Cond: (unique1 lt; 100)
+  Planning time: 0.182 ms
  /screen
 /para
  
*** WHERE t1.unique1 lt; 100 AND t1.unique2
*** 492,497 
--- 509,515 
 -gt;  Sort  (cost=197.83..200.33 rows=1000 width=244)
   Sort Key: t2.unique2

Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Heikki Linnakangas

On 01/29/2014 09:59 PM, Christian Kruse wrote:

Hi,

On 29/01/14 21:36, Heikki Linnakangas wrote:

[…]
Fix pushed.


You are right. Thanks. But there is another bug, see

20140128154307.gc24...@defunct.ch

ff. Attached you will find a patch fixing that.


Thanks. There are more cases of that in InternalIpcMemoryCreate, they 
ought to be fixed as well. We should also grep the rest of the codebase 
for more instances of that. And this needs to be back-patched.


- Heikki


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Use MAP_HUGETLB where supported (v3)

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 22:17, Heikki Linnakangas wrote:
 Thanks. There are more cases of that in InternalIpcMemoryCreate, they ought
 to be fixed as well. And should also grep the rest of the codebase for more
 instances of that. And this needs to be back-patched.

I'm way ahead of you ;-) Working on it.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services



pgpCS2lzolGJg.pgp
Description: PGP signature


[HACKERS] review: psql command copy count tag

2014-01-29 Thread Pavel Stehule
related to
http://www.postgresql.org/message-id/bf2827dcce55594c8d7a8f7ffd3ab7713ddb1...@szxeml508-mbx.china.huawei.com

Hello

1. I had to rebase this patch - an updated version is attached - I merged
the two patches into one

2. The psql code compiles without issues after patching

3. All regression tests pass without errors

4. We like this feature - it shows interesting info without any slowdown -
the psql copy command is more consistent with the server-side COPY statement
from the psql perspective.

This patch is ready for commit

Regards

Pavel
*** ./src/bin/psql/common.c.orig	2014-01-29 21:09:07.108862537 +0100
--- ./src/bin/psql/common.c	2014-01-29 21:09:42.810920907 +0100
***
*** 631,638 
   * When the command string contained no affected COPY command, this function
   * degenerates to an AcceptResult() call.
   *
!  * Changes its argument to point to the last PGresult of the command string,
!  * or NULL if that result was for a COPY FROM STDIN or COPY TO STDOUT.
   *
   * Returns true on complete success, false otherwise.  Possible failure modes
   * include purely client-side problems; check the transaction status for the
--- 631,637 
   * When the command string contained no affected COPY command, this function
   * degenerates to an AcceptResult() call.
   *
!  * Changes its argument to point to the last PGresult of the command string.
   *
   * Returns true on complete success, false otherwise.  Possible failure modes
   * include purely client-side problems; check the transaction status for the
***
*** 689,709 
  			 * connection out of its COPY state, then call PQresultStatus()
  			 * once and report any error.
  			 */
  			SetCancelConn();
  			if (result_status == PGRES_COPY_OUT)
! success = handleCopyOut(pset.db, pset.queryFout)  success;
  			else
! success = handleCopyIn(pset.db, pset.cur_cmd_source,
! 	   PQbinaryTuples(*results))  success;
  			ResetCancelConn();
  
  			/*
  			 * Call PQgetResult() once more.  In the typical case of a
  			 * single-command string, it will return NULL.	Otherwise, we'll
  			 * have other results to process that may include other COPYs.
  			 */
! 			PQclear(*results);
! 			*results = next_result = PQgetResult(pset.db);
  		}
  		else if (first_cycle)
  			/* fast path: no COPY commands; PQexec visited all results */
--- 688,723 
  			 * connection out of its COPY state, then call PQresultStatus()
  			 * once and report any error.
  			 */
+ 
+ 			/*
+ 			 * If this is the second copy, then it will definitely not be \copy,
+ 			 * and also it can not be from any user given file. So pass the
+ 			 * value of copystream as NULL, so that read/write happens on
+ 			 * already set cur_cmd_source/queryFout.
+ 			 */
  			SetCancelConn();
  			if (result_status == PGRES_COPY_OUT)
! success = handleCopyOut(pset.db,
!  (first_cycle ? pset.copyStream : NULL),
! 			 results)  success;
  			else
! success = handleCopyIn(pset.db,
!  (first_cycle ? pset.copyStream : NULL),
! 	   PQbinaryTuples(*results),
! 			 results)  success;
  			ResetCancelConn();
  
  			/*
  			 * Call PQgetResult() once more.  In the typical case of a
  			 * single-command string, it will return NULL.	Otherwise, we'll
  			 * have other results to process that may include other COPYs.
+ 			 * It keeps last result.
  			 */
! 			if ((next_result = PQgetResult(pset.db)))
! 			{
! PQclear(*results);
! *results = next_result;
! 			}
  		}
  		else if (first_cycle)
  			/* fast path: no COPY commands; PQexec visited all results */
*** ./src/bin/psql/copy.c.orig	2014-01-29 21:09:15.131875660 +0100
--- ./src/bin/psql/copy.c	2014-01-29 21:09:42.811920909 +0100
***
*** 269,276 
  {
  	PQExpBufferData query;
  	FILE	   *copystream;
- 	FILE	   *save_file;
- 	FILE	  **override_file;
  	struct copy_options *options;
  	bool		success;
  	struct stat st;
--- 269,274 
***
*** 287,294 
  
  	if (options-from)
  	{
- 		override_file = pset.cur_cmd_source;
- 
  		if (options-file)
  		{
  			if (options-program)
--- 285,290 
***
*** 308,315 
  	}
  	else
  	{
- 		override_file = pset.queryFout;
- 
  		if (options-file)
  		{
  			if (options-program)
--- 304,309 
***
*** 369,378 
  		appendPQExpBufferStr(query, options-after_tofrom);
  
  	/* Run it like a user command, interposing the data source or sink. */
! 	save_file = *override_file;
! 	*override_file = copystream;
  	success = SendQuery(query.data);
! 	*override_file = save_file;
  	termPQExpBuffer(query);
  
  	if (options-file != NULL)
--- 363,371 
  		appendPQExpBufferStr(query, options-after_tofrom);
  
  	/* Run it like a user command, interposing the data source or sink. */
! 	pset.copyStream = copystream;
  	success = SendQuery(query.data);
! 	pset.copyStream = NULL;
  	termPQExpBuffer(query);
  
  	if (options-file != NULL)
***
*** 433,444 
   * 

Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 13:39, Tom Lane wrote:
 No, what I meant is that the ereport caller needs to save errno, rather
 than assuming that (some subset of) ereport-related subroutines will
 preserve it.
 […]

Your reasoning sounds quite logical to me. Thus I did a

grep -RA 3 ereport src/* | less

and looked for ereport calls with errno in it. I found quite a few,
attached you will find a patch addressing that issue.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index d73e5e8..3705d0b 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -782,10 +782,14 @@ remove_symlink:
 	else
 	{
 		if (unlink(linkloc)  0)
-			ereport(redo ? LOG : (errno == ENOENT ? WARNING : ERROR),
+		{
+			int saved_errno = errno;
+
+			ereport(redo ? LOG : (saved_errno == ENOENT ? WARNING : ERROR),
 	(errcode_for_file_access(),
 	 errmsg(could not remove symbolic link \%s\: %m,
 			linkloc)));
+		}
 	}
 
 	pfree(linkloc_with_version_dir);
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index b4825d2..c79c8ad 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -85,7 +85,8 @@ static void ReleaseSemaphores(int status, Datum arg);
 static IpcSemaphoreId
 InternalIpcSemaphoreCreate(IpcSemaphoreKey semKey, int numSems)
 {
-	int			semId;
+	int			semId,
+saved_errno;
 
 	semId = semget(semKey, numSems, IPC_CREAT | IPC_EXCL | IPCProtection);
 
@@ -107,12 +108,13 @@ InternalIpcSemaphoreCreate(IpcSemaphoreKey semKey, int numSems)
 		/*
 		 * Else complain and abort
 		 */
+		saved_errno = errno;
 		ereport(FATAL,
 (errmsg(could not create semaphores: %m),
  errdetail(Failed system call was semget(%lu, %d, 0%o).,
 		   (unsigned long) semKey, numSems,
 		   IPC_CREAT | IPC_EXCL | IPCProtection),
- (errno == ENOSPC) ?
+ (saved_errno == ENOSPC) ?
  errhint(This error does *not* mean that you have run out of disk space.  
 		  It occurs when either the system limit for the maximum number of 
 			 semaphore sets (SEMMNI), or the system wide maximum number of 
@@ -133,13 +135,14 @@ static void
 IpcSemaphoreInitialize(IpcSemaphoreId semId, int semNum, int value)
 {
 	union semun semun;
+	int			saved_errno = errno;
 
 	semun.val = value;
 	if (semctl(semId, semNum, SETVAL, semun)  0)
 		ereport(FATAL,
 (errmsg_internal(semctl(%d, %d, SETVAL, %d) failed: %m,
  semId, semNum, value),
- (errno == ERANGE) ?
+ (saved_errno == ERANGE) ?
  errhint(You possibly need to raise your kernel's SEMVMX value to be at least 
   %d.  Look into the PostgreSQL documentation for details.,
 		 value) : 0));
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index ac3a9fe..cb297bb 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -68,6 +68,7 @@ InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size)
 {
 	IpcMemoryId shmid;
 	void	   *memAddress;
+	int			saved_errno = 0;
 
 	shmid = shmget(memKey, size, IPC_CREAT | IPC_EXCL | IPCProtection);
 
@@ -137,25 +138,26 @@ InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size)
 		 * it should be.  SHMMNI violation is ENOSPC, per spec.  Just plain
 		 * not-enough-RAM is ENOMEM.
 		 */
+		saved_errno = errno;
 		ereport(FATAL,
 (errmsg(could not create shared memory segment: %m),
 		  errdetail(Failed system call was shmget(key=%lu, size=%zu, 0%o).,
 	(unsigned long) memKey, size,
 	IPC_CREAT | IPC_EXCL | IPCProtection),
- (errno == EINVAL) ?
+ (saved_errno == EINVAL) ?
  errhint(This error usually means that PostgreSQL's request for a shared memory 
 		 segment exceeded your kernel's SHMMAX parameter, or possibly that 
 		 it is less than 
 		 your kernel's SHMMIN parameter.\n
 		The PostgreSQL documentation contains more information about shared 
 		 memory configuration.) : 0,
- (errno == ENOMEM) ?
+ (saved_errno == ENOMEM) ?
  errhint(This error usually means that PostgreSQL's request for a shared 
 		 memory segment exceeded your kernel's SHMALL parameter.  You might need 
 		 to reconfigure the kernel with larger SHMALL.\n
 		The PostgreSQL documentation contains more information about shared 
 		 memory configuration.) : 0,
- (errno == ENOSPC) ?
+ (saved_errno == ENOSPC) ?
  errhint(This error does *not* mean that you have run out of disk space.  
 		 It occurs either if all available shared memory IDs have been taken, 
 		 in which case you need to raise the SHMMNI parameter in your kernel, 


pgpkJvJgiH340.pgp
Description: PGP signature


Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Christian Kruse
Hi,

On 29/01/14 21:37, Christian Kruse wrote:
 […]
 attached you will find a patch addressing that issue.

Maybe we should include the patch proposed in

20140129195930.gd31...@defunct.ch

and do this as one (slightly bigger) patch. Attached you will find
this alternative version.

Best regards,

-- 
 Christian Kruse   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training  Services

diff --git a/src/backend/commands/tablespace.c b/src/backend/commands/tablespace.c
index d73e5e8..3705d0b 100644
--- a/src/backend/commands/tablespace.c
+++ b/src/backend/commands/tablespace.c
@@ -782,10 +782,14 @@ remove_symlink:
 	else
 	{
 		if (unlink(linkloc)  0)
-			ereport(redo ? LOG : (errno == ENOENT ? WARNING : ERROR),
+		{
+			int saved_errno = errno;
+
+			ereport(redo ? LOG : (saved_errno == ENOENT ? WARNING : ERROR),
 	(errcode_for_file_access(),
 	 errmsg(could not remove symbolic link \%s\: %m,
 			linkloc)));
+		}
 	}
 
 	pfree(linkloc_with_version_dir);
diff --git a/src/backend/port/sysv_sema.c b/src/backend/port/sysv_sema.c
index b4825d2..c79c8ad 100644
--- a/src/backend/port/sysv_sema.c
+++ b/src/backend/port/sysv_sema.c
@@ -85,7 +85,8 @@ static void ReleaseSemaphores(int status, Datum arg);
 static IpcSemaphoreId
 InternalIpcSemaphoreCreate(IpcSemaphoreKey semKey, int numSems)
 {
-	int			semId;
+	int			semId,
+saved_errno;
 
 	semId = semget(semKey, numSems, IPC_CREAT | IPC_EXCL | IPCProtection);
 
@@ -107,12 +108,13 @@ InternalIpcSemaphoreCreate(IpcSemaphoreKey semKey, int numSems)
 		/*
 		 * Else complain and abort
 		 */
+		saved_errno = errno;
 		ereport(FATAL,
 (errmsg(could not create semaphores: %m),
  errdetail(Failed system call was semget(%lu, %d, 0%o).,
 		   (unsigned long) semKey, numSems,
 		   IPC_CREAT | IPC_EXCL | IPCProtection),
- (errno == ENOSPC) ?
+ (saved_errno == ENOSPC) ?
  errhint(This error does *not* mean that you have run out of disk space.  
 		  It occurs when either the system limit for the maximum number of 
 			 semaphore sets (SEMMNI), or the system wide maximum number of 
@@ -133,13 +135,14 @@ static void
 IpcSemaphoreInitialize(IpcSemaphoreId semId, int semNum, int value)
 {
 	union semun semun;
+	int			saved_errno = errno;
 
 	semun.val = value;
 	if (semctl(semId, semNum, SETVAL, semun)  0)
 		ereport(FATAL,
 (errmsg_internal(semctl(%d, %d, SETVAL, %d) failed: %m,
  semId, semNum, value),
- (errno == ERANGE) ?
+ (saved_errno == ERANGE) ?
  errhint(You possibly need to raise your kernel's SEMVMX value to be at least 
   %d.  Look into the PostgreSQL documentation for details.,
 		 value) : 0));
diff --git a/src/backend/port/sysv_shmem.c b/src/backend/port/sysv_shmem.c
index ac3a9fe..511be72 100644
--- a/src/backend/port/sysv_shmem.c
+++ b/src/backend/port/sysv_shmem.c
@@ -68,6 +68,7 @@ InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size)
 {
 	IpcMemoryId shmid;
 	void	   *memAddress;
+	int			saved_errno = 0;
 
 	shmid = shmget(memKey, size, IPC_CREAT | IPC_EXCL | IPCProtection);
 
@@ -137,25 +138,26 @@ InternalIpcMemoryCreate(IpcMemoryKey memKey, Size size)
 		 * it should be.  SHMMNI violation is ENOSPC, per spec.  Just plain
 		 * not-enough-RAM is ENOMEM.
 		 */
+		saved_errno = errno;
 		ereport(FATAL,
 (errmsg(could not create shared memory segment: %m),
 		  errdetail(Failed system call was shmget(key=%lu, size=%zu, 0%o).,
 	(unsigned long) memKey, size,
 	IPC_CREAT | IPC_EXCL | IPCProtection),
- (errno == EINVAL) ?
+ (saved_errno == EINVAL) ?
  errhint(This error usually means that PostgreSQL's request for a shared memory 
 		 segment exceeded your kernel's SHMMAX parameter, or possibly that 
 		 it is less than 
 		 your kernel's SHMMIN parameter.\n
 		The PostgreSQL documentation contains more information about shared 
 		 memory configuration.) : 0,
- (errno == ENOMEM) ?
+ (saved_errno == ENOMEM) ?
  errhint(This error usually means that PostgreSQL's request for a shared 
 		 memory segment exceeded your kernel's SHMALL parameter.  You might need 
 		 to reconfigure the kernel with larger SHMALL.\n
 		The PostgreSQL documentation contains more information about shared 
 		 memory configuration.) : 0,
- (errno == ENOSPC) ?
+ (saved_errno == ENOSPC) ?
  errhint(This error does *not* mean that you have run out of disk space.  
 		 It occurs either if all available shared memory IDs have been taken, 
 		 in which case you need to raise the SHMMNI parameter in your kernel, 
@@ -380,9 +382,12 @@ CreateAnonymousSegment(Size *size)
 	}
 
 	if (ptr == MAP_FAILED)
+	{
+		int			saved_errno = errno;
+
 		ereport(FATAL,
 (errmsg(could not map anonymous shared memory: %m),
- (errno == ENOMEM) ?
+ (saved_errno == ENOMEM) ?
  errhint(This error usually means that PostgreSQL's request 
 	for a shared memory segment exceeded available memory, 
 	  swap space or 

Re: [HACKERS] jsonb and nested hstore

2014-01-29 Thread Merlin Moncure
On Wed, Jan 29, 2014 at 12:03 PM, Andrew Dunstan and...@dunslane.net wrote:
 Only change in functionality is the addition of casts between jsonb and
 json.

 The other changes are the merge with the new json functions code, and
 rearrangement of the docs changes to make them less ugly. Essentially I
 moved the indexterm tags right out of the table as is done in some other
 parts pf the docs. That makes the entry tags much clearer to read.

I think the opening paragraphs contrasting json/jsonb need
refinement.  json is going to be slightly faster than jsonb for input
*and* output.  For example, in one application I store fairly large
json objects containing pre-compiled static polygon data that is
simply flipped up to google maps.  This case will likely be pessimal
for jsonb.  For the next paragraph, I'd like to expand it a bit on
'specialized needs' and boil it down to specific use cases.
Basically, json will likely be more compact in most cases and slightly
faster for input/output;  jsonb would be preferred in any context
where processing, or searching or extensive server side parsing is
employed.

If you agree, I'd be happy to do that...

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-01-29 Thread Josh Berkus
On 01/29/2014 12:46 PM, Merlin Moncure wrote:
 I think the opening paragraphs contrasting json/jsonb need
 refinement.  json is going to be slightly faster than jsonb for input
 *and* output.  For example, in one application I store fairly large
 json objects containing pre-compiled static polygon data that is
 simply flipped up to google maps.  This case will likely be pessimal
 for jsonb.  For the next paragraph, I'd like to expand it a bit on
 'specialized needs' and boil it down to specific use cases.
 Basically, json will likely be more compact in most cases and slightly
 faster for input/output;  jsonb would be preferred in any context
 where processing, or searching or extensive server side parsing is
 employed.
 
 If you agree, I'd be happy to do that...

Please take a stab at it, I'll be happy to revise it.

I was working on doing a two-column table comparison chart; I still
think that's the best way to go.

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Removing xloginsert_slots?

2014-01-29 Thread Andres Freund
On 29. Januar 2014 20:51:38 MEZ, Robert Haas robertmh...@gmail.com wrote:
On Wed, Jan 29, 2014 at 8:22 AM, Andres Freund and...@2ndquadrant.com
wrote:
 Hi,

 On 2014-01-29 21:59:05 +0900, Michael Paquier wrote:
 The undocumented GUC called xloginsert_slots has been introduced by
 commit 9a20a9b. It is mentioned by the commit that this parameter
 should be removed before the release. Wouldn't it be a good time to
 remove this parameter soon? I imagine that removing it before the
beta
 would make sense so now is perhaps too early... Either way, attached
 is a patch doing so...

 I'd rather wait till somebody actually has done some benchmarks. I
don't
 think we're more clueful about it now than back when the patch went
 in. And such benchmarking is more likely during beta, so...

Well, it's either got to go away, or get documented, IMHO.

Yes, all I am saying is that I'd like to wait till things have calmed down a 
bit, so it actually makes sense to run bigger benchmarks. I don't think 
removing the option is that urgent.

Andres


-- 
Please excuse brevity and formatting - I am writing this on my mobile phone.

Andres Freund  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2014-01-29 Thread Robert Haas
On Wed, Jan 29, 2014 at 3:13 PM, Andreas Karlsson andr...@proxel.se wrote:
 On 01/29/2014 09:01 PM, Robert Haas wrote:
 Cool.  I propose adding one parameter rather than two to
 ExplainOnePlan() and making it of type instr_time * rather than
 passing an instr_time and a bool.  If you don't want to include the
 planning time, pass NULL; if you do, pass a pointer to the instr_time
 you want to print.  I think that would come out slightly neater than
 what you have here.

 Excellent suggestion, thanks! New patch attached.

Committed with minor doc changes.
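
For readers following along, the suggested API shape amounts to
replacing a value-plus-flag pair with a single nullable pointer.  A
rough sketch with hypothetical names (not the committed ExplainOnePlan()
signature):

    #include <stdio.h>

    typedef double instr_time_ms;   /* stand-in for the real instr_time */

    /* Pass NULL to omit planning time, or a pointer to the value to print. */
    static void explain_one_plan(const char *query, const instr_time_ms *plantime)
    {
        printf("Query: %s\n", query);
        if (plantime != NULL)
            printf("Planning time: %.3f ms\n", *plantime);
    }

    int main(void)
    {
        instr_time_ms t = 0.427;

        explain_one_plan("SELECT 1", &t);     /* caller that wants it shown */
        explain_one_plan("SELECT 1", NULL);   /* caller that does not */
        return 0;
    }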

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Peter Geoghegan
On Wed, Jan 29, 2014 at 6:06 AM, Andrew Dunstan and...@dunslane.net wrote:
 Importance is in the eye of the beholder.  As far as I'm concerned, min and
 max are of FAR less value than stddev. If stddev gets left out I'm going to
 be pretty darned annoyed, especially since the benchmarks seem to show the
 marginal cost as being virtually unmeasurable. If those aren't enough for
 you, perhaps you'd like to state what sort of tests would satisfy you.

I certainly agree that stddev is far more valuable than min and max.
I'd be inclined to not accept the latter on the grounds that they
aren't that useful.

So, am I correct in my understanding: The benchmark performed here [1]
was of a standard pgbench SELECT workload, with no other factor
influencing the general overhead (unlike the later test that queried
pg_stat_statements as well)? I'd prefer to have seen the impact on a
32 core system, where spinlock contention would naturally be more
likely to appear, but even still I'm duly satisfied.

There was no testing of the performance impact prior to 6 days ago,
and I'm surprised it took Mitsumasa that long to come up with
something like this, since I raised this concern about 3 months ago.

[1] http://www.postgresql.org/message-id/52e10e6a.5060...@lab.ntt.co.jp
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Planning time in explain/explain analyze

2014-01-29 Thread Andreas Karlsson

On 01/29/2014 10:10 PM, Robert Haas wrote:

Committed with minor doc changes.


Thanks!

/Andreas




--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Andrew Dunstan


On 01/29/2014 04:14 PM, Peter Geoghegan wrote:

On Wed, Jan 29, 2014 at 6:06 AM, Andrew Dunstan and...@dunslane.net wrote:

Importance is in the eye of the beholder.  As far as I'm concerned, min and
max are of FAR less value than stddev. If stddev gets left out I'm going to
be pretty darned annoyed, especially since the benchmarks seem to show the
marginal cost as being virtually unmeasurable. If those aren't enough for
you, perhaps you'd like to state what sort of tests would satisfy you.

I certainly agree that stddev is far more valuable than min and max.
I'd be inclined to not accept the latter on the grounds that they
aren't that useful.

So, am I correct in my understanding: The benchmark performed here [1]
was of a standard pgbench SELECT workload, with no other factor
influencing the general overhead (unlike the later test that queried
pg_stat_statements as well)? I'd prefer to have seen the impact on a
32 core system, where spinlock contention would naturally be more
likely to appear, but even still I'm duly satisfied.


I could live with just stddev. Not sure others would be so happy.

Glad we're good on the performance impact front.

cheers

andrew



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] tests for client programs

2014-01-29 Thread Pavel Stehule
Hello

I am looking at this patch.

It is a great idea, and I am sure we want this patch - it has been requested
and proposed several times.

Some tips:

a) possibility to test only selected tests
b) possibility to verify generated file against expected file
c) detection some warnings (expected/unexpected)

P.S. Some tests fail when another Postgres instance is up. This should be checked at
startup, and a warning raised or testing stopped.

Regards

Pavel


ok 4 - pg_ctl initdb
waiting for server to startLOG:  could not bind IPv6 socket: Address
already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a
few seconds and retry.
LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a
few seconds and retry.
WARNING:  could not create listen socket for localhost
FATAL:  could not create any TCP/IP sockets
 stopped waiting
pg_ctl: could not start server
Examine the log output.
not ok 5 - pg_ctl start -w

#   Failed test 'pg_ctl start -w'
#   at
/home/pavel/src/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
line 89.
waiting for server to startLOG:  could not bind IPv6 socket: Address
already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a
few seconds and retry.
LOG:  could not bind IPv4 socket: Address already in use
HINT:  Is another postmaster already running on port 5432? If not, wait a
few seconds and retry.
WARNING:  could not create listen socket for localhost
FATAL:  could not create any TCP/IP sockets
 stopped waiting
pg_ctl: could not start server
Examine the log output.
not ok 6 - second pg_ctl start succeeds

#   Failed test 'second pg_ctl start succeeds'
#   at
/home/pavel/src/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
line 89.
pg_ctl: PID file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.pid
does not exist
Is server running?
not ok 7 - pg_ctl stop -w

#   Failed test 'pg_ctl stop -w'
#   at
/home/pavel/src/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
line 89.
ok 8 - second pg_ctl stop fails
pg_ctl: PID file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.pid
does not exist
Is server running?
starting server anyway
pg_ctl: could not read file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.opts
not ok 9 - pg_ctl restart with server not running

#   Failed test 'pg_ctl restart with server not running'
#   at
/home/pavel/src/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
line 89.
pg_ctl: PID file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.pid
does not exist
Is server running?
starting server anyway
pg_ctl: could not read file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.opts
not ok 10 - pg_ctl restart with server running

#   Failed test 'pg_ctl restart with server running'
#   at
/home/pavel/src/postgresql/src/bin/pg_ctl/../../../src/test/perl/TestLib.pm
line 89.
pg_ctl: PID file
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data/postmaster.pid
does not exist
Is server running?
Bailout called.  Further testing stopped:  system pg_ctl stop -D
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data -m fast failed: 256
Bail out!  system pg_ctl stop -D
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data -m fast failed: 256
FAILED--Further testing stopped: system pg_ctl stop -D
/home/pavel/src/postgresql/src/bin/pg_ctl/testf2KW/data -m fast failed: 256
make[1]: *** [check] Error 255
make[1]: Leaving directory `/home/pavel/src/postgresql/src/bin/pg_ctl'
make: *** [check-pg_ctl-recurse] Error 2
make: Leaving directory `/home/pavel/src/postgresql/src/bin'


Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Tom Lane
Christian Kruse christ...@2ndquadrant.com writes:
 Your reasoning sounds quite logical to me. Thus I did a
 grep -RA 3 ereport src/* | less
 and looked for ereport calls with errno in it. I found quite a few,
 attached you will find a patch addressing that issue.

Excellent, thanks for doing the legwork.  I'll take care of getting
this committed and back-patched.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] commit fest 2014-01 week 2 report

2014-01-29 Thread Peter Eisentraut
Last week:

Status Summary. Needs Review: 66, Waiting on Author: 16, Ready for
Committer: 9, Committed: 20, Returned with Feedback: 2. Total: 113.

This week:

Status Summary. Needs Review: 49, Waiting on Author: 18, Ready for
Committer: 9, Committed: 33, Returned with Feedback: 3, Rejected: 1.
Total: 113.

I wrote to all patch authors who didn't sign up as reviewers yet, and a
number of them signed up.  Several asked me for beginner-level patches
to look at, but it seems we're all out of beginner-level patches this time.

A lot of patches don't have any reviewer yet, for the same reason.

I reminded all reviewers to send in their first review.  In the past, we
would soon start unassigning inactive reviewers, but that seems a bit futile
here, for the reason above.

So while there is decent progress, you can do your own math about where
this is likely to end.  There are a lot of complex patches under
consideration, and we might have a painful end game.



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-01-29 Thread Merlin Moncure
On Wed, Jan 29, 2014 at 3:56 PM, Andrew Dunstan and...@dunslane.net wrote:

 On 01/29/2014 01:03 PM, Andrew Dunstan wrote:


 On 01/27/2014 10:43 PM, Andrew Dunstan wrote:


 On 01/26/2014 05:42 PM, Andrew Dunstan wrote:


 Here is the latest set of patches for nested hstore and jsonb.

 Because it's so large I've broken this into two patches and compressed
 them. The jsonb patch should work standalone. The nested hstore patch
 depends on it.

 All the jsonb functions now use the jsonb API - there is no more turning
 jsonb into text and reparsing it.

 At this stage I'm going to be starting cleanup on the jsonb code
 (indentation, error messages, comments etc.) as well as getting up some
 jsonb docs.





 Here is an update of the jsonb part of this. Changes:

  * there is now documentation for jsonb
  * most uses of elog() in json_funcs.c are replaced by ereport().
  * indentation fixes and other tidying.

 No changes in functionality.



 Further update of jsonb portion.

 Only change in functionality is the addition of casts between jsonb and
 json.

 The other changes are the merge with the new json functions code, and
 rearrangement of the docs changes to make them less ugly. Essentially I
 moved the indexterm tags right out of the table as is done in some other
 parts of the docs. That makes the entry tags much clearer to read.

 Updated to apply cleanly after recent commits.

ok, great.  This is really fabulous.  So far most everything feels
natural and good.

I see something odd in terms of the jsonb use case coverage.  One of
the major headaches with json deserialization presently is that
there's no easy way to easily move a complex (record- or array-
containing) json structure into a row object.  For example,

create table bar(a int, b int[]);
postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, false);
ERROR:  cannot populate with a nested object unless use_json_as_text is true

I find the use_json_as_text argument here to be pretty useless
(unlike in the json_build to_record variants where it at least provides
some hope for an escape hatch) for handling this, since it will just
continue to fail:

postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, true);
ERROR:  missing ] in array dimensions

OTOH, the nested hstore handles this no questions asked:

postgres=# select * from populate_record(null::bar, 'a=>1,
b=>{1,2}'::hstore);
 a |   b
---+---
 1 | {1,2}

So, if you need to convert a complex json to a row type, the only
effective way to do that is like this:
postgres=# select* from  populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::json::hstore);
 a |   b
---+---
 1 | {1,2}

Not a big deal really. But it makes me wonder (now that we have the
internal capability of properly mapping to a record) why *both* the
json/jsonb populate record variants shouldn't point to what the nested
hstore behavior is when the 'as_text' flag is false.  That would
demolish the error and remove the dependency on hstore in order to do
effective rowtype mapping.  In an ideal world the json_build
'to_record' variants would behave similarly I think although there's
no existing hstore analog so I'm assuming it's a non-trivial amount of
work.

Now, if we're agreed on that, I then also wonder if the 'as_text'
argument needs to exist at all for the populate functions except for
backwards compatibility on the json side (not jsonb).  For non-complex
structures it does best effort casting anyways so the flag is moot.

merlin


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] jsonb and nested hstore

2014-01-29 Thread Andrew Dunstan


On 01/29/2014 05:37 PM, Merlin Moncure wrote:

On Wed, Jan 29, 2014 at 3:56 PM, Andrew Dunstan and...@dunslane.net wrote:

On 01/29/2014 01:03 PM, Andrew Dunstan wrote:


On 01/27/2014 10:43 PM, Andrew Dunstan wrote:


On 01/26/2014 05:42 PM, Andrew Dunstan wrote:


Here is the latest set of patches for nested hstore and jsonb.

Because it's so large I've broken this into two patches and compressed
them. The jsonb patch should work standalone. The nested hstore patch
depends on it.

All the jsonb functions now use the jsonb API - there is no more turning
jsonb into text and reparsing it.

At this stage I'm going to be starting cleanup on the jsonb code
(indentation, error messages, comments etc.) as well as getting up some
jsonb docs.





Here is an update of the jsonb part of this. Changes:

  * there is now documentation for jsonb
  * most uses of elog() in json_funcs.c are replaced by ereport().
  * indentation fixes and other tidying.

No changes in functionality.



Further update of jsonb portion.

Only change in functionality is the addition of casts between jsonb and
json.

The other changes are the merge with the new json functions code, and
rearrangement of the docs changes to make them less ugly. Essentially I
moved the indexterm tags right out of the table as is done in some other
parts of the docs. That makes the entry tags much clearer to read.

Updated to apply cleanly after recent commits.

ok, great.  This is really fabulous.  So far most everything feels
natural and good.

I see something odd in terms of the jsonb use case coverage.  One of
the major headaches with json deserialization presently is that
there's no easy way to move a complex (record- or array-
containing) json structure into a row object.  For example,

create table bar(a int, b int[]);
postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, false);
ERROR:  cannot populate with a nested object unless use_json_as_text is true

I find the use_json_as_text argument here to be pretty useless
(unlike in the json_build to_record variants where it at least provides
some hope for an escape hatch) for handling this, since it will just
continue to fail:

postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::jsonb, true);
ERROR:  missing ] in array dimensions

OTOH, the nested hstore handles this no questions asked:

 postgres=# select * from populate_record(null::bar, 'a=>1,
 b=>{1,2}'::hstore);
  a |   b
---+---
  1 | {1,2}

So, if you need to convert a complex json to a row type, the only
effective way to do that is like this:
postgres=# select* from  populate_record(null::bar, '{"a": 1, "b":
[1,2]}'::json::hstore);
  a |   b
---+---
  1 | {1,2}

Not a big deal really. But it makes me wonder (now that we have the
internal capability of properly mapping to a record) why *both* the
json/jsonb populate record variants shouldn't point to what the nested
hstore behavior is when the 'as_text' flag is false.  That would
demolish the error and remove the dependency on hstore in order to do
effective rowtype mapping.  In an ideal world the json_build
'to_record' variants would behave similarly I think although there's
no existing hstore analog so I'm assuming it's a non-trivial amount of
work.

Now, if we're agreed on that, I then also wonder if the 'as_text'
argument needs to exist at all for the populate functions except for
backwards compatibility on the json side (not jsonb).  For non-complex
structures it does best effort casting anyways so the flag is moot.



Well, I could certainly look at making the populate_record{set} and 
to_record{set} logic handle types that are arrays or composites inside 
the record. It might not be terribly hard to do - not sure.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Changeset Extraction v7.3

2014-01-29 Thread Christian Convey
It seems to me that the terms physical, logical, and binary are
always relative to the perspective of the component being worked on.

Physical often means one level of abstraction below mine, and upon which
my work builds.  Logical means my work's level of abstraction.  And
Binary means data which I'm not going to pretend I know or care how to
interpret.

So I'd suggest block and tuple, perhaps.


On Wed, Jan 29, 2014 at 4:25 AM, Albe Laurenz laurenz.a...@wien.gv.at wrote:

 Andreas Karlsson wrote:
  On 01/28/2014 10:56 PM, Andres Freund wrote:
  On 2014-01-28 21:48:09 +, Thom Brown wrote:
  On 28 January 2014 21:37, Robert Haas robertmh...@gmail.com wrote:
  On Tue, Jan 28, 2014 at 11:53 AM, Robert Haas robertmh...@gmail.com
 wrote:
  The point of Andres's patch set is to introduce a new technology
  called logical decoding; that is, the ability to get a replication
  stream that is based on changes to tuples rather than changes to
  blocks.  It could also be called logical replication.  In these
  patches, our existing replication is referred to as physical
  replication, which sounds kind of funny to me.  Anyone have another
  suggestion?
 
  Logical and Binary replication?
 
  Unfortunately changeset extraction output can be binary data...
 
  I think physical and logical are fine and they seem to be well known
  terminology. Oracle uses those words and I have also seen many places
  use physical backup and logical backup, for example on Barman's
  homepage.

 +1

 I think it also fits well with the well-known terms physical [database]
 design and logical design.  Not that it is the same thing, but I
 believe that every database person, when faced with the distiction
 physical versus logical, will conclude that the former refers to
 data placement or block structure, while the latter refers to the
 semantics of the data being stored.

 Yours,
 Laurenz Albe

 --
 Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
 To make changes to your subscription:
 http://www.postgresql.org/mailpref/pgsql-hackers



Re: [HACKERS] jsonb and nested hstore

2014-01-29 Thread Josh Berkus
On 01/29/2014 02:37 PM, Merlin Moncure wrote:
 create table bar(a int, b int[]);
 postgres=# select jsonb_populate_record(null::bar, '{"a": 1, "b":
 [1,2]}'::jsonb, false);
 ERROR:  cannot populate with a nested object unless use_json_as_text is true

Hmmm. What about just making any impossibly complex objects type JSON?

-- 
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread Tom Lane
Andrew Dunstan and...@dunslane.net writes:
 I could live with just stddev. Not sure others would be so happy.

FWIW, I'd vote for just stddev, on the basis that min/max appear to add
more to the counter update time than stddev does; you've got
this:

+   e->counters.total_sqtime += total_time * total_time;

versus this:
  
+   if (e->counters.min_time > total_time || e->counters.min_time
== EXEC_TIME_INIT)
+   e->counters.min_time = total_time;
+   if (e->counters.max_time < total_time)
+   e->counters.max_time = total_time;

I think on most modern machines, a float multiply-and-add is pretty
darn cheap; a branch that might or might not be taken, OTOH, is a
performance bottleneck.

Similarly, the shared memory footprint hit is more: two new doubles
for min/max versus one for total_sqtime (assuming we're happy with
the naive stddev calculation).

If we felt that min/max were of similar value to stddev then this
would be mere nitpicking.  But since people seem to agree they're
worth less, I'm thinking the cost/benefit ratio isn't there.
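
To make the cost comparison concrete, here is a minimal sketch of the
accumulation being discussed, using the naive sum-of-squares formula
(the struct and field names are illustrative, not the actual
pg_stat_statements counters): the stddev path costs one multiply and one
add per execution, while min and max each need a compare that may branch.

    #include <math.h>
    #include <stdio.h>

    typedef struct Counters
    {
        double  calls;
        double  total_time;
        double  total_sqtime;   /* running sum of squared execution times */
        double  min_time;
        double  max_time;
    } Counters;

    static void accumulate(Counters *c, double total_time)
    {
        c->calls += 1;
        c->total_time += total_time;
        c->total_sqtime += total_time * total_time;     /* branch-free */

        /* min/max, by contrast, need branches that may or may not be taken */
        if (c->min_time > total_time || c->calls == 1)
            c->min_time = total_time;
        if (c->max_time < total_time)
            c->max_time = total_time;
    }

    static double stddev(const Counters *c)
    {
        double mean = c->total_time / c->calls;

        /* naive population formula: sqrt(E[x^2] - E[x]^2) */
        return sqrt(c->total_sqtime / c->calls - mean * mean);
    }

    int main(void)
    {
        Counters c = {0};

        accumulate(&c, 1.0);
        accumulate(&c, 3.0);
        printf("stddev = %.3f\n", stddev(&c));   /* prints 1.000 */
        return 0;
    }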

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Failure while inserting parent tuple to B-tree is not fun

2014-01-29 Thread Peter Geoghegan
On Mon, Jan 27, 2014 at 10:54 AM, Peter Geoghegan p...@heroku.com wrote:
 On Mon, Jan 27, 2014 at 10:27 AM, Heikki Linnakangas
 hlinnakan...@vmware.com wrote:
 I think I see some bugs in _bt_moveright(). If you examine
 _bt_finish_split() in detail, you'll see that it doesn't just drop the
 write buffer lock that the caller will have provided (per its
 comments) - it also drops the buffer pin. It calls _bt_insert_parent()
 last, which was previously only called last thing by _bt_insertonpg()
 (some of the time), and _bt_insertonpg() is indeed documented to drop
 both the lock and the pin. And if you look at _bt_moveright() again,
 you'll see that there is a tacit assumption that the buffer pin isn't
 dropped, or at least that opaque (the BTPageOpaque BT special page
 area pointer) continues to point to the same page held in the same
 buffer, even though the code doesn't set the 'opaque' pointer again
 and it may not point to a pinned buffer or even the appropriate
 buffer. Ditto page. So opaque may point to the wrong thing on
 subsequent iterations - you don't do anything with the return value of
 _bt_getbuf(), unlike all of the existing call sites. I believe you
 should store its return value, and get the held page and the special
 pointer on the page, and assign that to the opaque pointer before
 the next iteration (an iteration in which you very probably really do
 make progress not limited to completing a page split, the occurrence
 of which the _bt_moveright() loop gets stuck on). You know, do what
 is done in the non-split-handling case. It may not always be the same
 buffer as before, obviously.


 Yep, fixed.

 Can you explain what the fix was, please?

Ping? I would like to hear some details here.


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated emacs configuration

2014-01-29 Thread Bruce Momjian
On Wed, Jan 29, 2014 at 12:53:02AM -0500, Tom Lane wrote:
 Robert Haas robertmh...@gmail.com writes:
  If this only affects a handful of places, then sure, let's go ahead
  and fix it.  But if it's going to create a massive enough diff that
  we've gotta think about back-patching it, then IMHO it's totally not
  worth it.
 
 A quick grep search says there are well over 3000 comment lines containing
 '.' followed by a tab.  grep isn't smart enough to tell if the tabs expand
 to exactly two spaces, but I bet the vast majority do.  So it might not
 be quite as bad as the 8.1 wrap-margin change, but it'd be extensive.

I have cleaned up entab.c so I am ready to add a new option that removes
tabs from only comments.  Would you like me to create that and provide a
diff at a URL?  It would have to be run against all back branches.

If it goes well, it could prove to be a way to incrementally improve
pgindent.  If not, I am afraid we are stuck with our current pgindent
output.

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated emacs configuration

2014-01-29 Thread Tom Lane
Bruce Momjian br...@momjian.us writes:
 I have cleaned up entab.c so I am ready to add a new option that removes
 tabs from only comments.  Would you like me to create that and provide a
 diff at a URL?  It would have to be run against all back branches.

If you think you can actually tell the difference reliably in entab,
sure, give it a go.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New option for pg_basebackup, to specify a different directory for pg_xlog

2014-01-29 Thread Haribabu Kommi
On Tue, Jan 28, 2014 at 1:17 PM, Peter Eisentraut wrote:

 On 11/30/13, 6:59 AM, Haribabu kommi wrote:
  To detect whether the provided data and xlog directories are the same or not,
  I reused the existing make_absolute_path() code as follows.

 I note that initdb does not detect whether the data and xlog directories
 are the same.  I think there is no point in addressing this only in
 pg_basebackup.  If we want to forbid it, it should be done in initdb
 foremost.


 Thanks for pointing it out. If the following approach is fine for identifying
identical directories,
 then I will do the same for initdb also.


 I'm not sure it's worth the trouble, but if I were to do it, I'd just
 stat() the two directories and compare their inodes.  That seems much
 easier and more robust than comparing path strings


stat() has problems on Windows; for that reason the patch is
written to identify the directories with a string comparison.
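
For reference, the inode-based check being discussed is roughly the
following on POSIX systems (a sketch only; as noted, st_dev/st_ino are
not dependable on Windows, which is why the patch falls back to
comparing path strings):

    #include <stdbool.h>
    #include <stdio.h>
    #include <sys/stat.h>

    /* Return true if both paths resolve to the same directory (POSIX only). */
    static bool same_directory(const char *a, const char *b)
    {
        struct stat sa, sb;

        if (stat(a, &sa) != 0 || stat(b, &sb) != 0)
            return false;               /* treat unreadable paths as distinct */

        return sa.st_dev == sb.st_dev && sa.st_ino == sb.st_ino;
    }

    int main(void)
    {
        /* prints 1: both names point at the same directory */
        printf("%d\n", same_directory("/tmp", "/tmp/../tmp"));
        return 0;
    }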

Regards,
Hari Babu


Re: [HACKERS] GIN improvements part2: fast scan

2014-01-29 Thread Tomas Vondra
On 28.1.2014 08:29, Heikki Linnakangas wrote:
 On 01/28/2014 05:54 AM, Tomas Vondra wrote:
 Then I ran those scripts on:

* 9.3
* 9.4 with Heikki's patches (9.4-heikki)
* 9.4 with Heikki's and first patch (9.4-alex-1)
* 9.4 with Heikki's and both patches (9.4-alex-2)
 
 It would be good to also test with unpatched 9.4 (ie. git master). The
 packed posting lists patch might account for a large part of the
 differences between 9.3 and the patched 9.4 versions.
 
 - Heikki
 

Hi,

the e-mail I sent yesterday apparently did not make it into the list,
probably because of the attachments, so I'll just link them this time.

I added the results from 9.4 master to the spreadsheet:

https://docs.google.com/spreadsheet/ccc?key=0Alm8ruV3ChcgdHJfZTdOY2JBSlkwZjNuWGlIaGM0REE

It's a bit cumbersome to analyze though, so I've quickly hacked up a
simple jqplot page that allows comparing the results. It's available
here: http://www.fuzzy.cz/tmp/gin/

It's likely there are some quirks and issues - let me know about them.

The ODT with the data is available here:

   http://www.fuzzy.cz/tmp/gin/gin-scan-benchmarks.ods


Three quick basic observations:

(1) The current 9.4 master is consistently better than 9.3 by about 15%
on rare words, and up to 30% on common words. See the charts for
6-word queries:

   http://www.fuzzy.cz/tmp/gin/6-words-rare-94-vs-93.png
   http://www.fuzzy.cz/tmp/gin/6-words-rare-94-vs-93.png

With 3-word queries the effects are even stronger & clearer,
especially with the common words.

(2) Heikki's patches seem to work OK, i.e. improve the performance, but
only with rare words.

  http://www.fuzzy.cz/tmp/gin/heikki-vs-94-rare.png

With 3 words the impact is much stronger than with 6 words,
presumably because it depends on how frequent the combination of
words is (~ multiplication of probabilities). See

  http://www.fuzzy.cz/tmp/gin/heikki-vs-94-3-common-words.png
  http://www.fuzzy.cz/tmp/gin/heikki-vs-94-6-common-words.png

for comparison of 9.4 master vs. 9.4+heikki's patches.

(3) A file with explain plans for 4 queries suffering ~2x slowdown,
and explain plans with 9.4 master and Heikki's patches is available
here:

  http://www.fuzzy.cz/tmp/gin/queries.txt

All the queries have 6 common words, and the explain plans look
just fine to me - exactly like the plans for other queries.

Two things now caught my eye. First some of these queries actually
have words repeated - either exactly like database & database or
in negated form like !anything & anything. Second, while
generating the queries, I use dumb frequency, where only exact
matches count. I.e. write != written etc. But the actual number
of hits may be much higher - for example write matches exactly
just 5% documents, but using @@ it matches more than 20%.

I don't know if that's the actual cause though.

regards
Tomas



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Suspicion of a compiler bug in clang: using ternary operator in ereport()

2014-01-29 Thread Tom Lane
Christian Kruse christ...@2ndquadrant.com writes:
 Your reasoning sounds quite logical to me. Thus I did a
 grep -RA 3 ereport src/* | less
 and looked for ereport calls with errno in it. I found quite a few,
 attached you will find a patch addressing that issue.

Committed.  I found a couple of errors in your patch, but I think
everything is addressed in the patch as committed.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] FOREIGN KEY ... CONCURRENTLY

2014-01-29 Thread Alvaro Herrera
David Fetter wrote:

 2) Is there another way to solve the problem of adding a foreign
 key constraint that points at a busy table?

Add it as NOT VALID and do a later VALIDATE CONSTRAINT step, after the
patch to reduce lock levels for ALTER TABLE goes in, perhaps?

-- 
Álvaro Herrera                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] updated emacs configuration

2014-01-29 Thread Bruce Momjian
On Wed, Jan 29, 2014 at 07:31:29PM -0500, Tom Lane wrote:
 Bruce Momjian br...@momjian.us writes:
  I have cleaned up entab.c so I am ready to add a new option that removes
  tabs from only comments.  Would you like me to create that and provide a
  diff at a URL?  It would have to be run against all back branches.
 
 If you think you can actually tell the difference reliably in entab,
 sure, give it a go.

OK, I have modified entab.c in a private patch to only process text
inside comments, and not process leading whitespace, patch attached.  I
basically ran 'entab -o -t4 -d' on the C files.

The result are here, in context, plain, and unified format:

http://momjian.us/expire/entab_comment.cdiff
http://momjian.us/expire/entab_comment.pdiff
http://momjian.us/expire/entab_comment.udiff

and their line counts:

89741 entab_comment.cdiff
26351 entab_comment.pdiff
50503 entab_comment.udiff

I compute 6627 lines as modified.  What I did not do is handle _only_
cases with periods before the tabs.  Should I try that?
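
For what it's worth, the narrower rule - only tabs directly preceded by
a period inside a comment - could be prototyped outside entab first; a
self-contained sketch (not a patch against entab.c) might look like this:

    #include <stdbool.h>
    #include <stdio.h>

    /* Count tabs that directly follow a period inside a C comment. */
    static int count_sentence_tabs(const char *line)
    {
        bool in_comment = false;
        int  count = 0;

        for (const char *src = line; *src != '\0'; src++)
        {
            if (!in_comment && src > line && *(src - 1) == '/' && *src == '*')
                in_comment = true;
            else if (in_comment && *src == '*' && *(src + 1) == '/')
                in_comment = false;

            if (in_comment && src > line && *(src - 1) == '.' && *src == '\t')
                count++;
        }
        return count;
    }

    int main(void)
    {
        printf("%d\n", count_sentence_tabs("/* One.\tTwo. */"));            /* 1 */
        printf("%d\n", count_sentence_tabs("x = 1;\t/* no period-tab */")); /* 0 */
        return 0;
    }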

-- 
  Bruce Momjian  br...@momjian.us        http://momjian.us
  EnterpriseDB http://enterprisedb.com

  + Everyone has their own god. +
diff --git a/src/tools/entab/entab.c b/src/tools/entab/entab.c
new file mode 100644
index 3b849f2..3fd2997
*** a/src/tools/entab/entab.c
--- b/src/tools/entab/entab.c
*** main(int argc, char **argv)
*** 51,63 
  {
  	int			tab_size = 8,
  min_spaces = 2,
  protect_quotes = FALSE,
  del_tabs = FALSE,
  clip_lines = FALSE,
  prv_spaces,
  col_in_tab,
  escaped,
! nxt_spaces;
  	char		in_line[BUFSIZ],
  out_line[BUFSIZ],
  			   *src,
--- 51,66 
  {
  	int			tab_size = 8,
  min_spaces = 2,
+ only_comments = FALSE,
  protect_quotes = FALSE,
  del_tabs = FALSE,
  clip_lines = FALSE,
+ in_comment = FALSE,
  prv_spaces,
  col_in_tab,
  escaped,
! nxt_spaces,
! leading_whitespace;
  	char		in_line[BUFSIZ],
  out_line[BUFSIZ],
  			   *src,
*** main(int argc, char **argv)
*** 74,80 
  	if (strcmp(cp, "detab") == 0)
  		del_tabs = 1;
  
! 	while ((ch = getopt(argc, argv, "cdhqs:t:")) != -1)
  		switch (ch)
  		{
  			case 'c':
--- 77,83 
  	if (strcmp(cp, "detab") == 0)
  		del_tabs = 1;
  
! 	while ((ch = getopt(argc, argv, "cdhoqs:t:")) != -1)
  		switch (ch)
  		{
  			case 'c':
*** main(int argc, char **argv)
*** 83,88 
--- 86,94 
  			case 'd':
  del_tabs = TRUE;
  break;
+ 			case 'o':
+ only_comments = TRUE;
+ break;
  			case 'q':
  protect_quotes = TRUE;
  break;
*** main(int argc, char **argv)
*** 97,102 
--- 103,109 
fprintf(stderr, "USAGE: %s [ -cdqst ] [file ...]\n\
  	-c (clip trailing whitespace)\n\
  	-d (delete tabs)\n\
+ 	-o (only comments)\n\
  	-q (protect quotes)\n\
  	-s minimum_spaces\n\
	-t tab_width\n",
*** main(int argc, char **argv)
*** 134,146 
  			if (escaped == FALSE)
  quote_char = ' ';
  			escaped = FALSE;
  
  			/* process line */
  			while (*src != NUL)
  			{
  col_in_tab++;
  /* Is this a potential space/tab replacement? */
! if (quote_char == ' ' && (*src == ' ' || *src == '\t'))
  {
  	if (*src == '\t')
  	{
--- 141,163 
  			if (escaped == FALSE)
  quote_char = ' ';
  			escaped = FALSE;
+ 			leading_whitespace = TRUE;
  
  			/* process line */
  			while (*src != NUL)
  			{
  col_in_tab++;
+ 
+ /* look backward so we handle slash-star-slash properly */
+ if (!in_comment && src > in_line &&
+ 	*(src - 1) == '/' && *src == '*')
+ 	in_comment = TRUE;
+ else if (in_comment && *src == '*' && *(src + 1) == '/')
+ 	in_comment = FALSE;
+ 
  /* Is this a potential space/tab replacement? */
! if ((!only_comments || (in_comment && !leading_whitespace)) &&
! 	quote_char == ' ' && (*src == ' ' || *src == '\t'))
  {
  	if (*src == '\t')
  	{
*** main(int argc, char **argv)
*** 192,197 
--- 209,218 
  /* Not a potential space/tab replacement */
  else
  {
+ 	/* allow leading stars in comments */
+ 	if (leading_whitespace && *src != ' ' && *src != '\t' &&
+ 		(!in_comment || *src != '*'))
+ 		leading_whitespace = FALSE;
  	/* output accumulated spaces */
  	output_accumulated_spaces(prv_spaces, dst);
  	/* This can only happen in a quote. */

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add min and max execute statement time in pg_stat_statement

2014-01-29 Thread KONDO Mitsumasa

(2014/01/29 16:58), Peter Geoghegan wrote:

On Tue, Jan 28, 2014 at 10:51 PM, Tom Lane t...@sss.pgh.pa.us wrote:

KONDO Mitsumasa kondo.mitsum...@lab.ntt.co.jp writes:

By the way, latest pg_stat_statement might affect performance in Windows system.
Because it uses fflush() system call every creating new entry in
pg_stat_statements, and it calls many fread() to warm file cache.


This statement doesn't seem to have much to do with the patch as
committed.


You could make a strong case for the patch improving performance in
practice for many users, by allowing us to increase the
pg_stat_statements.max default value to 5,000, 5 times the previous
value. The real expense of creating a new entry is the exclusive
locking of the hash table, which blocks *everything* in
pg_stat_statements. That isn't expanded at all, since queries are only
written with a shared lock, which only blocks the creation of new
entries which was already relatively expensive. In particular, it does
not block the maintenance of costs for all already observed entries in
the hash table. It's obvious that simply reducing the pressure on the
cache will improve matters a lot, which for many users the external
texts patch does.

Since Mitsumasa has compared that patch and at least one of his
proposed pg_stat_statements patches on several occasions, I will
re-iterate the difference: any increased overhead from the external
texts patch is paid only when a query is first entered into the hash
table. Then, there is obviously and self-evidently no additional
overhead from contention for any future execution of the same query,
no matter what, indefinitely (the exclusive locking section of
creating a new entry does not do I/O, except in fantastically unlikely
circumstances). So for many of the busy production systems I work
with, that's the difference between a cost paid perhaps tens of
thousands of times a second, and a cost paid every few days or weeks.
Even if he is right about a write() taking an unreasonably large
amount of time on Windows, the consequences felt as pg_stat_statements
LWLock contention would be very limited.

I am not opposed in principle to adding new things to the counters
struct in pg_stat_statements. I just think that the fact that the
overhead of installing the module on a busy production system is
currently so low is of *major* value, and therefore any person that
proposes to expand that struct should be required to very conclusively
demonstrate that there is no appreciably increase in overhead. Having
a standard deviation column would be nice, but it's still not that
important. Maybe when we have portable atomic addition we can afford
to be much more inclusive of that kind of thing.
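
The locking discipline described here can be sketched with an ordinary
pthread read-write lock standing in for the pg_stat_statements LWLock
(an analogy only, not the extension's code; the real module also takes a
per-entry spinlock while bumping counters, which is omitted): existing
entries are updated under the shared lock, and the exclusive lock is
taken only to create a new entry.

    #include <pthread.h>
    #include <stdint.h>
    #include <stdio.h>

    #define MAX_ENTRIES 16

    typedef struct Entry
    {
        uint64_t queryid;
        long     calls;
        double   total_time;
    } Entry;

    static Entry            entries[MAX_ENTRIES];
    static int              nentries;
    static pthread_rwlock_t pgss_lock = PTHREAD_RWLOCK_INITIALIZER;

    static Entry *lookup(uint64_t queryid)
    {
        for (int i = 0; i < nentries; i++)
            if (entries[i].queryid == queryid)
                return &entries[i];
        return NULL;
    }

    static void record_execution(uint64_t queryid, double total_time)
    {
        Entry *e;

        /* Common path: shared lock only, so concurrent backends don't block. */
        pthread_rwlock_rdlock(&pgss_lock);
        e = lookup(queryid);
        if (e)
        {
            e->calls++;                 /* real code guards this with a spinlock */
            e->total_time += total_time;
        }
        pthread_rwlock_unlock(&pgss_lock);
        if (e)
            return;

        /* Rare path: creating a new entry requires the exclusive lock. */
        pthread_rwlock_wrlock(&pgss_lock);
        e = lookup(queryid);            /* re-check under the stronger lock */
        if (e == NULL && nentries < MAX_ENTRIES)
        {
            e = &entries[nentries++];
            e->queryid = queryid;
        }
        if (e)
        {
            e->calls++;
            e->total_time += total_time;
        }
        pthread_rwlock_unlock(&pgss_lock);
    }

    int main(void)
    {
        record_execution(42, 1.5);   /* creates the entry (exclusive lock) */
        record_execution(42, 2.5);   /* updates it under the shared lock   */
        printf("calls=%ld total=%.1f\n", entries[0].calls, entries[0].total_time);
        return 0;
    }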


I'd like to know the truth and the facts about your patch, rather than your 
argument :-)

So I created a detailed pg_stat_statements benchmark tool using pgbench.
This tool can create 1 pattern of unique SQLs in a file, and it is only
for measuring pg_stat_statements performance, because it only updates
pg_stat_statements data and doesn't write to disk at all. It's a fair benchmark.

[usage]
perl create_sql.pl file.sql
psql -h xxx.xxx.xxx.xxx mitsu-ko -c 'SELECT pg_stat_statements_reset()'
pgbench -h xxx.xxx.xxx.xxx mitsu-ko -c64 -j32 -n -T 180 -f file.sql

[part of sample sqls file, they are executed in pgbench]
SELECT 
1-1/9+5*8*6+5+9-6-3-4/9%7-2%7/5-9/7+2+9/7-1%5%9/3-4/4-9/1+5+5/6/5%2+1*2*3-8/8+5-3-8-4/8+5%2*2-2-5%6+4
SELECT 
1%9%7-8/5%6-1%2*2-7+9*6-2*6-9+1-2*9+6+7*8-4-2*1%3/7-1%4%9+4+7/5+4/2-3-5%8/3/7*6-1%8*6*1/7/2%5*6/2-3-9
SELECT 
1%3*2/8%6%5-3%1+1/6*6*5/9-2*5+6/6*5-1/2-6%4%8/7%2*7%5%9%5/9%1%1-3-9/2*1+1*6%8/2*4/1+6*7*1+5%6+9*1-9*6

...

I tested some methods, including old pgss, old pgss with my patch, and new pgss.
The test server and postgresql.conf are the same as in the test I ran the other
day in this ML thread. The test methods and results are here:

[methods]
method 1: with old pgss, pg_stat_statements.max=1000(default)
method 2: with old pgss with my patch, pg_stat_statements.max=1000(default)
peter 1 : with new pgss(Peter's patch), pg_stat_statements.max=5000(default)
peter 2 : with new pgss(Peter's patch), pg_stat_statements.max=1000

[for reference]
method 5:with old pgss, pg_stat_statements.max=5000
method 6:with old pgss with my patch, pg_stat_statements.max=5000

[results]
 method   |  try1  |  try2  |  try3  | degrade performance ratio
----------+--------+--------+--------+---------------------------
 method 1 | 6.546  | 6.558  | 6.638  | 0% (reference score)
 method 2 | 6.527  | 6.556  | 6.574  | 1%
 peter 1  | 5.204  | 5.203  | 5.216  |20%
 peter 2  | 4.241  | 4.236  | 4.262  |35%

 method 5 | 5.931  | 5.848  | 5.872  |11%
 method 6 | 5.794  | 5.792  | 5.776  |12%


New pgss degrades performance far more than old pgss, and I cannot see
performance scaling.
I understand that his argument was that his patch reduces LWLock time in
pg_stat_statements, so it is good for performance. However, Kondo's patch will
degrade performance in
