Re: [HACKERS] tab completion for alter extension

2016-08-16 Thread Gerdan Santos
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I did some tests and found nothing special. The stated resource is implemented 
correctly.
He passes all regression tests and enables the use of the new features 
specified.

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] psql: tab completion for \l

2016-08-16 Thread Gerdan Santos
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

I did some tests and found nothing special. The stated resource is implemented 
correctly.
He passes all regression tests and enables the use of the new features 
specified.

The new status of this patch is: Ready for Committer

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-08-16 Thread Ashutosh Bapat
> +relid_is_partition(Oid relid)
> +{
> +   return SearchSysCacheExists1(PARTRELID, ObjectIdGetDatum(relid));
> +}
>
> This is used in a lot of places, and the overhead of checking it in
> all of those places is not necessarily nil.  Syscache lookups aren't
> free.  What if we didn't create a new catalog for this and instead
> just added a relpartitionbound attribute to pg_class?  It seems a bit
> silly to have a whole extra catalog to store one extra column...
>
>
>

It looks like in most of the places where this function is called it's
using relid_is_partition(RelationGetRelid(rel)). Instead probably we should
check existence of rd_partdesc or rd_partkey within Relation() and make
sure that those members are always set for a partitioned table. That will
avoid cache lookup and may give better performance.

That brings up another question. Can we have rd_partdesc non null and
rd_partkey null or vice-versa. If not, should we club those into a single
structure like Partition (similar to Relation)?



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] Why we lost Uber as a user

2016-08-16 Thread Craig Ringer
On 17 August 2016 at 08:36, Jim Nasby  wrote:

> Something I didn't see mentioned that I think is a critical point: last I
> looked, HOT standby (and presumably SR) replays full page writes.


Yes, that's right, all WAL-based physical replication replays FPWs.

We could, at the cost of increased WAL size, retain both the original WAL
buffer that triggered the FPW and the FPW page image. That's what wal_level
= logical does in some cases. I'm not sure it's that compelling though, it
just introduces another redo path that can go wrong.



> Ultimately, people really need to understand the trade-offs to the
> different solutions so they can make an informed decision on which ones
> (yes, plural) they want to use. The same can be said about pg_upgrade vs
> something else, and the different ways of doing backups.
>

Right.

It's really bugging me that people are talking about "statement based"
replication in MySQL as if it's just sending SQL text around. MySQL's
statemnet based replication is a lot smarter than that, and in the
actually-works-properly form it's a hybrid of row and statement based
replication ("MIXED" mode). As I understand it it lobs around something
closer to parsetrees with some values pre-computed rather than SQL text
where possible. It stores some computed values of volatile functions in the
binlog and reads them from there rather than computing them again when
running the statement on replicas, which is why AUTO_INCREMENT etc works.
It also falls back to row based replication where necessary for
correctness. Even then it has a significant list of caveats, but it's
pretty damn impressive. I didn't realise how clever the hybrid system was
until recently.

I can see it being desirable to do something like that eventually as an
optimisation to logical decoding based replication. Where we can show that
the statement is safe or make it safe by doing things like evaluating and
substituting volatile function calls, xlog a modified parsetree with oids
changed to qualified object names etc, send that when decoding, and execute
that on the downstream(s). If there's something we can't show to be safe
then replay the logical rows instead. That's way down the track though; I
think it's more important to focus on completing logical row-based
replication to the point where we handle table rewrites seamlessly and it
"just works" first.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Craig Ringer
On 17 August 2016 at 09:49, Andres Freund  wrote:


>
> You need to include the files surrounded by extern "C" { }.
>

I'd really like to adopt the convention used by many libraries etc of doing
this automatically - detecting a c++ compiler in the preprocessor and
wrapping in "extern "C"" .

Having the codebase c++-clean enough to compile with a c++ compiler seems
to be the easiest way to maintain that, but means more "extern "C""
droppings in the .c files, not just the headers. Still, pretty ignoreable.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] patch proposal

2016-08-16 Thread Venkata B Nagothi
On Wed, Aug 17, 2016 at 12:06 AM, Stephen Frost  wrote:

> Greetings,
>
> * Venkata B Nagothi (nag1...@gmail.com) wrote:
> > The above said parameters can be configured to pause, shutdown or prevent
> > promotion only after reaching the recovery target point.
> > To clarify, I am referring to a scenario where recovery target point is
> not
> > reached at all ( i mean, half-complete or in-complete recovery) and there
> > are lots of WALs still pending to be replayed - in this situation,
>
> PG doesn't know that there are still WALs to be replayed.
>

PG doesn't know that there are still WALs to be replayed. Since, i have
given an particular recovery target and PG knows the current replay
position,
I would say, it would be good if PG warns and pauses there by saying
recovery target point is not reached.

> It would be nice if PostgreSQL pauses the recovery in-case its not
> complete
> > (because of missing or corrupt WAL), shutdown the cluster and allows the
> > DBA to restart the replay of the remaining WAL Archive files to continue
> > recovery (from where it stopped previously) until the recovery target
> point
> > is reached.
>

Agreed. Reaching end-of-WAL is not an error. It sounds more like a
limitation in certain scenarios.

Reaching the end of WAL isn't an error and I don't believe it makes any
> sense to treat it like it is.  You can specify any recovery target point
> you wish, including ones that don't exist, and that's not an error
> either.
>
> I could see supporting an additional "pause" option that means "pause at
> the end of WAL if you don't reach the recovery target point".  I'd also
> be happy with a warning being emitted in the log if the recovery target
> point isn't reached before reaching the end of WAL, but I don't think it
> makes sense to change the existing behavior.
>

Agreed. Additional option like "pause" would. As long as there is an option
to ensure following happens if the recovery target is not reached -

 a) PG pauses the recovery at the end of the WAL
 b) Generates a warning in the log file saying that recovery target point
is not reached (there is a patch being worked upon on by Thom on this)
 c) Does not open-up the database exiting from the recovery process by
giving room to resume the replay of WALs



Regards,
Venkata B N

Fujitsu Australia


Re: [HACKERS] support for NEXT VALUE FOR expression

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> Here is a patch for implementing the NEXT VALUE FOR expression.  This is
> the SQL-standard conforming version of our nextval() function, and it's
> also used by Oracle, MS SQL, DB2.

BTW, several of the earlier threads complained of needing to make NEXT
a fully-reserved word in order to get this to parse without shift/reduce
conflicts.  How did you avoid that?  I notice that your patch puts the
new production into c_expr not func_expr_common_subexpr which would
seem like the obvious place.  If that is what's making the difference
it seems rather fragile, and it would mean that NEXT VALUE FOR doesn't
act like a function in some syntactic contexts like a FROM-function.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] support for NEXT VALUE FOR expression

2016-08-16 Thread Tom Lane
Thomas Munro  writes:
> On Wed, Aug 17, 2016 at 3:52 PM, Tom Lane  wrote:
>> We discussed this before and concluded that NEXT VALUE FOR is in fact
>> *not* an exact semantic equivalent of nextval():
>> https://www.postgresql.org/message-id/14790.1083955136%40sss.pgh.pa.us

> And also again 10 years later when I proposed it :-)
> https://www.postgresql.org/message-id/flat/CADLWmXUY2oo4XObQWF3yPUSK%3D5uEiSV%3DeTyLrnu%3DRzteOy%2B3Lg%40mail.gmail.com

And that links to yet another thread, from 2002 ;-)

The 2004 thread does contain some speculation about how to implement the
spec's semantics.  It seems like the first problem is nailing down what
is meant by "once per row", particularly in cases with nested execution
contexts.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] support for NEXT VALUE FOR expression

2016-08-16 Thread Thomas Munro
On Wed, Aug 17, 2016 at 3:52 PM, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> Here is a patch for implementing the NEXT VALUE FOR expression.  This is
>> the SQL-standard conforming version of our nextval() function, and it's
>> also used by Oracle, MS SQL, DB2.  Example:
>
> We discussed this before and concluded that NEXT VALUE FOR is in fact
> *not* an exact semantic equivalent of nextval():
>
> https://www.postgresql.org/message-id/14790.1083955136%40sss.pgh.pa.us

And also again 10 years later when I proposed it :-)

https://www.postgresql.org/message-id/flat/CADLWmXUY2oo4XObQWF3yPUSK%3D5uEiSV%3DeTyLrnu%3DRzteOy%2B3Lg%40mail.gmail.com

> I remain of the opinion that using spec-compliant syntax for
> non-spec-compliant behavior isn't a great advance.

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] support for NEXT VALUE FOR expression

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> Here is a patch for implementing the NEXT VALUE FOR expression.  This is
> the SQL-standard conforming version of our nextval() function, and it's
> also used by Oracle, MS SQL, DB2.  Example:

We discussed this before and concluded that NEXT VALUE FOR is in fact
*not* an exact semantic equivalent of nextval():

https://www.postgresql.org/message-id/14790.1083955136%40sss.pgh.pa.us

I remain of the opinion that using spec-compliant syntax for
non-spec-compliant behavior isn't a great advance.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] support for NEXT VALUE FOR expression

2016-08-16 Thread Peter Eisentraut
Here is a patch for implementing the NEXT VALUE FOR expression.  This is
the SQL-standard conforming version of our nextval() function, and it's
also used by Oracle, MS SQL, DB2.  Example:

SELECT NEXT VALUE FOR foo_seq;

The second patch changes the serial column to use this new expression
for its generated default values.  This doesn't make an external
difference except perhaps that the generated expression looks less weird
to the user.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From c012df68bdfc8711d142f7a91f8ea0ee4ecef813 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 11 Aug 2016 12:00:00 -0400
Subject: [PATCH 1/2] Add NEXT VALUE FOR expression

This is the SQL-standard-conforming version of the nextval function.  It
functions the same way, but has a more SQL-like syntax and separate
parse and executor nodes.
---
 doc/src/sgml/func.sgml | 29 ++---
 doc/src/sgml/ref/create_sequence.sgml  | 17 -
 src/backend/catalog/dependency.c   |  7 +++
 src/backend/commands/sequence.c|  3 +--
 src/backend/executor/execQual.c| 21 +
 src/backend/nodes/copyfuncs.c  | 34 ++
 src/backend/nodes/equalfuncs.c | 24 
 src/backend/nodes/nodeFuncs.c  | 18 ++
 src/backend/nodes/outfuncs.c   | 12 
 src/backend/nodes/readfuncs.c  | 16 
 src/backend/parser/gram.y  | 12 
 src/backend/parser/parse_expr.c| 23 +++
 src/backend/parser/parse_target.c  |  3 +++
 src/backend/utils/adt/ruleutils.c  |  9 +
 src/bin/psql/tab-complete.c|  6 ++
 src/include/commands/sequence.h|  2 ++
 src/include/nodes/nodes.h  |  2 ++
 src/include/nodes/parsenodes.h | 10 ++
 src/include/nodes/primnodes.h  | 10 ++
 src/test/regress/expected/sequence.out |  8 
 src/test/regress/sql/sequence.sql  |  2 +-
 21 files changed, 249 insertions(+), 19 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 426e562..dc21e6b 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -11476,6 +11476,9 @@ Sequence Manipulation Functions
sequence
   
   
+   NEXT VALUE FOR
+  
+  
nextval
   
   
@@ -11489,7 +11492,7 @@ Sequence Manipulation Functions
   
 
   
-   This section describes functions for operating on sequence
+   This section describes functions and other expressions for operating on sequence
objects, also called sequence generators or just sequences.
Sequence objects are special single-row tables created with .
@@ -11512,7 +11515,7 @@ Sequence Functions
 currval(regclass)
 bigint
 Return value most recently obtained with
-nextval for specified sequence
+nextval/NEXT VALUE FOR for specified sequence
   
   
 lastval()
@@ -11526,6 +11529,11 @@ Sequence Functions
 Advance sequence and return new value
   
   
+NEXT VALUE FOR sequence_name
+bigint
+Advance sequence and return new value
+  
+  
 setval(regclass, bigint)
 bigint
 Set sequence's current value
@@ -11540,7 +11548,8 @@ Sequence Functions

 
   
-   The sequence to be operated on by a sequence function is specified by
+   For the function syntax,
+   the sequence to be operated on by a sequence function is specified by
a regclass argument, which is simply the OID of the sequence in the
pg_class system catalog.  You do not have to look up the
OID by hand, however, since the regclass data type's input
@@ -11601,11 +11610,21 @@ Sequence Functions
   
 
   
+   For NEXT VALUE FOR, the sequence argument is a regular
+   identifier, e.g.,
+
+NEXT VALUE FOR foo
+NEXT VALUE FOR "Foo"
+
+  
+
+  
The available sequence functions are:
 
 
  
   nextval
+  NEXT VALUE FOR
   

 Advance the sequence object to its next value and return that
@@ -11640,6 +11659,10 @@ Sequence Functions
 

 
+   
+NEXT VALUE FOR is the syntax specified by the SQL
+standard.  All the function call variants are PostgreSQL extensions.
+   
   
  
 
diff --git a/doc/src/sgml/ref/create_sequence.sgml b/doc/src/sgml/ref/create_sequence.sgml
index c959146..bcb5aa8 100644
--- a/doc/src/sgml/ref/create_sequence.sgml
+++ b/doc/src/sgml/ref/create_sequence.sgml
@@ -49,11 +49,12 @@ Description
   
 
   
-   After a sequence is created, you use the functions
+   After a sequence is created, you use the expression NEXT VALUE FOR
+   and the functions
nextval,
currval, and
setval
-   to operate on the sequence.  These functions are documented in
+   to operate on 

Re: [HACKERS] LWLocks in DSM memory

2016-08-16 Thread Robert Haas
On Tue, Aug 16, 2016 at 5:03 PM, Andres Freund  wrote:
> On 2016-08-15 18:15:23 -0400, Robert Haas wrote:
>> On Thu, Aug 11, 2016 at 2:19 PM, Robert Haas  wrote:
>> > Therefore, I plan to commit this patch, removing the #include
>> >  unless someone convinces me we need it, shortly after
>> > development for v10 opens, unless there are objections before then.
>>
>> Hearing no objections, done.
>
> I'd have objected, if I hadn't been on vacation.  While I intuitively
> *do* think that the increased wait-list overhead won't be relevant, I
> also know that my intuition has frequently been wrong around the lwlock
> code.  This needs some benchmarks on a 4+ socket machine,
> first. Something exercising the slow path obviously. E.g. a pgbench with
> a small number of writers, and a large number of writers.

I have to admit that I totally blanked about you being on vacation.
Thanks for mentioning the workload you think might be adversely
affected, but to be honest, even if there's some workload where this
causes a small regression, I'm not really sure what you think we
should do instead.  Should we have a separate copy of lwlock.c just
for parallel query and other stuff that uses DSM?  Won't that slow
down every error-handling path in the system, if they all have to
release two kinds of lwlocks rather than one?  And bloat the binary?
Or are you going to argue that parallel query doesn't really need
LWLocks?  I'm sure that's not true.  We got by without it for this
release, but that's because the only truly parallel operation as yet
is Parallel Seq Scan whose requirements are simple enough to be
handled with a spinlock.

Anyway, I guess we should wait for the benchmark results and then see,
but if we're not going to do this then we need some reasonable
alternative.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)

2016-08-16 Thread Michael Paquier
On Wed, Aug 17, 2016 at 8:05 AM, Andres Freund  wrote:
> ISTM that the correct fix would be to actually introduce something like
> quote_path_for_shell() which either adds proper quotes, or fails if
> that'd be hard (e.g. if the path contains quotes, and we're on
> windows).

You are looking for appendShellString in fe_utils/string_utils.c.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Andres Freund
On 2016-08-16 21:40:32 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-07-31 17:57:12 -0400, Tom Lane wrote:
> >> Yeah.  The extended query protocol was designed to offer a lot of
> >> functionality that people had asked for, like plan re-use and
> >> introspection of the data types assigned to query parameters, but that
> >> doesn't come at zero cost.  I think the tie-in to the plan cache is a
> >> significant part of the added overhead, and so is the fact that we have to
> >> iterate the per-message loop in PostgresMain five times not once, with
> >> overheads like updating the process title incurred several times in that.
> 
> > One approach to solving this, without changing the protocol, would be to
> > "fuse" parse/bind/execute/sync together, by peeking ahead in the
> > protocol stream.
> 
> I do not think that would move the needle noticeably, because we'd still
> have to do basically all the same work, due to not knowing whether the
> statement is going to be used over again.  If we'd specified that the
> unnamed statement could be used only once, and that the unnamed portal
> had to be executed to completion on first use, there would be more room
> for optimization.  The joys of hindsight :-(

ISTM that with the current prepared statement search path behaviour
(i.e. we replan on relevant changes anyway), we can store the unnamed
statement's sql for that case.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread 'Andres Freund'
On 2016-08-17 11:51:04 +1000, dandl wrote:
> > > From my particular perspective it would be enough if all the
> > internal
> > > headers (that one needs to use in writing server-side extensions)
> > were
> > > completely usable in C++.
> > 
> > That should already be the case.  There's even a dirty hack^WWscript
> > that checks that that remains the case
> > (src/tools/pginclude/cpluspluscheck).
> 
> The source code for my project is here:
> https://github.com/davidandl/Andl/tree/master/plandl
> https://github.com/davidandl/Andl/blob/master/plandl/plandl.c
> 
> I was not able to get this file to compile correctly in C++, and my
> recollection is that at the time I asked on this list and that was the
> advice.

You need to include the files surrounded by extern "C" { }.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread dandl
> > From my particular perspective it would be enough if all the
> internal
> > headers (that one needs to use in writing server-side extensions)
> were
> > completely usable in C++.
> 
> That should already be the case.  There's even a dirty hack^WWscript
> that checks that that remains the case
> (src/tools/pginclude/cpluspluscheck).

The source code for my project is here:
https://github.com/davidandl/Andl/tree/master/plandl
https://github.com/davidandl/Andl/blob/master/plandl/plandl.c

I was not able to get this file to compile correctly in C++, and my
recollection is that at the time I asked on this list and that was the
advice. 

Sorry, I don't remember the error but it seemed to be too deeply embedded to
worry about. I just wrote the C code and moved on.

Since the Windows COM in the other part is C++ only, I finished up with a
mixed build. It works fine, but is not the ideal outcome.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org







-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2016-07-31 17:57:12 -0400, Tom Lane wrote:
>> Yeah.  The extended query protocol was designed to offer a lot of
>> functionality that people had asked for, like plan re-use and
>> introspection of the data types assigned to query parameters, but that
>> doesn't come at zero cost.  I think the tie-in to the plan cache is a
>> significant part of the added overhead, and so is the fact that we have to
>> iterate the per-message loop in PostgresMain five times not once, with
>> overheads like updating the process title incurred several times in that.

> One approach to solving this, without changing the protocol, would be to
> "fuse" parse/bind/execute/sync together, by peeking ahead in the
> protocol stream.

I do not think that would move the needle noticeably, because we'd still
have to do basically all the same work, due to not knowing whether the
statement is going to be used over again.  If we'd specified that the
unnamed statement could be used only once, and that the unnamed portal
had to be executed to completion on first use, there would be more room
for optimization.  The joys of hindsight :-(

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improve formatting of comments in plpgsql.h

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> I propose the attached patch to clean up the comment formatting in
> plpgsql.h.

Looks reasonable in a quick once-over.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Improve formatting of comments in plpgsql.h

2016-08-16 Thread Peter Eisentraut
I propose the attached patch to clean up the comment formatting in
plpgsql.h.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 1924c0d822b36af32556e49cd0a70da9ab5c87b2 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 16 Aug 2016 12:00:00 -0400
Subject: [PATCH] Improve formatting of comments in plpgsql.h

This file had some unusual comment layout.  Most of the comments
introducing structs ended up to the right of the screen and following
the start of the struct.  Some comments for struct members ended up
after the member definition.

Fix that by moving comments consistently before what they are
describing.  Also add missing struct tags where missing so that it is
easier to tell what the struct is.
---
 src/pl/plpgsql/src/plpgsql.h | 417 ++-
 1 file changed, 255 insertions(+), 162 deletions(-)

diff --git a/src/pl/plpgsql/src/plpgsql.h b/src/pl/plpgsql/src/plpgsql.h
index 140bf4b..cd2e4ef 100644
--- a/src/pl/plpgsql/src/plpgsql.h
+++ b/src/pl/plpgsql/src/plpgsql.h
@@ -34,9 +34,8 @@
 #undef _
 #define _(x) dgettext(TEXTDOMAIN, x)
 
-/* --
+/*
  * Compiler's namespace item types
- * --
  */
 enum
 {
@@ -46,9 +45,8 @@ enum
 	PLPGSQL_NSTYPE_REC
 };
 
-/* --
+/*
  * A PLPGSQL_NSTYPE_LABEL stack entry must be one of these types
- * --
  */
 enum PLpgSQL_label_types
 {
@@ -57,9 +55,8 @@ enum PLpgSQL_label_types
 	PLPGSQL_LABEL_OTHER			/* anything else */
 };
 
-/* --
+/*
  * Datum array node types
- * --
  */
 enum
 {
@@ -71,9 +68,8 @@ enum
 	PLPGSQL_DTYPE_EXPR
 };
 
-/* --
+/*
  * Variants distinguished in PLpgSQL_type structs
- * --
  */
 enum
 {
@@ -83,9 +79,8 @@ enum
 	PLPGSQL_TTYPE_PSEUDO		/* other pseudotypes */
 };
 
-/* --
+/*
  * Execution tree node types
- * --
  */
 enum PLpgSQL_stmt_types
 {
@@ -115,10 +110,8 @@ enum PLpgSQL_stmt_types
 	PLPGSQL_STMT_PERFORM
 };
 
-
-/* --
+/*
  * Execution node return codes
- * --
  */
 enum
 {
@@ -128,9 +121,8 @@ enum
 	PLPGSQL_RC_CONTINUE
 };
 
-/* --
+/*
  * GET DIAGNOSTICS information items
- * --
  */
 enum
 {
@@ -149,9 +141,8 @@ enum
 	PLPGSQL_GETDIAG_SCHEMA_NAME
 };
 
-/* 
+/*
  * RAISE statement options
- * 
  */
 enum
 {
@@ -166,9 +157,8 @@ enum
 	PLPGSQL_RAISEOPTION_SCHEMA
 };
 
-/* 
+/*
  * Behavioral modes for plpgsql variable resolution
- * 
  */
 typedef enum
 {
@@ -182,9 +172,11 @@ typedef enum
  * Node and structure definitions
  **/
 
-
-typedef struct
-{/* Postgres data type */
+/*
+ * Postgres data type
+ */
+typedef struct PLpgSQL_type
+{
 	char	   *typname;		/* (simple) name of the type */
 	Oid			typoid;			/* OID of the data type */
 	int			ttype;			/* PLPGSQL_TTYPE_ code */
@@ -197,31 +189,37 @@ typedef struct
 	int32		atttypmod;		/* typmod (taken from someplace else) */
 } PLpgSQL_type;
 
-
 /*
+ * Generic datum array item
+ *
  * PLpgSQL_datum is the common supertype for PLpgSQL_expr, PLpgSQL_var,
  * PLpgSQL_row, PLpgSQL_rec, PLpgSQL_recfield, and PLpgSQL_arrayelem
  */
-typedef struct
-{/* Generic datum array item		*/
+typedef struct PLpgSQL_datum
+{
 	int			dtype;
 	int			dno;
 } PLpgSQL_datum;
 
 /*
+ * Scalar or composite variable
+ *
  * The variants PLpgSQL_var, PLpgSQL_row, and PLpgSQL_rec share these
  * fields
  */
-typedef struct
-{/* Scalar or composite variable */
+typedef struct PLpgSQL_variable
+{
 	int			dtype;
 	int			dno;
 	char	   *refname;
 	int			lineno;
 } PLpgSQL_variable;
 
+/*
+ * SQL Query to plan and execute
+ */
 typedef struct PLpgSQL_expr
-{/* SQL Query to plan and execute	*/
+{
 	int			dtype;
 	int			dno;
 	char	   *query;
@@ -252,9 +250,11 @@ typedef struct PLpgSQL_expr
 	LocalTransactionId expr_simple_lxid;
 } PLpgSQL_expr;
 
-
-typedef struct
-{/* Scalar variable */
+/*
+ * Scalar variable
+ */
+typedef struct PLpgSQL_var
+{
 	int			dtype;
 	int			dno;
 	char	   *refname;
@@ -273,19 +273,20 @@ typedef struct
 	bool		freeval;
 } PLpgSQL_var;
 
-
-typedef struct
-{/* Row variable */
+/*
+ * Row variable
+ */
+typedef struct PLpgSQL_row
+{
 	int			dtype;
 	int			dno;
 	char	   *refname;
 	int			lineno;
 
+	/* Note: TupleDesc is only set up for named rowtypes, else it is NULL. */
 	TupleDesc	rowtupdesc;
 
 	/*
-	 * Note: TupleDesc is only set up for named rowtypes, else it is NULL.
-	 *
 	 * Note: if the underlying rowtype contains a dropped column, the
 	 * corresponding fieldnames[] entry will be NULL, and there is no
 	 * corresponding var (varnos[] will be -1).
@@ -295,9 +296,11 @@ typedef struct
 	int		   *varnos;
 } PLpgSQL_row;
 
-
-typedef struct
-{/* Record variable (non-fixed structure) */
+/*
+ * Record variable (non-fixed structure)
+ */
+typedef struct PLpgSQL_rec
+{
 	int			dtype;
 	

Re: [HACKERS] anyelement -> anyrange

2016-08-16 Thread Tom Lane
Jim Nasby  writes:
> I can't think of any reason you'd want two different range types on a 
> single element type.

We would not have built it that way if there were not clear use-cases.
An easy example is you might want both a continuous timestamp range
and one that is quantized to hour boundaries.  Primarily what the
range type brings in besides the element type is a canonicalization
function; and we can't guess which one you want.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Peter Geoghegan
> I think that the best thing about C++ is the ability to encapsulate and
> simplify some aspects of resource management quite well, which necessitates
> reimplementing PG_TRY/CATCH. The worst thing about C++ is that ABI
> compatibility is far messier. This makes a C++ port seem less compelling
> to me than the idea first appears.

From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Christopher
> Further, it's not as if C++ is particularly newer than C.  C is about 45
> years old; C++, at 33, hardly seems like a "spry young whippersnapper"
> whose inclusion ought to lead to vast excitement.
>
> The would-be "spry young things" that head to my mind are Rust and Go.  I'm
> not sure it's terribly plausible to have parts of Postgres written in both
> C and (Rust|Go); they're different enough that I'm not sure what
> functionality would mix sensibly.  But I think that would be more
> interesting, all the same.  Perhaps it would work out well to be able to
> create background workers in Rust, or to implement a stored procedure
> language in Go.


First, I'm neither for nor against rewriting PostgreSQL in C++.  But I wonder 
whether it would really pay for the cost.  I'm worried about these, for example:

* Wouldn't it increase the coding and testing burdon?  Coding and testing in C, 
and coding and testing in C++.  PostgreSQL seem to have many features to 
develop, and I'm not sure C++ will help to speed up the development of them.

* Would it really attract more developers of PostgreSQL itself, not extensions? 
 FYI, Tiobe Index says C is nearly twice as popular as C++.

http://www.tiobe.com/tiobe-index/

* Wouldn't it distance some developers if they don't want to learn C++?  As 
Christopher said, C++ is old and there are many newer languages that attract 
developers -- C#, Swift, Go, Java, JavaScript, etc.  I wonder whether recent 
developers want to spend time in learning complex C++ now.  I learned C++ 
because it is still the most popular language in game development, but maybe I 
would not want to learn C++ anymore if I didn't know C++.

Regards
Takayuki Tsunakawa



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Andres Freund
On 2016-08-17 10:45:25 +1000, dandl wrote:
> From my particular perspective it would be enough if all the internal
> headers (that one needs to use in writing server-side extensions) were
> completely usable in C++.

That should already be the case.  There's even a dirty hack^WWscript
that checks that that remains the case (src/tools/pginclude/cpluspluscheck).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] anyelement -> anyrange

2016-08-16 Thread Jim Nasby

On 8/16/16 6:56 PM, David G. Johnston wrote:

On Tue, Aug 16, 2016 at 7:47 PM, Jim Nasby >wrote:

On 8/15/16 10:12 PM, Tom Lane wrote:

Jim Nasby  writes:

Any reason why we can create a function that accepts
anyelement and
returns anyarray, but can't do the same with anyrange?


Because there can be more than one range type over the same element
type, so we couldn't deduce which one should be used for anyrange.

The other direction (inferring anyelement from anyrange) does work.


Is there an actual use case for that? I'm not seeing what it would be...


​https://www.postgresql.org/docs/9.5/static/functions-range.html

lower() and upper() both use it.


Nothing built in uses what Tom mentioned: having multiple *range types* 
for a single base type. lower() and upper() use *anyrange*, which is a 
completely different animal.


I can't think of any reason you'd want two different range types on a 
single element type. If we made that a constraint, we could resolve an 
anyrange from an anyelement. That would be very useful in some cases 
(one example being the range_from_array() functions I just created).


BTW, another option would be to allow marking a specific range type as 
being "primary", so if you did need to define some other variation on 
int4range you could do so, but you'd have to decide whether it or 
int4range was the primary one that anyelement->anyrange would use.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread dandl
> Well, getting so that we can at least compile in both systems would
> certainly increase the chances of somebody being willing to work on
> such a design.  

>From my particular perspective it would be enough if all the internal headers 
>(that one needs to use in writing server-side extensions) were completely 
>usable in C++. It's not so much hacking on the internals, it's more about 
>being to build an extension DLL in C++ that makes extensive use of calls to 
>internals without having to write shim layers. That looks like a lot less work 
>than a full C++ port.

And if nobody ever does, then at least people who want
> to fork and do research projects based on PostgreSQL will have
> slightly less work to do when they want to hack it up.  PostgreSQL
> seems to be a very popular starting point for research work, but a
> paper I read recently complained about the antiquity of our code base.
> I prefer to call that backward-compatibility, but at some point people
> stop thinking of you as backward-compatible and instead think of you
> as simply backward.

Certainly the positive arguments for sticking with pure C are diminishing over 
time, perhaps faster in perception than in fact.

> > A lot of the other things people have muttered about, such as
> heavier
> > use of inline functions instead of macros, don't particularly need
> C++
> > at all.

My point is only that C++ can be used to provide better type safety, with 
little of any effect on performance.

Regards
David M Bennett FACS

Andl - A New Database Language - andl.org









-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Joy Arulraj
On Tue, Aug 16, 2016 at 4:22 PM, Jim Nasby  wrote:

> On 8/16/16 12:53 PM, Joy Arulraj wrote:
>
>> > The whole thing would make a lot more sense given a credible design
>> > for error handling that keeps both languages happy.
>>
>> Well, getting so that we can at least compile in both systems would
>> certainly increase the chances of somebody being willing to work on
>> such a design.  And if nobody ever does, then at least people who want
>> to fork and do research projects based on PostgreSQL will have
>> slightly less work to do when they want to hack it up.  PostgreSQL
>> seems to be a very popular starting point for research work, but a
>> paper I read recently complained about the antiquity of our code base.
>> I prefer to call that backward-compatibility, but at some point people
>> stop thinking of you as backward-compatible and instead think of you
>> as simply backward.
>>
>> I agree, this was the main reason why we wanted to add support for C++.
>>
>
> Joy, do you have an idea what a *minimally invasive* patch for C++ support
> would look like? That's certainly the first step here.
>
>
Jim -- I believe that the patch will be roughly 6K lines long. The majority
of the changes correspond to handling language keyword conflicts.

https://github.com/jarulraj/postgresql-cpp/compare/182656bf32b99c96e5cd9dc59ece4c20149787fb...7ef6f472b53a83a4cedd0222b41345c0f74fae1e

I must mention that some of the changes I have made preclude the
possibility of supporting compilation with both C and C++ compilers.
However, I am certain that this limitation can be circumvented with some
clever hacking.


> --
> Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
> Experts in Analytics, Data Architecture and PostgreSQL
> Data in Trouble? Get it in Treble! http://BlueTreble.com
> 855-TREBLE2 (855-873-2532)   mobile: 512-569-9461
>


Re: [HACKERS] Why we lost Uber as a user

2016-08-16 Thread Jim Nasby
Something I didn't see mentioned that I think is a critical point: last 
I looked, HOT standby (and presumably SR) replays full page writes. That 
means that *any* kind of corruption on the master is *guaranteed* to 
replicate to the slave the next time that block is touched. That's 
completely the opposite of trigger-based replication.


On 8/3/16 3:51 PM, Kevin Grittner wrote:

Personally, I can't imagine running logical replication of
supposedly matching sets of data without something equivalent.


I think it depends heavily on the replication solution. I ran londiste 
for 6+ years with no major issues, but of course there was at least one 
other major company running that. I also took the time to completely 
read all the source code; something that's a reasonable prospect with a 
few thousand lines of python. For streaming rep it's difficult just to 
draw the line at where the code is.


Ultimately, people really need to understand the trade-offs to the 
different solutions so they can make an informed decision on which ones 
(yes, plural) they want to use. The same can be said about pg_upgrade vs 
something else, and the different ways of doing backups.


Something I think a lot of folks fail to understand is the value of 
having a system that has simple technology in the mix. Keeping something 
like londiste running has a non-zero cost, but the day you discover 
corruption has replicated through your entire infrastructure you'll 
probably be REALLY happy you have it. Similarly, I always encourage 
people to run a weekly or monthly pg_dump if it's at all feasible... 
just to be safe.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Keeping CURRENT_DATE and similar constructs in original format

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> On 5/12/16 6:14 PM, Tom Lane wrote:
>> So what I've wanted to do for some time is invent a new expression node
>> type that represents any one of these functions and can be reverse-listed
>> in the same format that the input had.  The attached proposed patch does
>> that.

> I was experimenting with this as well when I found your patch, and I
> think this is the right solution.  Your patch works fine for me.

Thanks for reviewing this patch.  I've pushed it now.

>> (I'm not particularly in love with the node type name
>> ValueFunction; anybody got a better idea?)

> I think this is fine.  The only other idea I have would be
> SQLValueFunction, to emphasize that this is about SQL-mandated special
> cases.

I went with SQLValueFunction.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

2016-08-16 Thread Jim Nasby

On 8/16/16 11:59 AM, Robert Haas wrote:
...

That doesn't really solve the problem, because OTHER backends won't be
able to see them.  So, if I create a fast temporary table in one
session that depends on a permanent object, some other session can
drop the permanent object.  If there were REAL catalog entries, that
wouldn't work, because the other session would see the dependency.


Some discussion about TEMP functions is happening on -general right now, 
and there's other things where temp objects are good to have, so it'd be 
nice to have a more generic fix for this stuff. Is the idea of 
"partitioning" the catalogs to store temp objects separate from 
permanent fatally flawed?

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] anyelement -> anyrange

2016-08-16 Thread David G. Johnston
On Tue, Aug 16, 2016 at 7:47 PM, Jim Nasby  wrote:

> On 8/15/16 10:12 PM, Tom Lane wrote:
>
>> Jim Nasby  writes:
>>
>>> Any reason why we can create a function that accepts anyelement and
>>> returns anyarray, but can't do the same with anyrange?
>>>
>>
>> Because there can be more than one range type over the same element
>> type, so we couldn't deduce which one should be used for anyrange.
>>
>> The other direction (inferring anyelement from anyrange) does work.
>>
>
> Is there an actual use case for that? I'm not seeing what it would be...


​https://www.postgresql.org/docs/9.5/static/functions-range.html

lower() and upper() both use it.

David J.
​


Re: [HACKERS] anyelement -> anyrange

2016-08-16 Thread Jim Nasby

On 8/15/16 10:12 PM, Tom Lane wrote:

Jim Nasby  writes:

Any reason why we can create a function that accepts anyelement and
returns anyarray, but can't do the same with anyrange?


Because there can be more than one range type over the same element
type, so we couldn't deduce which one should be used for anyrange.

The other direction (inferring anyelement from anyrange) does work.


Is there an actual use case for that? I'm not seeing what it would be...
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Thomas Munro
On Wed, Aug 17, 2016 at 4:50 AM, Robert Haas  wrote:
> On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro
>  wrote:
>> On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro
>>  wrote:
>>> On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane  wrote:
 amul sul  writes:
> When I am calling dsm_create on Linux using the POSIX DSM implementation 
> can succeed, but result in SIGBUS when later try to access the memory.  
> This happens because of my system does not have enough shm space &  
> current allocation in dsm_impl_posix does not allocate disk blocks[1]. I 
> wonder can we use fallocate system call (i.e. Zero-fill the file) to 
> ensure that all the file space has really been allocated, so that we 
> don't later seg fault when accessing the memory mapping. But here we will 
> endup by loop calling ‘write’ squillions of times.

 Wouldn't that just result in a segfault during dsm_create?

 I think probably what you are describing here is kernel misbehavior
 akin to memory overcommit.  Maybe it *is* memory overcommit and can
 be turned off the same way.  If not, you have material for a kernel
 bug fix/enhancement request.
>>>
>>> [...] But it
>>> looks like if we used fallocate or posix_fallocate in the
>>> dsm_impl_posix case we'd get a nice ESPC error, instead of
>>> success-but-later-SIGBUS-on-access.
>>
>> Here's a simple test extension that creates jumbo dsm segments, and
>> then accesses all pages.  If you ask it to write cheques that your
>> Linux 3.10 machine can't cash on unpatched master, it does this:
>>
>> postgres=# create extension foo;
>> CREATE EXTENSION
>> postgres=# select test_dsm(16::bigint * 1024 * 1024 * 1024);
>> server closed the connection unexpectedly
>> ...
>> LOG:  server process (PID 15105) was terminated by signal 7: Bus error
>>
>> If I apply the attached experimental patch I get:
>>
>> postgres=# select test_dsm(16::bigint * 1024 * 1024 * 1024);
>> ERROR:  could not resize shared memory segment
>> "/PostgreSQL.1938734921" to 17179869184 bytes: No space left on device
>>
>> It should probably be refactored a bit to separate the error messages
>> for ftruncate and posix_fallocate, and we could possibly use the same
>> approach for dsm_impl_mmap instead of that write() loop, but this at
>> least demonstrates the problem Amul reported.  Thoughts?
>
> Seems like it could be a reasonable change.  I wonder what happens on
> other platforms.

FreeBSD 10.3 returns successfully from shm_open and then displays
classic overcommit symptoms when you try to access the memory:

LOG:  server process (PID 22714) was terminated by signal 9: Killed
DETAIL:  Failed process was running: select test_dsm(16::bigint * 1024
* 1024 * 1024);
>From OS logs: pid 22714 (postgres), uid 1001, was killed: out of swap space

Unfortunately it doesn't like posix_fallocate on a fd returned by
shm_open, and barfs with ENODEV.  So this would need to be a
Linux-only trick.

I still think it's worth thinking about something along these lines on
Linux only, where holey Swiss tmpfs files can bite you.  Otherwise
disabling overcommit on your OS isn't enough to prevent something
which is really a kind of deferred overcommit with a surprising
failure mode (SIGBUS rather than OOM SIGKILL).

-- 
Thomas Munro
http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why we lost Uber as a user

2016-08-16 Thread Alfred Perlstein



On 8/3/16 3:29 AM, Greg Stark wrote:


Honestly the take-away I see in the Uber story is that they apparently
had nobody on staff that was on -hackers or apparently even -general
and tried to go it alone rather than involve experts from outside
their company. As a result they misdiagnosed their problems based on
prejudices seeing what they expected to see rather than what the real
problem was.


Agree strongly, but there are still lessons to be learned on the psql side.

-Alfred


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Why we lost Uber as a user

2016-08-16 Thread Alfred Perlstein



On 8/2/16 10:02 PM, Mark Kirkwood wrote:

On 03/08/16 02:27, Robert Haas wrote:


Personally, I think that incremental surgery on our current heap
format to try to fix this is not going to get very far.  If you look
at the history of this, 8.3 was a huge release for timely cleanup of
dead tuple.  There was also significant progress in 8.4 as a result of
5da9da71c44f27ba48fdad08ef263bf70e43e689.   As far as I can recall, we
then made no progress at all in 9.0 - 9.4.  We made a very small
improvement in 9.5 with 94028691609f8e148bd4ce72c46163f018832a5b, but
that's pretty niche.  In 9.6, we have "snapshot too old", which I'd
argue is potentially a large improvement, but it was big and invasive
and will no doubt pose code maintenance hazards in the years to come;
also, many people won't be able to use it or won't realize that they
should use it.  I think it is likely that further incremental
improvements here will be quite hard to find, and the amount of effort
will be large relative to the amount of benefit.  I think we need a
new storage format where the bloat is cleanly separated from the data
rather than intermingled with it; every other major RDMS works that
way.  Perhaps this is a case of "the grass is greener on the other
side of the fence", but I don't think so.


Yeah, I think this is a good summary of the state of play.

The only other new db development to use a non-overwriting design like 
ours that I know of was Jim Starky's Falcon engine for (ironically) 
Mysql 6.0. Not sure if anyone is still progressing that at all now.


I do wonder if Uber could have successfully tamed dead tuple bloat 
with aggressive per-table autovacuum settings (and if in fact they 
tried), but as I think Robert said earlier, it is pretty easy to come 
up with a highly update (or insert + delete) workload that makes for a 
pretty ugly bloat component even with real aggressive autovacuuming.
I also wonder if they had used "star schema" which to my understanding 
would mean multiple tables to replace the single-table that has multiple 
indecies to work around the write amplification problem in postgresql.




Cheers

Mark







--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Christopher Browne
On 16 August 2016 at 17:08, Piotr Stefaniak  wrote:
> On 2016-08-16 18:33, Robert Haas wrote:
>> It wouldn't be that much work to maintain, either: we'd
>> just set up some buildfarm members that compiled using C++ and when
>> they turned red, we'd go fix it.
>
> I think that there exist subtle differences between C and C++ that
> without compile-time diagnostic could potentially lead to different
> run-time behavior.

It seems to me that if we were really keen on attaching in another
"totally compiled" language, that C++ wouldn't seem like the best
choice.

As you say, it's subtly different, which seems a bit dangerous to me.

Further, it's not as if C++ is particularly newer than C.  C is about 45
years old; C++, at 33, hardly seems like a "spry young whippersnapper"
whose inclusion ought to lead to vast excitement.

The would-be "spry young things" that head to my mind are Rust and
Go.  I'm not sure it's terribly plausible to have parts of Postgres
written in both C and (Rust|Go); they're different enough that
I'm not sure what functionality would mix sensibly.  But I think
that would be more interesting, all the same.  Perhaps it would
work out well to be able to create background workers in Rust,
or to implement a stored procedure language in Go.
-- 
When confronted by a difficult problem, solve it by reducing it to the
question, "How would the Lone Ranger handle this?"


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)

2016-08-16 Thread Andres Freund
Hi,

On 2016-08-14 10:12:45 -0500, Ryan Murphy wrote:
> Hello Postgres Team!
> but this command did not work for me because my data directory
> contained a space.  The src/bin/initdb/initdb.c source code
> did already have a QUOTE_PATH constant, but it was the empty
> string for non-windows cases.
> 
> Therefore, added quotes via existing QUOTE_PATH constant:
> 
> pg_ctl -D '/some/path/to/data' -l logfile start

> From 275d045bc41b136df8c413eedba12fbd21609de1 Mon Sep 17 00:00:00 2001
> From: ryanfmurphy 
> Date: Sun, 14 Aug 2016 08:56:50 -0500
> Subject: [PATCH] initdb: "'" for QUOTE_PATH (non-windows)
> 
> fix issue when running initdb
> 
> at the end of a successful initdb it says:
> 
> Success. You can now start the database server using:
> pg_ctl -D /some/path/to/data -l logfile start
> 
> but this command will not work if the data directory contains a space
> therefore, added quotes via existing QUOTE_PATH constant:
> 
> pg_ctl -D '/some/path/to/data' -l logfile start
> ---
>  src/bin/initdb/initdb.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
> index 73cb7ee..6dc1e23 100644
> --- a/src/bin/initdb/initdb.c
> +++ b/src/bin/initdb/initdb.c
> @@ -332,7 +332,7 @@ do { \
>  } while (0)
>  
>  #ifndef WIN32
> -#define QUOTE_PATH   ""
> +#define QUOTE_PATH   "'"
>  #define DIR_SEP "/"
>  #else
>  #define QUOTE_PATH   "\""

ISTM that the correct fix would be to actually introduce something like
quote_path_for_shell() which either adds proper quotes, or fails if
that'd be hard (e.g. if the path contains quotes, and we're on
windows). 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: initdb: "'" for QUOTE_PATH (non-windows)

2016-08-16 Thread Ryan Murphy
I've submitted my patch to Commitfest 2016-09.

https://commitfest.postgresql.org/10/718/

My username on postgresql.org is murftown

On Tue, Aug 16, 2016 at 1:02 AM, Ryan Murphy  wrote:

> Ok, I'll do that!
> Thanks Michael!
> Ryan
>
>
> On Monday, August 15, 2016, Michael Paquier 
> wrote:
>
>> On Mon, Aug 15, 2016 at 12:12 AM, Ryan Murphy 
>> wrote:
>> > This is to fix an issue that came up for me when running initdb.
>> >
>> > At the end of a successful initdb it says:
>> >
>> > Success. You can now start the database server using:
>> > pg_ctl -D /some/path/to/data -l logfile start
>> >
>> > but this command did not work for me because my data directory
>> > contained a space.  The src/bin/initdb/initdb.c source code
>> > did already have a QUOTE_PATH constant, but it was the empty
>> > string for non-windows cases.
>> >
>> > Therefore, added quotes via existing QUOTE_PATH constant:
>> >
>> > pg_ctl -D '/some/path/to/data' -l logfile start
>>
>> You may want to register this patch to the next commit fest:
>> https://commitfest.postgresql.org/10/
>> --
>> Michael
>>
>


Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
On Sat, Aug 13, 2016 at 4:18 PM, Thomas Munro
 wrote:
> First, you initialise a Barrier object somewhere in shared memory,
> most likely in the DSM segment used by parallel query, by calling
> BarrierInit(, nworkers).  Then workers can call
> BarrierWait() when they want to block until all workers arrive
> at the barrier.  When the final worker arrives, BarrierWait returns in
> all workers, releasing them to continue their work.  One arbitrary
> worker receives a different return value as a way of "electing" it to
> perform serial phases of computation.  For parallel phases of
> computation, the return value can be ignored.  For example, there may
> be preparation, merging, or post-processing phases which must be done
> by just one worker, interspersed with phases where all workers do
> something.

I think that this mechanism could be quite useful for sorting with
partitioning, which doesn't exist yet. What does exist is unlikely to
benefit from this over and above what Robert's "condition variables"
offer, because as it happens there is no need to "elect" a single
worker at all. The ordering dependencies happen to be quite naturally
across one leader process and one or more worker processes.

I do see value in this, though.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
On Mon, Aug 15, 2016 at 6:55 AM, Robert Haas  wrote:
> A sort of dumb way of handling all this is to assume that once a
> worker joins the hash join, it won't go off and do anything else until
> the hash join is done.  Under that assumption, you just need some sort
> of BarrierAttach() operation; workers that have never attached the
> barrier aren't participating in the hash join at all and so they are
> irrelevant - and now you know how many workers you need to await,
> because you can keep a count how many have attached.  Perhaps you
> simply turn away any workers that arrive after batch 0 is complete.

Is that really so bad? In general, I don't tend to think of workers as
the cost to worry about. Rather, we should be concerned about the
active use of CPU cores as our major cost.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Barriers

2016-08-16 Thread Peter Geoghegan
On Mon, Aug 15, 2016 at 6:55 AM, Robert Haas  wrote:
> The simple version of this is that when a worker gets done with its
> own probe phase for batch X, it can immediately start building the
> hash table for phase X+1, stopping if it fills up the unused portion
> of work_mem before the old hash table goes away.  Of course, there are
> some tricky issues with reading tapes that were originally created by
> other backends, but if I understand correctly, Peter Geoghegan has
> already done some work on that problem, and it seems like something we
> can eventually solve, even if not in the first version.

The tape vs. BufFile vs. fd.c file handle distinctions get
*confusing*. Thomas and I have hashed this out (pun intended), but I
should summarize.

Currently, and without bringing parallelism into it, Hash joins have
multiple BufFiles (two per batch -- innerBatchFile and
outerBatchFile), which are accessed as needed. External sorts have
only one BufFile, with multiple "logical tapes" within a single
"tapeset" effectively owning space within the BufFile -- that space
doesn't have to be contiguous, and can be reused *eagerly* within and
across logical tapes in tuplesort.c's tapeset. logtape.c is a kind of
block-orientated rudimentary filesystem built on top of one BufFile.
The only real advantage of having the logtape.c abstraction is that
moving stuff around (to sort it, when multiple passes are required)
can be accomplished with minimal wasted disk space (it's eagerly
reclaimed). This is less important today than it would have been in
the past.

Clearly, it doesn't make much sense to talk about logtape.c and
anything that isn't sorting, because it is very clearly written with
that purpose alone in mind. To avoid confusion, please only talk about
tapes when talking about sorting.

So:

* tuplesort.c always talks to logtape.c, which talks to buffile.c
(which talks to fd.c).

* Hash joins use buffile.c directly, though (and have multiple
buffiles, as already noted).

Now, I might still have something that Thomas can reuse, because
buffile.c was made to support "unification" of worker BufFiles in
general. Thomas would be using that interface, if any. I haven't
studied parallel hash join at all, but presumably the difference would
be that *multiple* BufFiles would be unified, such that a
concatenated/unified BufFile would be addressable within each worker,
one per batch. All of this assumes that there is a natural way of
unifying the various batches involved across all workers, of course.

This aspect would present some complexity for Thomas, I think
(comments from hashjoin.h):

 * It is possible to increase nbatch on the fly if the in-memory hash table
 * gets too big.  The hash-value-to-batch computation is arranged so that this
 * can only cause a tuple to go into a later batch than previously thought,
 * never into an earlier batch.  When we increase nbatch, we rescan the hash
 * table and dump out any tuples that are now of a later batch to the correct
 * inner batch file.  Subsequently, while reading either inner or outer batch
 * files, we might find tuples that no longer belong to the current batch;
 * if so, we just dump them out to the correct batch file.

I'd be concerned about managing which backend was entitled to move
tuples across batches, and so on. One thing that I haven't had to
contend with is which backend "owns" which BufFile (or underlying fd.c
file handles). There is no ambiguity about that for me. Owners delete
the temp files on Xact end, and are the only ones entitled to write to
files, and only before unification. These latter restrictions might be
lifted if there was a good reason to do so.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-08-16 Thread Rahila Syed
>I think I like the option of having psql issue an error.  On the
>server side, the transaction would still be open, but the user would
>receive a psql error message and the autocommit setting would not be
>changed.  So the user could type COMMIT or ROLLBACK manually and then
>retry changing the value of the setting.

Throwing psql error comes out to be most accepted outcome on this thread. I
agree it is safer than guessing user intention.

Although according to the default behaviour of psql, error will abort the
current transaction and roll back all the previous commands. This can be
user unfriendly making user rerun all the commands just because of
autocommit switch. So probably behaviour of 'ON_ERROR_ROLLBACK on' needs to
be implemented along with the error display. This will rollback just the
autocommit switch command.

Also, psql error instead of a simple commit will lead to script
terminations. Hence issuing a COMMIT seems more viable here. However,
script termination can be avoided by default behaviour of ON_ERROR_STOP
which will execute subsequent commands successfully.(However subsequent
commands won't be executed in autocommit mode which I think should be OK as
it will be notified via ERROR).

So summarizing my view of the discussion on this thread, issuing a psql
error seems to be the best option. I will post a patch regarding this if
there is no objection.


Thank you,

Rahila Syed


Re: [HACKERS] Detecting skipped data from logical slots (data silently skipped)

2016-08-16 Thread Andres Freund
On 2016-08-08 10:59:20 +0800, Craig Ringer wrote:
> Right. Though if we flush lazily I'm surprised the effect is that big,
> you're the one who did the work and knows the significance of it.

It will be. Either you're increasing bloat (by not increasing the
slot's wal position / catalog xmin), or you're adding frequent syncs on
an idle connection.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Andres Freund
On 2016-08-16 16:59:56 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On 2016-08-16 13:40:06 -0700, Peter Geoghegan wrote:
> >> Actually, come to think of it, I guess this is wrong. The problem with
> >> what I say here is that longjmp() and setjmp() are incompatible with
> >> the stack unwinding used by C++ destructors in general (exceptions are
> >> another issue). I think that the practical implication of that is that
> >> we can never use any C++ feature that hides the complexity of resource
> >> management, unless and until elog() is reimplemented to not use
> >> longjmp() and setjmp().
> 
> > FWIW, IIRC that's not true for gcc/glibc, because they IIRC use common
> > codepaths. But obviously that's not all-encompassing enough to rely on that.
> 
> I wonder whether it'd be possible to implement the PG_TRY/CATCH macros
> to use C++ exceptions when building in C++.

Yea, I suggested that somewhere nearby. I think that'd be fairly easy -
to me the hard part is making it possible to compile postgres with C++, not
changing the exception handling itself.


> This would probably mean that C and C++ builds would be incompatible
> as far as loadable extensions are concerned, because it'd amount to an
> ABI difference.  But maybe that's OK.  We could certainly have the
> PG_MODULE_MAGIC macro guard against the case.

Right.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
On Tue, Aug 16, 2016 at 1:59 PM, Tom Lane  wrote:
>> FWIW, IIRC that's not true for gcc/glibc, because they IIRC use common
>> codepaths. But obviously that's not all-encompassing enough to rely on that.
>
> I wonder whether it'd be possible to implement the PG_TRY/CATCH macros
> to use C++ exceptions when building in C++.  This would probably mean
> that C and C++ builds would be incompatible as far as loadable extensions
> are concerned, because it'd amount to an ABI difference.  But maybe
> that's OK.  We could certainly have the PG_MODULE_MAGIC macro guard
> against the case.

Maybe.

I think that the best thing about C++ is the ability to encapsulate
and simplify some aspects of resource management quite well, which
necessitates reimplementing PG_TRY/CATCH. The worst thing about C++ is
that ABI compatibility is far messier. This makes a C++ port seem less
compelling to me than the idea first appears.

Note, for example, that ICU is implemented in C++, but still has C
stub functions, not necessarily for the exclusive benefit of C client
code.

-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Piotr Stefaniak
On 2016-08-16 18:33, Robert Haas wrote:
> It wouldn't be that much work to maintain, either: we'd
> just set up some buildfarm members that compiled using C++ and when
> they turned red, we'd go fix it.

I think that there exist subtle differences between C and C++ that 
without compile-time diagnostic could potentially lead to different 
run-time behavior. As an artificial example:

$ cat ./test.c
#include 

int main(void) {
FILE *f = fopen("test.bin", "w");
if (f == NULL)
return 1;
fwrite("1", sizeof '1', 1, f);
fclose(f);
return 0;
}
$ clang ./test.c -o test
$ ./test
$ hexdump test.bin
000 0031 
004
$ clang++ ./test.c -o test
clang-3.9: warning: treating 'c' input as 'c++' when in C++ mode, this 
behavior is deprecated
$ ./test
$ hexdump test.bin
000 0031
001

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] LWLocks in DSM memory

2016-08-16 Thread Andres Freund
On 2016-08-15 18:15:23 -0400, Robert Haas wrote:
> On Thu, Aug 11, 2016 at 2:19 PM, Robert Haas  wrote:
> > Therefore, I plan to commit this patch, removing the #include
> >  unless someone convinces me we need it, shortly after
> > development for v10 opens, unless there are objections before then.
> 
> Hearing no objections, done.

I'd have objected, if I hadn't been on vacation.  While I intuitively
*do* think that the increased wait-list overhead won't be relevant, I
also know that my intuition has frequently been wrong around the lwlock
code.  This needs some benchmarks on a 4+ socket machine,
first. Something exercising the slow path obviously. E.g. a pgbench with
a small number of writers, and a large number of writers.

Regards,

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Tom Lane
Andres Freund  writes:
> On 2016-08-16 13:40:06 -0700, Peter Geoghegan wrote:
>> Actually, come to think of it, I guess this is wrong. The problem with
>> what I say here is that longjmp() and setjmp() are incompatible with
>> the stack unwinding used by C++ destructors in general (exceptions are
>> another issue). I think that the practical implication of that is that
>> we can never use any C++ feature that hides the complexity of resource
>> management, unless and until elog() is reimplemented to not use
>> longjmp() and setjmp().

> FWIW, IIRC that's not true for gcc/glibc, because they IIRC use common
> codepaths. But obviously that's not all-encompassing enough to rely on that.

I wonder whether it'd be possible to implement the PG_TRY/CATCH macros
to use C++ exceptions when building in C++.  This would probably mean
that C and C++ builds would be incompatible as far as loadable extensions
are concerned, because it'd amount to an ABI difference.  But maybe
that's OK.  We could certainly have the PG_MODULE_MAGIC macro guard
against the case.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Wait events monitoring future development

2016-08-16 Thread Andres Freund
Hi,

On 2016-08-07 14:03:17 +0200, Ilya Kosmodemiansky wrote:
> Wait event monitoring looks ones again stuck on the way through community
> approval in spite of huge progress done last year in that direction.

I see little evidence of that. If you consider "please do some
reasonable benchmarks" as being stuck...


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Andres Freund
On 2016-08-16 13:40:06 -0700, Peter Geoghegan wrote:
> On Tue, Aug 16, 2016 at 1:29 PM, Peter Geoghegan  wrote:
> > IMV, it would be useful to use C++ classes (and even template classes)
> > for a small number of data structures, while still largely adhering to
> > earlier practices (this is what GCC did). Specifically, a few modules
> > such as StringInfo, could be made to follow the RAII/scope bound
> > resource management usefully, which doesn't seem incompatible with
> > memory contexts. However, this doesn't seem terribly exciting to me.
> 
> Actually, come to think of it, I guess this is wrong. The problem with
> what I say here is that longjmp() and setjmp() are incompatible with
> the stack unwinding used by C++ destructors in general (exceptions are
> another issue). I think that the practical implication of that is that
> we can never use any C++ feature that hides the complexity of resource
> management, unless and until elog() is reimplemented to not use
> longjmp() and setjmp().

FWIW, IIRC that's not true for gcc/glibc, because they IIRC use common
codepaths. But obviously that's not all-encompassing enough to rely on that.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Jim Nasby

On 8/16/16 3:29 PM, Andres Freund wrote:

Well, having typed pg_list.h style lists, ilist.h linked lists,
hash-tables, and proper typechecks for pg_nodes.h instead of the NodeTag
stuff, would surely make life easier.


I certainly wish parts of the system brought code and "data" together in 
a better way. Nodes are an example; all the Walker stuff in the 
planner/executor is another. (I'm not saying C++ would make that better, 
just saying those are parts of the code I find it much harder to grok.)



But given the small subset of C++ available on all our supported
platforms... I think we'd first need to make the decision to cut support
for some platforms, before using C++.  Which imo is a distinct task from
*allowing* to compile with a C++ compiler.


Exactly. If we at least maintain support for compiling that means people 
can experiment with other enhancements in a way that's much more 
compatible with normal community contribution practices, which makes it 
far more likely for that stuff to be accepted.


As for the backwards compatibility... the stance I've seen the community 
take is cost vs benefit. Right now the benefits are completely 
hypothetical, because no one could realistically propose a patch to use 
C++ (or maybe even Rust) features.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Andres Freund
On 2016-07-31 17:57:12 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > FWIW, I've observed the same with (a bit) more complicated queries. A
> > part of this is that the extended protocol simply does
> > more. PQsendQueryGuts() sends Parse/Bind/Describe/Execute/Sync - that's
> > simply more work and data over the wire than a single Q message.
> 
> Yeah.  The extended query protocol was designed to offer a lot of
> functionality that people had asked for, like plan re-use and
> introspection of the data types assigned to query parameters, but that
> doesn't come at zero cost.  I think the tie-in to the plan cache is a
> significant part of the added overhead, and so is the fact that we have to
> iterate the per-message loop in PostgresMain five times not once, with
> overheads like updating the process title incurred several times in that.

One approach to solving this, without changing the protocol, would be to
"fuse" parse/bind/execute/sync together, by peeking ahead in the
protocol stream. When that combination is seen looking ahead (without
blocking), optimize it by handing it to something closer to
exec_simple_query() which also handles parameters.  Even if we don't
recognize that pattern everytime, e.g. because later messages are in
different, not yet arrived, tcp packets, that'd speed up the common
case.  As our client socket is nearly always is in non-blocking mode
these days, that shouldn't be too expensive.


Not that that analogy is fitting perfectl;y, but the above approach
seems to work quite well on the CPU level ("macro op fusion"), to
increase execution throughput...


Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
On Tue, Aug 16, 2016 at 1:29 PM, Peter Geoghegan  wrote:
> IMV, it would be useful to use C++ classes (and even template classes)
> for a small number of data structures, while still largely adhering to
> earlier practices (this is what GCC did). Specifically, a few modules
> such as StringInfo, could be made to follow the RAII/scope bound
> resource management usefully, which doesn't seem incompatible with
> memory contexts. However, this doesn't seem terribly exciting to me.

Actually, come to think of it, I guess this is wrong. The problem with
what I say here is that longjmp() and setjmp() are incompatible with
the stack unwinding used by C++ destructors in general (exceptions are
another issue). I think that the practical implication of that is that
we can never use any C++ feature that hides the complexity of resource
management, unless and until elog() is reimplemented to not use
longjmp() and setjmp().


-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Andres Freund
On 2016-08-16 12:59:24 -0400, Tom Lane wrote:
> I'm pretty dubious that it really helps people
> to develop extensions in C++.  Almost invariably, if you ask *why* they
> want to do that, you'll get an answer involving C++ libraries that are
> not going to play very nice with our error handling or memory management
> conventions.  I do not see how we could C++-ify the error handling without
> making a complete break with C compilers ... which is a step I don't
> really want to take.

I don't think it's *that* hard to make our and C++ error handling play
well together, at least when compiled with a C++ compiler.


> The whole thing would make a lot more sense given a credible design
> for error handling that keeps both languages happy.

Using C++ exceptions instead of sigsetjmp()/siglongjmp, when compiled
with a C++ compiler, seems not that hard, and could easily be hidden
behind PG_TRY/CATCH/RE_THROW/END_TRY.  We'd have to hide the "bottom of
the exception stack" cases like PostgresMain() behind another macro, but
to me that doesn't sound like a bad idea anyway.


I do think it makes sense to work towards being able to compile postgres
with both C++ and C compilers. Most of the work towards that is pretty
boring...


> A lot of the other things people have muttered about, such as heavier
> use of inline functions instead of macros, don't particularly need C++
> at all.

I think the more exciting bit is type safe lists, hash tables, ...,
without having to use huge amounts of macro magic. Using actual
inheritance for Node* stuff would also surely make some code better
checked (checked casts) and easier to write (less pointless
downcasting/upcasting from Node).


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Jim Nasby
I'm sure this wasn't your intent, but the tone of your response is part 
of why people don't get involved with Postgres development...


On 8/16/16 10:39 AM, Aleksander Alekseev wrote:

Well, well, well. Another C vs C++ holly war, really?


Please note that you're the only person in the entire thread that's said 
anything to the effect of a holy war...



Who are these "folks"? How many more developers it would attract
_exactly_, not potentially?


As someone else (Robert?) said, there's a decent chance of it attracting 
some, and it should be rather non-invasive, so why not try?



One again, which "people"? I've seen people complained that there is
not enough code reviewers and testers. I doubt very much its something
C++ will help with.


Will it suddenly draw 20 people? Probably not. But if the community 
actually welcomes the effort Joy put forth and encourages him then we've 
very likely gained at least one more; maybe several.


OTOH, if the community takes the stance of "WTF WHY DO WE NEED THIS?!", 
we've just driven Joy and anyone else that's a C++ fan away.


When it comes specifically to reviewing and testing, you need to provide 
some kind of reason for people to do that grunt work. A big form of that 
is supporting people who want to change something about Postgres. (It's 
certainly possible to get non-hackers to help with this stuff, but 
that's a different discussion entirely.)



And I wrote a blog post (in Russian) [1] in 2016 why nobody should (if
they can) write a new code in C++. In my opinion Rust looks way more
promising. However I personally prefer to wait like 5 years before
using a new and shiny technology. This is also something I blogged
about (in Russian [2], translation [3]).


I agree that Rust is more interesting than C++. I think it'd be great if 
we supported it as well, but I don't know how practical that would 
actually be. Note I said support, not use... it's going to be far more 
challenging to make Rust (or even C++) a requirement to build Postgres. 
Maybe we'll eventually go that route, after demonstrating the 
significant benefits that would need to exist to make that work 
worthwhile. It's going to be FAR easier to demonstrate that if the 
native project at least supports using it, vs needing a complete fork.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Peter Geoghegan
On Tue, Aug 16, 2016 at 9:59 AM, Tom Lane  wrote:
> I think this might have advantages purely from the standpoint of new
> compilers possibly offering useful warnings we don't get now.  But
> if we only go this far, I'm pretty dubious that it really helps people
> to develop extensions in C++.  Almost invariably, if you ask *why* they
> want to do that, you'll get an answer involving C++ libraries that are
> not going to play very nice with our error handling or memory management
> conventions.

FWIW, it's not uncommon to opt-out of C++ exceptions entirely, for
various reasons. For example, the Google C++ style guide forbids it
(if only for historical reasons), as does the GCC style guide (since
GCC was a C program until several years ago [1]). Sometimes, these
third party libraries that mandate the use of exceptions do indeed
create significant headaches for Postgres, compatibility-wise, but
that isn't a given.

IMV, it would be useful to use C++ classes (and even template classes)
for a small number of data structures, while still largely adhering to
earlier practices (this is what GCC did). Specifically, a few modules
such as StringInfo, could be made to follow the RAII/scope bound
resource management usefully, which doesn't seem incompatible with
memory contexts. However, this doesn't seem terribly exciting to me.

[1] https://lwn.net/Articles/542457/
-- 
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Andres Freund
On 2016-08-16 18:52:39 +0300, Heikki Linnakangas wrote:
> On 08/16/2016 05:47 PM, Jim Nasby wrote:
> > I realize there's little technical reason why we *need* C++ support. The
> > level if discipline applied to our codebase negates some of the benefits
> > of C++. But maintaining the discipline takes a lot of time and effort,
> > and makes it more difficult to attract new contributors.
> 
> I suspect that it would take as much
> discipline to keep a C++ codebase
> readable, as the current C codebase. If
> not more.

Well, having typed pg_list.h style lists, ilist.h linked lists,
hash-tables, and proper typechecks for pg_nodes.h instead of the NodeTag
stuff, would surely make life easier.

But given the small subset of C++ available on all our supported
platforms... I think we'd first need to make the decision to cut support
for some platforms, before using C++.  Which imo is a distinct task from
*allowing* to compile with a C++ compiler.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-08-16 Thread Robert Haas
On Wed, Aug 10, 2016 at 7:09 AM, Amit Langote
 wrote:
> 0003-Catalog-and-DDL-for-partition-bounds.patch
>
> Partition DDL includes both a way to create new partition and "attach" an
> existing table as a partition of parent partitioned table.  Attempt to
> drop a partition using DROP TABLE causes an error. Instead a partition
> needs first to be "detached" from parent partitioned table.  On the other
> hand, dropping the parent drops all the partitions if CASCADE is specified.

Hmm, I don't think I like this.  Why should it be necessary to detach
a partition before dropping it?  That seems like an unnecessary step.

 [ DEFERRABLE | NOT DEFERRABLE ] [ INITIALLY DEFERRED | INITIALLY
IMMEDIATE ]
+
 

Unnecessary hunk.

+ 
+  If this table is a partition, one cannot perform DROP
NOT NULL
+  on a column if it is marked not null in the parent table.
+  not null.
+ 

Sentence fragment.

+ 
+  Note that unlike the ATTACH PARTITION command, a partition
+  being detached can be itself partitioned.  In that case, it continues
+  to exist as such.
+ 

This is another restriction I don't understand.  Why can't I attach a
partitioned table?

+indicate that descendant tables are included.  Note that whether
+ONLY or * is specified has no effect in case
+of a partitioned table; descendant tables (in this case, partitions)
+are always included.

Ugh, why?  I think this should work exactly the same way for
partitioned tables that it does for any other inheritance hierarchy.
Sure, you'll get no rows, but who cares?

+CREATE FOREIGN TABLE measurement_y2016m07
+PARTITION OF measurement FOR VALUES START ('2016-07-01') END
('2016-08-01');
+SERVER server_07;

Extra semicolon?

+  A partition cannot have columns other than those inherited from the
+  parent.  That includes the oid column, which can be

I think experience suggests that this is a good restriction, but then
why does the syntax synopsis indicate that PARTITION BY can be
specified along with column definitions?  Similarly for CREATE FOREIGN
TABLE.

+  When specifying for a table being created as partition, one needs to
+  use column names from the parent table as part of the key.

This is not very clear.

-   /* Remove NO INHERIT flag if rel is a partitioned table */
-   if (relid_is_partitioned(relid))
+   /* Discard NO INHERIT, if relation is a partitioned table or a
partition */
+   if (relid_is_partitioned(relid) || relid_is_partition(relid))
is_no_inherit = false;

It might be right to disallow NO INHERIT in this case, but I don't
think it can be right to just silently ignore it.

+ * Not flushed from the cache by RelationClearRelation() unless changed because
+ * of addition or removal of partitions.

This seems unlikely to be safe, unless I'm missing something.

+   form = (Form_pg_inherits) GETSTRUCT(tuple);
+
+   systable_endscan(scan);
+   heap_close(catalogRelation, AccessShareLock);
+
+   return form->inhparent;

This is unsafe.  After systable_endscan, it is no longer OK to access
form->inhparent.

Try building with CLOBBER_CACHE_ALWAYS to find other cache flush hazards.

There should probably be a note in the function header comment that it
is unsafe to use this for an inheritance child that is not a
partition, because there could be more than one parent in that case.
Or maybe the whole idea of this function just isn't very sound...

+static List *
+get_partitions(Oid relid, int lockmode)
+{
+   return find_inheritance_children(relid, lockmode);
+}

What's the point?  If we're going to have a wrapper here at all, then
shouldn't it have a name that matches the existing convention - e.g.
find_partitions() or find_child_partitions()?  But I think you might
as well just use find_inheritance_children() directly.

+* Happens when we have created the pg_inherits entry
but not the
+* pg_partition entry yet.

Why do we ever allow the flow of control to reach this point while we
are in such an intermediate state?

+free_partition_info(PartitionInfoData **p, int num)

Seems very error-prone.  Isn't this why MemoryContextReset was invented?

+relid_is_partition(Oid relid)
+{
+   return SearchSysCacheExists1(PARTRELID, ObjectIdGetDatum(relid));
+}

This is used in a lot of places, and the overhead of checking it in
all of those places is not necessarily nil.  Syscache lookups aren't
free.  What if we didn't create a new catalog for this and instead
just added a relpartitionbound attribute to pg_class?  It seems a bit
silly to have a whole extra catalog to store one extra column...

/*
+* If this foreign table is a partition, check that the FDW supports
+* insert.
+*/
+   if (stmt->base.partbound != NULL)
+   {
+   FdwRoutine *fdw_routine;
+
+   fdw_routine = GetFdwRoutine(fdw->fdwhandler);

Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Jim Nasby

On 8/16/16 12:53 PM, Joy Arulraj wrote:

> The whole thing would make a lot more sense given a credible design
> for error handling that keeps both languages happy.

Well, getting so that we can at least compile in both systems would
certainly increase the chances of somebody being willing to work on
such a design.  And if nobody ever does, then at least people who want
to fork and do research projects based on PostgreSQL will have
slightly less work to do when they want to hack it up.  PostgreSQL
seems to be a very popular starting point for research work, but a
paper I read recently complained about the antiquity of our code base.
I prefer to call that backward-compatibility, but at some point people
stop thinking of you as backward-compatible and instead think of you
as simply backward.

I agree, this was the main reason why we wanted to add support for C++.


Joy, do you have an idea what a *minimally invasive* patch for C++ 
support would look like? That's certainly the first step here.

--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] parallel.c is not marked as test covered

2016-08-16 Thread Peter Eisentraut
On 6/20/16 11:16 PM, Tom Lane wrote:
>> > I think this test would only fail if it runs out of workers, and that 
>> > would only happen in an installcheck run against a server configured in 
>> > a nonstandard way or that is doing something else -- which doesn't 
>> > happen on the buildfarm.
> Um, if you're speaking of select_parallel, that already runs in parallel
> with two other regression tests, and there is no annotation in the
> parallel_schedule file suggesting that adding more scripts to that group
> would be bad.  But yes, perhaps putting this test into its own standalone
> group would be enough of a fix.

Maybe now would be a good time to address this by applying the attached
patch to master and seeing what happens?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 53bbb416a74d18b14a91619246a5043fe34d3d61 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 16 Aug 2016 12:00:00 -0400
Subject: [PATCH] Run select_parallel test by itself

Remove the plpgsql wrapping that hides the context.  So now the test
will fail if the work doesn't actually happen in a parallel worker.  Run
the test in its own test group to ensure it won't run out of resources
for that.
---
 src/test/regress/expected/select_parallel.out | 13 +++--
 src/test/regress/parallel_schedule|  5 -
 src/test/regress/sql/select_parallel.sql  |  9 +
 3 files changed, 8 insertions(+), 19 deletions(-)

diff --git a/src/test/regress/expected/select_parallel.out b/src/test/regress/expected/select_parallel.out
index 2286faf..1efcfc2 100644
--- a/src/test/regress/expected/select_parallel.out
+++ b/src/test/regress/expected/select_parallel.out
@@ -111,14 +111,7 @@ explain (costs off)
  Index Cond: (unique1 = 1)
 (5 rows)
 
-do $$begin
-  -- Provoke error, possibly in worker.  If this error happens to occur in
-  -- the worker, there will be a CONTEXT line which must be hidden.
-  perform stringu1::int2 from tenk1 where unique1 = 1;
-  exception
-	when others then
-		raise 'SQLERRM: %', sqlerrm;
-end$$;
-ERROR:  SQLERRM: invalid input syntax for integer: "BA"
-CONTEXT:  PL/pgSQL function inline_code_block line 7 at RAISE
+select stringu1::int2 from tenk1 where unique1 = 1;
+ERROR:  invalid input syntax for integer: "BA"
+CONTEXT:  parallel worker
 rollback;
diff --git a/src/test/regress/parallel_schedule b/src/test/regress/parallel_schedule
index 3815182..1cb5dfc 100644
--- a/src/test/regress/parallel_schedule
+++ b/src/test/regress/parallel_schedule
@@ -92,7 +92,10 @@ test: brin gin gist spgist privileges init_privs security_label collate matview
 test: alter_generic alter_operator misc psql async dbsize misc_functions
 
 # rules cannot run concurrently with any test that creates a view
-test: rules psql_crosstab select_parallel amutils
+test: rules psql_crosstab amutils
+
+# run by itself so it can run parallel workers
+test: select_parallel
 
 # --
 # Another group of parallel tests
diff --git a/src/test/regress/sql/select_parallel.sql b/src/test/regress/sql/select_parallel.sql
index 38d3166..3474947 100644
--- a/src/test/regress/sql/select_parallel.sql
+++ b/src/test/regress/sql/select_parallel.sql
@@ -44,13 +44,6 @@
 explain (costs off)
   select stringu1::int2 from tenk1 where unique1 = 1;
 
-do $$begin
-  -- Provoke error, possibly in worker.  If this error happens to occur in
-  -- the worker, there will be a CONTEXT line which must be hidden.
-  perform stringu1::int2 from tenk1 where unique1 = 1;
-  exception
-	when others then
-		raise 'SQLERRM: %', sqlerrm;
-end$$;
+select stringu1::int2 from tenk1 where unique1 = 1;
 
 rollback;
-- 
2.9.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Dmitry Igrishin
2016-08-16 18:52 GMT+03:00 Heikki Linnakangas :
> On 08/16/2016 05:47 PM, Jim Nasby wrote:
>>
>> I realize there's little technical reason why we *need* C++ support. The
>> level if discipline applied to our codebase negates some of the benefits
>> of C++. But maintaining the discipline takes a lot of time and effort,
>> and makes it more difficult to attract new contributors.
>
>
> I suspect that it would take as much discipline to keep a C++ codebase
> readable, as the current C codebase. If not more.
For example, its easier and less error prone to define structures with
virtual functions in C++ than write vtables manually in C. So, the adequate
subset of the C++ features can be useful to write more readable and
maintainable C-style code. These features are:

  - abstract classes (well, structures with virtual functions);
  - RTTI;
  - lambda functions;
  - constexpr functions;
  - destructors;
  - templates (very reservedly).

But these features should be avoided (as least for now):

  - exceptions;
  - the parts of the standard library which generates exceptions
(in particular, regex and thread).

-- 
// Dmitry.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Shay Rojansky
Halfway through this mail I suddenly understood something, please read all
the way down before responding...

On Tue, Aug 16, 2016 at 2:16 PM, Vladimir Sitnikov <
sitnikov.vladi...@gmail.com> wrote:

> Shay> your analogy breaks down. Of course L2 was invented to improve
> performance,
> Shay> but that doesn't mean that all caches are the same. More precisely,
> what I
> Shay> find objectionable about your approach is not any caching - it's the
> Shay> implicit or automatic preparation of statements. This practice isn't
> Shay> invisible in that a) it may cause errors that wouldn't have been
> there
> Shay> otherwise (e.g. because of DDL),
>
> Long-lived named server-prepared statements cause problems even if
> server-prepared statements are created manually by developers.
>
> Could you please stop saying "automatic preparation causes ~DDL issues"?
>

I never said that... As I've said many times, the problem is errors caused
by something the user never asked for. If I server-prepare a statement and
then get an error, it's a result of my own action.

Shay> As I said above, I think this is a critical point of misunderstand
> between
> Shay> us. The developers tells the driver which statements should be
> Shay> server-prepared by calling .prepareStatement(). I'm guessing you
> have a
> Shay> totally different understanding here.
>
> Please, quote the document you got that "developers tell the driver which
> statements should be server-prepared by calling ..." from. It never
> works like that.
> Neither in Java, nor in C#. I would admit I've no C# experience, but I did
> find documentation on IDbCommand.Prepare() and examined it.
>

> The proper way to say is "by calling .Prepare() developer passes the
> intention that
> he might be using the same query multiple times".
> That is it. It never means "driver must absolutely use server-prepare
> in the response
> to .Prepare() call".
>
> The same goes for Java's PreparedStatement.
> It never means "the driver must use server-prepared features".
>
> As Microsoft lists in the .Prepare() documentation, modern versions of
> MSSQL just ignore .Prepare() and cache statements automatically.
>
> It is not a developer's business which statements should be in the
> database cache.
> Neither developer should care which statements reside in the driver cache.
>

I'm really baffled here.

First, I never said prepared statements *must* be server-prepared. You're
completely correct that databases APIs don't *require* this, because they
by definition cover many databases and drivers. In Sqlite there's no
server, so there can be no server-prepared statement.

However, where there *is* a server which supports prepared statements as an
optimization, it's completely unthinkable to me that a driver wouldn't
implement prepare as server-prepare. Nobody forces you to do it - it just
seems unthinkable to do otherwise. This reason for this is that if
server-prepared statements are supported by your database, we expect them
to be a significant optimization (otherwise why would they exist), and
therefore not using them when the user calls "prepare" seems like...
foolishness. In other words, whatever client-side "precompilation" or other
optimization is possible is probably going to be negligible when compared
to server-preparation (this seems to be the case with PostgreSQL at the
very least), so why *not* map the database API's prepare method to
server-prepared statements?

I'm going to requote the API note which you quoted above on
Connection.prepareStatement (
https://docs.oracle.com/javase/7/docs/api/java/sql/Connection.html):

> This method is optimized for handling parametric SQL statements that
benefit from precompilation. If the driver supports precompilation, the
method prepareStatement will send the statement to the database for
precompilation. Some drivers may not support precompilation.

Again, my understanding of English may be flawed, or maybe my logic
circuits are malfunctioning. But I read this the following way:
1. preparedStatement is about precompilation.
2. If a driver supports precompilation (i.e. preparation),
"prepareStatement will send the statement to the *database* for
precompilation". Note that the API explicitly mentioned sending *to the
database* - server preparation...
3. A driver may not support precompilation (i.e. preparation). This could
be because it simply hasn't implemented it yet, or because the backend
doesn't support it, or for any other reason. In this case it's a noop,
which doesn't really change anything in this discussion.

A compliant implementation (that is a driver) could just assemble full SQL
> by concatenating the parameters on each execution and send it via 'Q'
> simple
> execute message.
>

I think I may have understood the problem here - there's definitely a Java
vs. C# issue difference this conversation.

>From reading the Java docs, I now realize that JDBC only seems to support
parameters in prepared statements. In other words, the 

Re: [HACKERS] Intermittent "cache lookup failed for type" buildfarm failures

2016-08-16 Thread Alvaro Herrera
Tom Lane wrote:
> Robert Haas  writes:

> > It would sure be nice if those cache lookup failure messages printed
> > the file and line number.  I wonder if we could teach psql to always
> > treat the VERBOSITY as verbose when the error code is XX000.
> 
> I looked around when I saw the earlier ones of these, and had more or less
> convinced myself that the errors were probably coming from one of the
> lsyscache.c convenience subroutines.  If that's true, we'd need a stack
> trace to have much hope of identifying the cause.

Maybe we can have a code path that calls backtrace() somewhere in
errfinish, for platforms that support that.  At least grouse uses gcc,
so I suppose it must also use glibc.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-08-16 Thread Robert Haas
On Wed, Aug 10, 2016 at 7:09 AM, Amit Langote
 wrote:
> 0002-psql-and-pg_dump-support-for-partitioned-tables.patch

+if (pset.sversion >= 90600 && tableinfo.relkind == 'P')

Version check is redundant, right?

+) PARTITION BY RANGE ((a+b));
+\d describe_range_key
+Partitioned table "public.describe_range_key"
+ Column |  Type   | Modifiers
++-+---
+ a  | integer |
+ b  | integer |
+Partition Key: PARTITION BY RANGE (((a + b)))

I understand that it's probably difficult not to end up with two sets
of parentheses here, but can we avoid ending up with three sets?

Also, I wonder if pg_get_partkeydef() should omit "PARTITION BY" and
pg_dump can add that part back.  Then this could say:

Partition Key: RANGE ((a + b))

...which seems a good deal more natural than what you have now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Intermittent "cache lookup failed for type" buildfarm failures

2016-08-16 Thread Tom Lane
Robert Haas  writes:
> On Tue, Aug 16, 2016 at 2:21 PM, Tom Lane  wrote:
>> I grepped through the buildfarm logs and determined that there are exactly
>> zero similar failures going back as far as 2016-04-01.  Now that we've had
>> four in a week, it seems certain that this indicates a bug introduced at
>> most a few days before Aug 9.  A quick trawl through the git logs finds
>> no obvious candidates, though.

> Well, it would have to be something that was back-patched to 9.5,
> right?  That doesn't leave too many candidates.

It's possible that the bug existed longer and was only exposed by a
seemingly unrelated change (eg, a test timing change, since it certainly
looks like it might be timing dependent).  That's little help though :-(

> It would sure be nice if those cache lookup failure messages printed
> the file and line number.  I wonder if we could teach psql to always
> treat the VERBOSITY as verbose when the error code is XX000.

I looked around when I saw the earlier ones of these, and had more or less
convinced myself that the errors were probably coming from one of the
lsyscache.c convenience subroutines.  If that's true, we'd need a stack
trace to have much hope of identifying the cause.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_bsd_indent - improvements around offsetof and sizeof

2016-08-16 Thread Andres Freund
On 2016-08-15 18:09:02 +, Piotr Stefaniak wrote:
> There are more fixes I intend to do, of which the most relevant for 
> Postgres are:
> 1) fixing "function pointer typedef formatting"

This alone would warrant a bottle of something rather expensive.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pluggable storage

2016-08-16 Thread Andres Freund
On 2016-08-15 12:02:18 -0400, Robert Haas wrote:
> Thanks for taking a stab at this.  I'd like to throw out a few concerns.
> 
> One, I'm worried that adding an additional layer of pointer-jumping is
> going to slow things down and make Andres' work to speed up the
> executor more difficult.  I don't know that there is a problem there,
> and if there is a problem I don't know what to do about it, but I
> think it's something we need to consider.

I'm quite concerned about that as well.


> I am somewhat inclined to
> believe that we need to restructure the executor in a bigger way so
> that it passes around datums instead of tuples; I'm inclined to
> believe that the current tuple-centric model is probably not optimal
> even for the existing storage format.

I actually prototyped that, and it's not an easy win so far. Column
extraction cost, even after significant optimization, is still often a
significant portion of the runtime. And e.g. projection only extracting
all columns, after evaluating a restrictive qual referring to an "early"
column, can be a significant win.  We'd definitely have to give up on
extracting columns 0..n when accessing later columns... Hm.

Greetings,

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Intermittent "cache lookup failed for type" buildfarm failures

2016-08-16 Thread Robert Haas
On Tue, Aug 16, 2016 at 2:21 PM, Tom Lane  wrote:
> There is something rotten in the state of Denmark.  Here are four recent
> runs that failed with unexpected "cache lookup failed for type "
> errors:
>
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grouse=2016-08-16%2008%3A39%3A03
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nudibranch=2016-08-13%2009%3A55%3A09
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2016-08-09%2001%3A46%3A17
> http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern=2016-08-09%2000%3A44%3A18
>
> The first two are on HEAD, the second two on 9.5, which seems to rule out
> my first thought that this has something to do with parallel query.  It's
> notable though that all the failing machines are PPC or S/390 ... maybe
> big-endian related?
>
> I grepped through the buildfarm logs and determined that there are exactly
> zero similar failures going back as far as 2016-04-01.  Now that we've had
> four in a week, it seems certain that this indicates a bug introduced at
> most a few days before Aug 9.  A quick trawl through the git logs finds
> no obvious candidates, though.

Well, it would have to be something that was back-patched to 9.5,
right?  That doesn't leave too many candidates.

[rhaas pgsql]$ git log --format=oneline --before='Aug 10' --after='Aug
6' REL9_5_STABLE src/backend/
04cee8f835bcf95ff80b734c335927aaf6551d2d Fix several one-byte buffer
over-reads in to_number
4da812fa8adb22874a937f1b000253fecf526cb0 Translation updates
98b0c6280667ce1efae763340fb2c13c81e4d706 Fix two errors with nested
CASE/WHEN constructs.
cb5c14984ad327e52dfb470fde466a5aca7d50a1 Fix misestimation of
n_distinct for a nearly-unique column with many nulls.
71dca408c0030ad76044c6b17367c9fbeac511ec Don't propagate a null
subtransaction snapshot up to parent transaction.

Obviously, the third and fourth of those seem like the most likely
candidates, but I don't have any theory on how either of them could be
causing this.

It would sure be nice if those cache lookup failure messages printed
the file and line number.  I wonder if we could teach psql to always
treat the VERBOSITY as verbose when the error code is XX000.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] condition variables

2016-08-16 Thread Andres Freund
On 2016-08-11 21:27:45 -0400, Robert Haas wrote:
> On Thu, Aug 11, 2016 at 6:37 PM, Peter Geoghegan  wrote:
> > I notice that you acquire a spinlock within the implementation of
> > condition variables. Is it worth any effort to consolidate the number
> > of spinlock acquisitions? In other words, maybe the most common idioms
> > should be baked into the ConditionVariable interface, which could save
> > callers from having to use their own mutex variable.
> 
> One thing to keep in mind is that spinlocks are extremely fast as long
> as you don't have too many processes contending for them.

That's one of the conditions. The other is that the system as a whole is
not overcommitted. Because then the chance of processes being put to
sleep inside a spinlock increases.

> With
> parallel groups (or I/O-in-progress wait queues) of single digit
> number of processes, I doubt that consolidating the spinlock
> acquisitions will produce any measurable benefit.  If we get to the
> point of having parallel groups containing scores of processes, that
> could change.

And we have no measures to manage systemwide load with paralellism yet,
I think the issue is a bit more general than the quoted paragraph.


But I also think we shouldn't yet worry about it. It seems likely that
the actual critical bottleneck is elsewhere for now.

Andres Freund


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set log_line_prefix and application name in test drivers

2016-08-16 Thread Peter Eisentraut
On 8/16/16 2:23 PM, Peter Eisentraut wrote:
> On 8/10/16 9:36 PM, Peter Eisentraut wrote:
>> %m vs %t is obviously a minor issue that I will gladly adjust, but
>> besides that I prefer to stick with my version.
> 
> Updated patch with %m instead of %t.  Will submit to CF.

attached

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 1fed06603c0c0cacfb78ccc2985d77bee527fad7 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Tue, 16 Aug 2016 12:00:00 -0400
Subject: [PATCH v2] Set log_line_prefix and application name in test drivers

Before pg_regress runs psql, set the application name to the test name.
Similarly, set the application name to the test file name in the TAP
tests.  Also, set a default log_line_prefix that show the application
name, as well as the PID and a time stamp.

That way, the server log output can be correlated to the test input
files, making debugging a bit easier.
---
 src/test/perl/PostgresNode.pm  | 1 +
 src/test/perl/TestLib.pm   | 2 ++
 src/test/regress/pg_regress.c  | 1 +
 src/test/regress/pg_regress_main.c | 7 +++
 4 files changed, 11 insertions(+)

diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index fede1e6..b3a5457 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -402,6 +402,7 @@ sub init
 	open my $conf, ">>$pgdata/postgresql.conf";
 	print $conf "\n# Added by PostgresNode.pm\n";
 	print $conf "fsync = off\n";
+	print $conf "log_line_prefix = '%m [%p]: [%l] %qapp=%a '\n";
 	print $conf "log_statement = all\n";
 	print $conf "port = $port\n";
 
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index 649fd82..27fcc78 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -60,6 +60,8 @@ BEGIN
 	delete $ENV{PGPORT};
 	delete $ENV{PGHOST};
 
+	$ENV{PGAPPNAME} = $0;
+
 	# Must be set early
 	$windows_os = $Config{osname} eq 'MSWin32' || $Config{osname} eq 'msys';
 }
diff --git a/src/test/regress/pg_regress.c b/src/test/regress/pg_regress.c
index 574f5b8..1d6e1d8 100644
--- a/src/test/regress/pg_regress.c
+++ b/src/test/regress/pg_regress.c
@@ -2247,6 +2247,7 @@ regression_main(int argc, char *argv[], init_function ifunc, test_function tfunc
 		fputs("\n# Configuration added by pg_regress\n\n", pg_conf);
 		fputs("log_autovacuum_min_duration = 0\n", pg_conf);
 		fputs("log_checkpoints = on\n", pg_conf);
+		fputs("log_line_prefix = '%m [%p]: [%l] %qapp=%a '\n", pg_conf);
 		fputs("log_lock_waits = on\n", pg_conf);
 		fputs("log_temp_files = 128kB\n", pg_conf);
 		fputs("max_prepared_transactions = 2\n", pg_conf);
diff --git a/src/test/regress/pg_regress_main.c b/src/test/regress/pg_regress_main.c
index d9591c0..2733635 100644
--- a/src/test/regress/pg_regress_main.c
+++ b/src/test/regress/pg_regress_main.c
@@ -34,6 +34,7 @@ psql_start_test(const char *testname,
 	char		expectfile[MAXPGPATH];
 	char		psql_cmd[MAXPGPATH * 3];
 	size_t		offset = 0;
+	char	   *appnameenv;
 
 	/*
 	 * Look for files in the output dir first, consistent with a vpath search.
@@ -63,6 +64,9 @@ psql_start_test(const char *testname,
 		offset += snprintf(psql_cmd + offset, sizeof(psql_cmd) - offset,
 		   "%s ", launcher);
 
+	appnameenv = psprintf("PGAPPNAME=pg_regress/%s", testname);
+	putenv(appnameenv);
+
 	snprintf(psql_cmd + offset, sizeof(psql_cmd) - offset,
 			 "\"%s%spsql\" -X -a -q -d \"%s\" < \"%s\" > \"%s\" 2>&1",
 			 bindir ? bindir : "",
@@ -80,6 +84,9 @@ psql_start_test(const char *testname,
 		exit(2);
 	}
 
+	unsetenv("PGAPPNAME");
+	free(appnameenv);
+
 	return pid;
 }
 
-- 
2.9.3


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Set log_line_prefix and application name in test drivers

2016-08-16 Thread Peter Eisentraut
On 8/10/16 9:36 PM, Peter Eisentraut wrote:
> %m vs %t is obviously a minor issue that I will gladly adjust, but
> besides that I prefer to stick with my version.

Updated patch with %m instead of %t.  Will submit to CF.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Intermittent "cache lookup failed for type" buildfarm failures

2016-08-16 Thread Tom Lane
There is something rotten in the state of Denmark.  Here are four recent
runs that failed with unexpected "cache lookup failed for type "
errors:

http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=grouse=2016-08-16%2008%3A39%3A03
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=nudibranch=2016-08-13%2009%3A55%3A09
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=sungazer=2016-08-09%2001%3A46%3A17
http://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=tern=2016-08-09%2000%3A44%3A18

The first two are on HEAD, the second two on 9.5, which seems to rule out
my first thought that this has something to do with parallel query.  It's
notable though that all the failing machines are PPC or S/390 ... maybe
big-endian related?

I grepped through the buildfarm logs and determined that there are exactly
zero similar failures going back as far as 2016-04-01.  Now that we've had
four in a week, it seems certain that this indicates a bug introduced at
most a few days before Aug 9.  A quick trawl through the git logs finds
no obvious candidates, though.

Any ideas?

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Joy Arulraj
On Tue, Aug 16, 2016 at 1:13 PM, Robert Haas  wrote:

> On Tue, Aug 16, 2016 at 12:59 PM, Tom Lane  wrote:
> > Robert Haas  writes:
> >> I'm not really interested in supporting PostgreSQL code written in
> >> other languages entirely, such as Rust, but I do think it would make
> >> sense to write our code so that it can be compiled using either a C
> >> compiler or a C++ compiler.  Even if we don't ever use any C++ code in
> >> core, this would let people who create forks or extensions use it if
> >> they wished.  It wouldn't be that much work to maintain, either: we'd
> >> just set up some buildfarm members that compiled using C++ and when
> >> they turned red, we'd go fix it.
> >
> > I think this might have advantages purely from the standpoint of new
> > compilers possibly offering useful warnings we don't get now.
>
> Yeah, that could be nice.
>
> > But
> > if we only go this far, I'm pretty dubious that it really helps people
> > to develop extensions in C++.  Almost invariably, if you ask *why* they
> > want to do that, you'll get an answer involving C++ libraries that are
> > not going to play very nice with our error handling or memory management
> > conventions.  I do not see how we could C++-ify the error handling
> without
> > making a complete break with C compilers ... which is a step I don't
> > really want to take.
>
> I agree, but we don't have to agree to change everything before we
> agree to change anything.  If we got an agreement to try to make
> everything compile in both languages, that'd be more agreement than
> we've ever had before, and I'd rather take that agreement and run than
> look a gift horse in the mouth.
>
> > The whole thing would make a lot more sense given a credible design
> > for error handling that keeps both languages happy.
>
> Well, getting so that we can at least compile in both systems would
> certainly increase the chances of somebody being willing to work on
> such a design.  And if nobody ever does, then at least people who want
> to fork and do research projects based on PostgreSQL will have
> slightly less work to do when they want to hack it up.  PostgreSQL
> seems to be a very popular starting point for research work, but a
> paper I read recently complained about the antiquity of our code base.
> I prefer to call that backward-compatibility, but at some point people
> stop thinking of you as backward-compatible and instead think of you
> as simply backward.
>
>
I agree, this was the main reason why we wanted to add support for C++.


> > A lot of the other things people have muttered about, such as heavier
> > use of inline functions instead of macros, don't particularly need C++
> > at all.
>
> True.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


Re: [HACKERS] [parallel query] random server crash while running tpc-h query on power2

2016-08-16 Thread Robert Haas
On Tue, Aug 16, 2016 at 1:05 AM, Rushabh Lathia
 wrote:
> I agree, this make sense.
>
> Here is the patch to allocate worker instrumentation into same context
> as the regular instrumentation which is per-query context.

Looks good, committed.  I am not sure it was a very good idea for
af33039317ddc4a0e38a02e2255c2bf453115fd2 by Tom Lane to change the
current memory context for the entire execution of gather_readnext();
this might not be the only or the last bug that results from that
decision.  However, I don't really want to get an argument about that
right now, and this at least fixes the problem we know about.  Thanks
for the report and patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Robert Haas
On Tue, Aug 16, 2016 at 12:59 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> I'm not really interested in supporting PostgreSQL code written in
>> other languages entirely, such as Rust, but I do think it would make
>> sense to write our code so that it can be compiled using either a C
>> compiler or a C++ compiler.  Even if we don't ever use any C++ code in
>> core, this would let people who create forks or extensions use it if
>> they wished.  It wouldn't be that much work to maintain, either: we'd
>> just set up some buildfarm members that compiled using C++ and when
>> they turned red, we'd go fix it.
>
> I think this might have advantages purely from the standpoint of new
> compilers possibly offering useful warnings we don't get now.

Yeah, that could be nice.

> But
> if we only go this far, I'm pretty dubious that it really helps people
> to develop extensions in C++.  Almost invariably, if you ask *why* they
> want to do that, you'll get an answer involving C++ libraries that are
> not going to play very nice with our error handling or memory management
> conventions.  I do not see how we could C++-ify the error handling without
> making a complete break with C compilers ... which is a step I don't
> really want to take.

I agree, but we don't have to agree to change everything before we
agree to change anything.  If we got an agreement to try to make
everything compile in both languages, that'd be more agreement than
we've ever had before, and I'd rather take that agreement and run than
look a gift horse in the mouth.

> The whole thing would make a lot more sense given a credible design
> for error handling that keeps both languages happy.

Well, getting so that we can at least compile in both systems would
certainly increase the chances of somebody being willing to work on
such a design.  And if nobody ever does, then at least people who want
to fork and do research projects based on PostgreSQL will have
slightly less work to do when they want to hack it up.  PostgreSQL
seems to be a very popular starting point for research work, but a
paper I read recently complained about the antiquity of our code base.
I prefer to call that backward-compatibility, but at some point people
stop thinking of you as backward-compatible and instead think of you
as simply backward.

> A lot of the other things people have muttered about, such as heavier
> use of inline functions instead of macros, don't particularly need C++
> at all.

True.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> A brief look through the code and some reading between the lines of the
> documentation shows that it only cleans up shared memory segments that
> are no longer attached to, but there is no such check for semaphores.

Oh, interesting.  It had occurred to me that we might be able to dodge
this issue if we started to recommend using unnamed POSIX semaphores
instead of SysV.  (Obviously we'd want to check performance, but it's
at least a plausible alternative.)  I had not wanted to go there if
it meant that we could have silent loss of SysV shmem with no other
symptoms, because as I said upthread, I'm concerned about that breaking
the multiple-postmaster interlock.  However, if the cleanup kills only
semaphores and not attached-to shmem, then that objection goes away and
this becomes something we should seriously consider.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assertion failure in REL9_5_STABLE

2016-08-16 Thread Alvaro Herrera
Andres Freund wrote:

> Phew. Alvaro, are you tackling this?

Sure.


Marko Tiikkaja wrote:

> There's a rolled back subtransaction that also did some magic on the rows
> AFAICT.  I can try and spend some time producing a smaller test case over
> the weekend.  No hurry since this missed the today's point release anyway.

Marko, any luck?

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Tom Lane
Robert Haas  writes:
> I'm not really interested in supporting PostgreSQL code written in
> other languages entirely, such as Rust, but I do think it would make
> sense to write our code so that it can be compiled using either a C
> compiler or a C++ compiler.  Even if we don't ever use any C++ code in
> core, this would let people who create forks or extensions use it if
> they wished.  It wouldn't be that much work to maintain, either: we'd
> just set up some buildfarm members that compiled using C++ and when
> they turned red, we'd go fix it.

I think this might have advantages purely from the standpoint of new
compilers possibly offering useful warnings we don't get now.  But
if we only go this far, I'm pretty dubious that it really helps people
to develop extensions in C++.  Almost invariably, if you ask *why* they
want to do that, you'll get an answer involving C++ libraries that are
not going to play very nice with our error handling or memory management
conventions.  I do not see how we could C++-ify the error handling without
making a complete break with C compilers ... which is a step I don't
really want to take.

The whole thing would make a lot more sense given a credible design
for error handling that keeps both languages happy.

A lot of the other things people have muttered about, such as heavier
use of inline functions instead of macros, don't particularly need C++
at all.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [Patch] Temporary tables that do not bloat pg_catalog (a.k.a fast temp tables)

2016-08-16 Thread Robert Haas
On Mon, Aug 15, 2016 at 5:12 AM, Aleksander Alekseev
 wrote:
> Just to keep things sane I would like to remind that in this concrete
> patch there _are_ catalog entries:
>
> ```
> [...]
> This file contents imlementation of special type of temporary tables ---
> fast temporary tables (FTT). From user perspective they work exactly as
> regular temporary tables. However there are no records about FTTs in
> pg_catalog. These records are stored in backend's memory instead and
> mixed with regular records during scans of catalog tables. We refer to
> corresponding tuples of catalog tables as "in-memory" or "virtual"
> tuples and to all these tuples together --- as "in-memory" or "virtual"
> catalog.
> [...]
> ```

That doesn't really solve the problem, because OTHER backends won't be
able to see them.  So, if I create a fast temporary table in one
session that depends on a permanent object, some other session can
drop the permanent object.  If there were REAL catalog entries, that
wouldn't work, because the other session would see the dependency.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Joshua D. Drake

On 08/16/2016 09:33 AM, Robert Haas wrote:

On Tue, Aug 16, 2016 at 10:47 AM, Jim Nasby  wrote:

On 8/16/16 2:52 AM, Gavin Flower wrote:



I agree with your statement that one of our biggest problems is
getting more developers interested in working on PostgreSQL.  Even if
there's only a 10% chance that something like this will help, why not?
 We're not talking about moving the earth.


Right. It is just reality that less people are learning C which means 
less people will be interested in joining a project that is focused or 
(required) to be C.


Sincerely,

JD



--
Command Prompt, Inc.  http://the.postgres.company/
+1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.
Unless otherwise stated, opinions are my own.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Server crash due to SIGBUS(Bus Error) when trying to access the memory created using dsm_create().

2016-08-16 Thread Robert Haas
On Fri, Aug 12, 2016 at 9:22 PM, Thomas Munro
 wrote:
> On Sat, Aug 13, 2016 at 8:26 AM, Thomas Munro
>  wrote:
>> On Sat, Aug 13, 2016 at 2:08 AM, Tom Lane  wrote:
>>> amul sul  writes:
 When I am calling dsm_create on Linux using the POSIX DSM implementation 
 can succeed, but result in SIGBUS when later try to access the memory.  
 This happens because of my system does not have enough shm space &  
 current allocation in dsm_impl_posix does not allocate disk blocks[1]. I 
 wonder can we use fallocate system call (i.e. Zero-fill the file) to 
 ensure that all the file space has really been allocated, so that we don't 
 later seg fault when accessing the memory mapping. But here we will endup 
 by loop calling ‘write’ squillions of times.
>>>
>>> Wouldn't that just result in a segfault during dsm_create?
>>>
>>> I think probably what you are describing here is kernel misbehavior
>>> akin to memory overcommit.  Maybe it *is* memory overcommit and can
>>> be turned off the same way.  If not, you have material for a kernel
>>> bug fix/enhancement request.
>>
>> [...] But it
>> looks like if we used fallocate or posix_fallocate in the
>> dsm_impl_posix case we'd get a nice ESPC error, instead of
>> success-but-later-SIGBUS-on-access.
>
> Here's a simple test extension that creates jumbo dsm segments, and
> then accesses all pages.  If you ask it to write cheques that your
> Linux 3.10 machine can't cash on unpatched master, it does this:
>
> postgres=# create extension foo;
> CREATE EXTENSION
> postgres=# select test_dsm(16::bigint * 1024 * 1024 * 1024);
> server closed the connection unexpectedly
> ...
> LOG:  server process (PID 15105) was terminated by signal 7: Bus error
>
> If I apply the attached experimental patch I get:
>
> postgres=# select test_dsm(16::bigint * 1024 * 1024 * 1024);
> ERROR:  could not resize shared memory segment
> "/PostgreSQL.1938734921" to 17179869184 bytes: No space left on device
>
> It should probably be refactored a bit to separate the error messages
> for ftruncate and posix_fallocate, and we could possibly use the same
> approach for dsm_impl_mmap instead of that write() loop, but this at
> least demonstrates the problem Amul reported.  Thoughts?

Seems like it could be a reasonable change.  I wonder what happens on
other platforms.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Peter Eisentraut
On 8/16/16 11:24 AM, Tom Lane wrote:
> Not sure I believe that --- the cases that have been reported to us
> involved postgres processes that were still alive but had had their
> SysV semaphore sets deleted out from under them.  Likely the SysV
> shmem segments too, but that wouldn't cause any observable effects
> for the running cluster.  (It *would* risk breaking the interlock
> against starting a new postmaster, I fear.)
> 
> It might be that both behaviors exist now but more people know about
> how to turn off the killing-processes one.

They are two separate things.

Both are controlled by settings in logind.conf.

RemoveIPC=

controls whether System V IPC objects are removed when a user logs out.
System users are exempt.

This was turned on by default in systemd version 212 (2014-03-25).

RHEL7 ships 219.  Debian stable ships 215.

Apparently, the systemd package in RHEL7 is built with it defaulting to
off.  The package in Debian defaults to on, but I can't actually
reproduce the issue.

A brief look through the code and some reading between the lines of the
documentation shows that it only cleans up shared memory segments that
are no longer attached to, but there is no such check for semaphores.

So there are some issues here to be worked out.


KillUserProcesses=

controls whether all processes of a user should be killed when the user
logs out.  This was turned on by default in systemd version 230
(2016-05-21).  This is not yet shipped widely (Fedora Branched/25,
Debian testing, stable-backports).

There are various ways to adjust that, including the KillOnlyUsers=,
KillExcludeUsers=, loginctl enable-linger, systemd-run.  These are all
explained on the logind.conf man page.  (Being a "system user" has no
influence here.)

This will clearly result in some wide-spread annoyance among users and
some wide-spread rejoicing among system administrators, but other than
that I don't see a potential harm specific to PostgreSQL here.


-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Robert Haas
On Tue, Aug 16, 2016 at 10:47 AM, Jim Nasby  wrote:
> On 8/16/16 2:52 AM, Gavin Flower wrote:
>> In both cases, part of the motivation to change from C was to appeal to
>> new developers - from what I remember of the discussions.
>
> Moving this to -hackers. Original thread at [1].
>
> tl;dr: A C++ port of Postgres has been created, and several folks on general
> have commented that this makes it easier to work with the Postgres codebase.
> This potentially attracts more developers to the project. I hope we can find
> a way to pull these folks into the fold.
>
> People in core have complained that we don't have enough hackers coming in
> (which I agree with). Part of that is lack of familiarity with C.
>
> I think we can all agree that a C++ fork of Postgres would be a huge waste
> of time, so the question becomes should core postgres start supporting C++.
>
> Peter wrote a blog about this in 2013 that makes some good arguments [2]; in
> particular "easing into" this by first officially supporting C++
> compilation. I also like the idea of investigating Rust.

I'm not really interested in supporting PostgreSQL code written in
other languages entirely, such as Rust, but I do think it would make
sense to write our code so that it can be compiled using either a C
compiler or a C++ compiler.  Even if we don't ever use any C++ code in
core, this would let people who create forks or extensions use it if
they wished.  It wouldn't be that much work to maintain, either: we'd
just set up some buildfarm members that compiled using C++ and when
they turned red, we'd go fix it.

I agree with your statement that one of our biggest problems is
getting more developers interested in working on PostgreSQL.  Even if
there's only a 10% chance that something like this will help, why not?
 We're not talking about moving the earth.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Assertion failure in REL9_5_STABLE

2016-08-16 Thread Andres Freund
On 2016-08-11 11:01:18 +0200, Marko Tiikkaja wrote:
> On 11/08/16 8:48 AM, Michael Paquier wrote:
> > On Thu, Aug 11, 2016 at 8:09 AM, Marko Tiikkaja  wrote:
> > > On 2016-08-11 12:09 AM, Alvaro Herrera wrote:
> > > > 
> > > > BTW this is not a regression, right?  It's been broken all along.  Or am
> > > > I mistaken?
> > > 
> > > You're probably right.  I just hadn't realized I could run our app against
> > > 9.5 with --enable-cassert until last week.
> > 
> > Just wondering... If you revert 1f9534b4 and/or b33e81cb do you still
> > see a problem?
> 
> Yeah, no effect.

Phew. Alvaro, are you tackling this?


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Yury Zhuravlev

Aleksander Alekseev wrote:

You are right, there is none.
First: trees in parser, planer and etc. 
Second: normal exception.


Two big projects lately move to C++ from C:
GCC, Mesa

You can read their reasons.
Only C++ we can use without full rewrite currently. (or ObjectC maybe)
If we wish fix C limitations. 


--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Heikki Linnakangas

On 08/16/2016 05:47 PM, Jim Nasby wrote:

I realize there's little technical reason why we *need* C++ support. The
level if discipline applied to our codebase negates some of the benefits
of C++. But maintaining the discipline takes a lot of time and effort,
and makes it more difficult to attract new contributors.


I suspect that it would take as much discipline to keep a C++ codebase 
readable, as the current C codebase. If not more.


- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Magnus Hagander
On Tue, Aug 16, 2016 at 5:24 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Aug 16, 2016 5:11 PM, "Tom Lane"  wrote:
> >> Dunno, it was still working the last time I used Fedora for anything
> much.
> >> Admittedly, that was about three years ago.  But the issue would still
> >> arise if you prefer "pg_ctl start".
>
> > There are two independent changes AFAIK. One is that whenever a user that
> > logged in interactively logs out all their processes are killed,
> regardless
> > of nohup. The other one is the one about shared memory mentioned here.
> They
> > will both independently kill postgres sessions launched manually. Or with
> > pg_ctl.
>
> Not sure I believe that --- the cases that have been reported to us
> involved postgres processes that were still alive but had had their
> SysV semaphore sets deleted out from under them.  Likely the SysV
> shmem segments too, but that wouldn't cause any observable effects
> for the running cluster.  (It *would* risk breaking the interlock
> against starting a new postmaster, I fear.)
>
> It might be that both behaviors exist now but more people know about
> how to turn off the killing-processes one.
>
>
Yes, I think it's the second. See for example
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=825394. You can configure
KillUserProcesses=no in logind.conf to get rid of it (that bug discusses
the debian default behaviour).



-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Aleksander Alekseev
Well, well, well. Another C vs C++ holly war, really?

> > In both cases, part of the motivation to change from C was to
> > appeal to new developers - from what I remember of the
> > discussions.  
> 
> Moving this to -hackers. Original thread at [1].
> 
> tl;dr: A C++ port of Postgres has been created, and several folks on 
> general have commented that this makes it easier to work with the 
> Postgres codebase. This potentially attracts more developers to the 
> project. I hope we can find a way to pull these folks into the fold.

Who are these "folks"? How many more developers it would attract
_exactly_, not potentially?

> People in core have complained that we don't have enough hackers
> coming in (which I agree with). Part of that is lack of familiarity
> with C.

One again, which "people"? I've seen people complained that there is
not enough code reviewers and testers. I doubt very much its something
C++ will help with.

> I think we can all agree that a C++ fork of Postgres would be a huge 
> waste of time, so the question becomes should core postgres start 
> supporting C++.
> 
> Peter wrote a blog about this in 2013 that makes some good arguments 
> [2]; in particular "easing into" this by first officially supporting
> C++ compilation. I also like the idea of investigating Rust.

And I wrote a blog post (in Russian) [1] in 2016 why nobody should (if
they can) write a new code in C++. In my opinion Rust looks way more
promising. However I personally prefer to wait like 5 years before
using a new and shiny technology. This is also something I blogged
about (in Russian [2], translation [3]).

> I realize there's little technical reason why we *need* C++ support.

You are right, there is none.

> The level if discipline applied to our codebase negates some of the
> benefits of C++. But maintaining the discipline takes a lot of time
> and effort, and makes it more difficult to attract new contributors.

There are companies. They hire developers who write code. Doesn't
really matters in which language.

Long story short - I strongly believe you are trying to solve a problem
that doesn't exist. And probably create a few new ones.

[1] http://eax.me/c-vs-cpp/
[2] http://eax.me/cpp-will-never-die/
[3] http://www.viva64.com/en/b/0324/

-- 
Best regards,
Aleksander Alekseev


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Tom Lane
Magnus Hagander  writes:
> On Aug 16, 2016 5:11 PM, "Tom Lane"  wrote:
>> Dunno, it was still working the last time I used Fedora for anything much.
>> Admittedly, that was about three years ago.  But the issue would still
>> arise if you prefer "pg_ctl start".

> There are two independent changes AFAIK. One is that whenever a user that
> logged in interactively logs out all their processes are killed, regardless
> of nohup. The other one is the one about shared memory mentioned here. They
> will both independently kill postgres sessions launched manually. Or with
> pg_ctl.

Not sure I believe that --- the cases that have been reported to us
involved postgres processes that were still alive but had had their
SysV semaphore sets deleted out from under them.  Likely the SysV
shmem segments too, but that wouldn't cause any observable effects
for the running cluster.  (It *would* risk breaking the interlock
against starting a new postmaster, I fear.)

It might be that both behaviors exist now but more people know about
how to turn off the killing-processes one.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: GiST optimizing memmoves in gistplacetopage for fixed-size updates [PoC]

2016-08-16 Thread Anastasia Lubennikova

09.08.2016 19:45, Andrew Borodin:

Here is new version of the patch, now it includes recommendations from
Anastasia Lubennikova.


I've investigated anamalous index size decrease. Most probable version
appeared to be true.
Cube extension, as some others, use Guttman's polynomial time split
algorithm. It is known to generate "needle-like" MBBs (MBRs) for
sorted data due to imbalanced splits (splitting 100 tuples as 98 to
2). Due to previous throw-to-the-end behavior of GiST this imbalance
was further amplificated (most of inserts were going to bigger part
after split). So GiST inserts were extremely slow for sorted data.
There is no need to do exact sorting to trigger this behavior.
This behavior can be fused by implementation of small-m restriction in
split (AFAIR this is described in original R-tree paper from 84),
which prescribes to do a split in a parts no smaller than m, where m
is around 20% of a page capacity (in tuples number). After application
of this patch performance for sorted data is roughly the same as
performance for randomized data.


Thank you for explanation. Now it's clear to me. I did some more testing and
found nothing special. The declared feature is implemented correctly.
It passes all regression tests and improves performance.

I still have a few minor nitpicks about the patch style.
You can find a style guide on 
https://www.postgresql.org/docs/9.6/static/source.html


1) remove extra whitespace in README

2) add whitespace: if(ntup == 1)

3) fix comments in accordance with coding conventions

/* In case of single tuple update GiST calls overwrite
 * In all other cases function gistplacetopage deletes
 * old tuples and place updated at the end
 *  */


+/* Normally here was following assertion
+ * Assert(ItemIdHasStorage(ii));
+ * This assertion was introduced in PageIndexTupleDelete
+ * But if this function will be used from BRIN index
+ * this assertion will fail. Thus, here we do not check that
+ * item has the storage.
+ * */

4) remove unrelated changes
-data += sizeof(OffsetNumber) * xldata->ntodelete;
+data += sizeof(OffsetNumber) *xldata->ntodelete;

5) If the comment is correct, maxalignment is not necessary.
+/* tuples on a page are always maxaligned */
+oldsize = MAXALIGN(oldsize);

6) I'd rather use alignednewsize here.
 +ItemIdSetNormal(tupid, offset + size_diff, newsize);


After the cleanup you can change status to "Ready for Committer"
without waiting for the response.


If data is randomized this effect of Guttman poly-time split is not in
effect; test from the start of the thread will show no statistically
confident improvement by the patch.
To prove performance increase in randomized case I've composed
modified test https://gist.github.com/x4m/856b2e1a5db745f8265c76b9c195f2e1
This test with five runs shows following:
Before patch
Insert Time AVG 78 seconds STD 9.5
Afer patch
Insert Time AVG 68 seconds STD 7.7
This is marginal but statistically viable performance improvement.

For sorted data performance is improved by a factor of 3.
Best regards, Andrey Borodin, Octonica & Ural Federal University.



--
Anastasia Lubennikova
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Yury Zhuravlev

Jim Nasby wrote:
My hope is that existing hackers can agree on a reasonable way 
forward and guide/assist new folks that are interested in 
walking that path.


I tried this path. https://github.com/stalkerg/postgres_cpp 
And I fully support this desire. But I'm in the minority.



I also like the idea of investigating Rust.


I am working on it last few weeks. But it's like seek blocks for new DB. I 
don't know how we can insert Rust code into Postgres spaghetti.


--
Yury Zhuravlev
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Magnus Hagander
On Aug 16, 2016 5:11 PM, "Tom Lane"  wrote:
>
> Magnus Hagander  writes:
> > On Aug 16, 2016 4:43 PM, "Tom Lane"  wrote:
> >> Rather, the problem arises when J. Ordinary User does
> >> nohup postmaster &
> >> and then logs out.
>
> > I think this is a partially different issue though. They already broke
the
> > nohup approach earlier with a different change, didn't they?
>
> Dunno, it was still working the last time I used Fedora for anything much.
> Admittedly, that was about three years ago.  But the issue would still
> arise if you prefer "pg_ctl start".
>

There are two independent changes AFAIK. One is that whenever a user that
logged in interactively logs out all their processes are killed, regardless
of nohup. The other one is the one about shared memory mentioned here. They
will both independently kill postgres sessions launched manually. Or with
pg_ctl.

Both are fairly recent changes, certainly less than three years.

/Magnus


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Tom Lane
Magnus Hagander  writes:
> On Aug 16, 2016 4:43 PM, "Tom Lane"  wrote:
>> Rather, the problem arises when J. Ordinary User does
>> nohup postmaster &
>> and then logs out.

> I think this is a partially different issue though. They already broke the
> nohup approach earlier with a different change, didn't they?

Dunno, it was still working the last time I used Fedora for anything much.
Admittedly, that was about three years ago.  But the issue would still
arise if you prefer "pg_ctl start".

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [GENERAL] C++ port of Postgres

2016-08-16 Thread Jim Nasby

On 8/16/16 2:52 AM, Gavin Flower wrote:

In both cases, part of the motivation to change from C was to appeal to
new developers - from what I remember of the discussions.


Moving this to -hackers. Original thread at [1].

tl;dr: A C++ port of Postgres has been created, and several folks on 
general have commented that this makes it easier to work with the 
Postgres codebase. This potentially attracts more developers to the 
project. I hope we can find a way to pull these folks into the fold.


People in core have complained that we don't have enough hackers coming 
in (which I agree with). Part of that is lack of familiarity with C.


I think we can all agree that a C++ fork of Postgres would be a huge 
waste of time, so the question becomes should core postgres start 
supporting C++.


Peter wrote a blog about this in 2013 that makes some good arguments 
[2]; in particular "easing into" this by first officially supporting C++ 
compilation. I also like the idea of investigating Rust.


I realize there's little technical reason why we *need* C++ support. The 
level if discipline applied to our codebase negates some of the benefits 
of C++. But maintaining the discipline takes a lot of time and effort, 
and makes it more difficult to attract new contributors. My hope is that 
existing hackers can agree on a reasonable way forward and guide/assist 
new folks that are interested in walking that path.


1: 
https://www.postgresql.org/message-id/CABgyVxDBd3EvRdo-Rd6eo8QPEqV8=shau2sjfo16wfe0r-h...@mail.gmail.com

2: https://petereisentraut.blogspot.com/2013/05/moving-to-c.html
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
855-TREBLE2 (855-873-2532)   mobile: 512-569-9461


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Magnus Hagander
On Aug 16, 2016 4:43 PM, "Tom Lane"  wrote:
>
> Peter Eisentraut  writes:
> > On 8/16/16 8:53 AM, Greg Stark wrote:
> >> That's a system level change though. How would a normal user manage
this?
>
> > Arguably, if you are a normal user, you probably shouldn't be using
> > systemd to start system services under your own account.
>
> I'm not totally sure, but I think that the complaints were not about
> systemd-driven services.  (In such a case, it's almost certainly possible
> to fix it by adjusting your systemd unit definition file, anyway.)
> Rather, the problem arises when J. Ordinary User does
>
> nohup postmaster &
>
> and then logs out.  That's certainly not much of a recipe for production
> services but people have been known to do it for testing --- in fact,
> that's pretty much what I do every day with test postmasters.  I suppose
> whenever I migrate to a recent-systemd-based distro I'm going to have to
> turn off this miserable excuse for a feature.  I sure hope there's a way
> to do so.

I think this is a partially different issue though. They already broke the
nohup approach earlier with a different change, didn't they?

/Magnus


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Tom Lane
Peter Eisentraut  writes:
> On 8/16/16 8:53 AM, Greg Stark wrote:
>> That's a system level change though. How would a normal user manage this?

> Arguably, if you are a normal user, you probably shouldn't be using
> systemd to start system services under your own account.

I'm not totally sure, but I think that the complaints were not about
systemd-driven services.  (In such a case, it's almost certainly possible
to fix it by adjusting your systemd unit definition file, anyway.)
Rather, the problem arises when J. Ordinary User does

nohup postmaster &

and then logs out.  That's certainly not much of a recipe for production
services but people have been known to do it for testing --- in fact,
that's pretty much what I do every day with test postmasters.  I suppose
whenever I migrate to a recent-systemd-based distro I'm going to have to
turn off this miserable excuse for a feature.  I sure hope there's a way
to do so.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PSA: Systemd will kill PostgreSQL

2016-08-16 Thread Peter Eisentraut
On 8/16/16 8:53 AM, Greg Stark wrote:
> That's a system level change though. How would a normal user manage this?

Arguably, if you are a normal user, you probably shouldn't be using
systemd to start system services under your own account.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Let's get rid of the separate minor version numbers for shlibs

2016-08-16 Thread Tom Lane
Greg Stark  writes:
> It does rule out the possibility of having a minor bump in the soname
> for a point-release, which as you point out wouldn't do much on Linux
> but on other operating systems might be a useful thing.

I believe we could legally set SO_MINOR_VERSION to, say, 10.1 if we had to
(cf comment about it in Makefile.shlib), so a workaround is available for
that case.  And the current scheme isn't any better: if, say, 9.5 is at
libpq.so.5.8, and we wish to bump the soname for its next point release,
we can't use soname 5.9 because that's already taken by 9.6.  We'd have
to go to soname 5.8.1.  So it's pretty much exactly the same thing.

Mechanically, that could look like editing the back branch's makefile
to say

SO_MINOR_VERSION=$(MAJORVERSION).1

This would not need to propagate into any other branch (unless we were
making similar changes for similar reasons in other back branches, of
course).

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Slowness of extended protocol

2016-08-16 Thread Vladimir Sitnikov
Shay> your analogy breaks down. Of course L2 was invented to improve
performance,
Shay> but that doesn't mean that all caches are the same. More precisely, what I
Shay> find objectionable about your approach is not any caching - it's the
Shay> implicit or automatic preparation of statements. This practice isn't
Shay> invisible in that a) it may cause errors that wouldn't have been there
Shay> otherwise (e.g. because of DDL),

Long-lived named server-prepared statements cause problems even if
server-prepared statements are created manually by developers.

Could you please stop saying "automatic preparation causes ~DDL issues"?

Those errors are not inherent to "automatic preparation of statements"
Those are just database defects that need to be cured.

Automatic savepointing is just a workaround for current DB limitation, and
it provides users with a simplified migration path.


Please, don't try to tell me that "IDbCommand.Prepare()" documentation says
that "prepared statement might fail for no reason just because it is prepared".


Shay> As I said above, I think this is a critical point of misunderstand between
Shay> us. The developers tells the driver which statements should be
Shay> server-prepared by calling .prepareStatement(). I'm guessing you have a
Shay> totally different understanding here.

Please, quote the document you got that "developers tell the driver which
statements should be server-prepared by calling ..." from. It never
works like that.
Neither in Java, nor in C#. I would admit I've no C# experience, but I did
find documentation on IDbCommand.Prepare() and examined it.

The proper way to say is "by calling .Prepare() developer passes the
intention that
he might be using the same query multiple times".
That is it. It never means "driver must absolutely use server-prepare
in the response
to .Prepare() call".

The same goes for Java's PreparedStatement.
It never means "the driver must use server-prepared features".

As Microsoft lists in the .Prepare() documentation, modern versions of
MSSQL just ignore .Prepare() and cache statements automatically.

It is not a developer's business which statements should be in the
database cache.
Neither developer should care which statements reside in the driver cache.


Shay> What exactly does "server-prepare on each execution" means? Sending Parse
Shay> on each execution? How can that be considered prepared at all?

Remember, java.sql.PreparedStatement interface is NOT bound to PostgreSQL in any
manner. It is a generic database API.
Thus the word "prepared" does not mean anything specific there.
It gives no promise whether the statement will use "parseOnce,
execMany" PostgreSQL's
feature or not.

A compliant implementation (that is a driver) could just assemble full SQL
by concatenating the parameters on each execution and send it via 'Q' simple
execute message.


Shay> Does pgjdbc consider
Shay> something "prepared" without it being the 2nd option above? Note that I'm
Shay> genuinely interested in case I'm missing something.

Currently the first 5 executions of PreparedStatement use unnamed
statements (Parse/Bind/Exec).
Then pgjdbc assigns a name and uses just Bind/Exec.

So if a particular SQL is rare, then it would not get its own
server-prepared name
even though it is "java.sql.PreparedStatement".

What pgjdbc does is it picks the most used queries and enables them to be cached
at the database level.


Vladimir>> app/component and assign variables to CPU registers.
Vladimir>> This is exactly "programmer knowledge" which the compiler
doesn't have.
Vladimir>> Does it sound good to you?

Shay> Of course not. But I don't think it's a very valid analogy.

The analogy was not supposed to play in a way
you twisted it with ORM=Java, driver=C, etc.

Here's more detailed explanation:

1) You claim that programmers should manually examine all the SQL statements,
and put ".prepare()" call if and only if the specific SQL should be
server-prepared.

2) My analogy: programmers should manually examine all the variables
(think of C#
variables, or Java variables, or C variables, it does not matter),
and assign which variables should use CPU registers, and which ones should
go into the memory.

Named server-prepared statements == CPU registers
SQL statements in the code == variables in the code (e.g. C# variables)

That is very valid analogy. What you say is "programmer has full visibility over
the meaning of the code, thus it knows better which statements should be
server-prepared and which should not".

Well, register allocation is a bit harder problem that "statement name
allocation",
but the essence is the same: there's limited number of registers/named
statements,
so someone (or something) should decide which statements deserve a name.

Just in case: you definitely know what CPU registers are and what is
"register allocation" problem, don't you?

You seem to pick up that "application developer != compiler engineer", however
then you slipped into "nevertheless 

Re: [HACKERS] patch proposal

2016-08-16 Thread Stephen Frost
Greetings,

* Venkata B Nagothi (nag1...@gmail.com) wrote:
> The above said parameters can be configured to pause, shutdown or prevent
> promotion only after reaching the recovery target point.
> To clarify, I am referring to a scenario where recovery target point is not
> reached at all ( i mean, half-complete or in-complete recovery) and there
> are lots of WALs still pending to be replayed - in this situation,

PG doesn't know that there are still WALs to be replayed.

> PostgreSQL just completes the archive recovery until the end of the last
> available WAL (WAL file "0001001E" in my case) and
> starts-up the cluster by generating an error message (saying
> "0001001F" not found).

That's not a PG error, that's an error from cp.  From PG's perspective,
your restore command has said that all of the WAL has been replayed.

If that's not what you want then change your restore command to return
an exit code > 125, which tells PG that it's unable to restore that WAL
segment.

> It would be nice if PostgreSQL pauses the recovery in-case its not complete
> (because of missing or corrupt WAL), shutdown the cluster and allows the
> DBA to restart the replay of the remaining WAL Archive files to continue
> recovery (from where it stopped previously) until the recovery target point
> is reached.

Reaching the end of WAL isn't an error and I don't believe it makes any
sense to treat it like it is.  You can specify any recovery target point
you wish, including ones that don't exist, and that's not an error
either.

I could see supporting an additional "pause" option that means "pause at
the end of WAL if you don't reach the recovery target point".  I'd also
be happy with a warning being emitted in the log if the recovery target
point isn't reached before reaching the end of WAL, but I don't think it
makes sense to change the existing behavior.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Declarative partitioning - another take

2016-08-16 Thread Robert Eckhardt
On Tue, Aug 16, 2016 at 2:30 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

>
>
>
>
>> I think it makes sense to keep calling it a table because it has all the
>> logical properties of a table even though it will differ from a regular
>> table on the basis of physical implementation details such as that it does
>> not own physical storage.  Am I missing something?
>>
>> >
>> > +  partexprs
>> >
>> > There's a certain symmetry between this and what we do for indexes,
>> > but I'm wondering whether there's a use case for partitioning a table
>> > by an expression rather than a column value.  I suppose if you've
>> > already done the work, there's no harm in supporting it.
>>
>> Yeah, it's not a whole lot of code to manage expressions alongside simple
>> column references.
>>
>
> Users who would like to partition their tables by "age" will partition
> those by the month or year extracted out of a date column e.g. order_date.
> They will find it convenient to use an expression (extract(month from
> date)) as a partition key, instead of storing month or year as a separate
> column.
>

In GPDB we have partitioning. It is almost always by date and then often
the partitions are for different sizes, i.e. by day for 30 days then by
month for 3 years then by year. What we also support, but isn't super
performant, is sub-partitioning.

This is where some on the newer indexing strategies is interesting to me. I
see them as synergistic not redundant.


>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>


Re: [HACKERS] [PATCH] bigint txids vs 'xid' type, new txid_recent(bigint) => xid

2016-08-16 Thread Craig Ringer
On 16 August 2016 at 20:58, Greg Stark  wrote:

> On Tue, Aug 16, 2016 at 10:15 AM, Craig Ringer 
> wrote:
> > I'm surprised the 32-bit xid was ever exposed to the user, rather than a
> > 64-bit epoch-extended xid.
>
> Once upon a time we didn't have epoch counting at all.
>

Makes sense. I didn't dig back too far in history.

Sounds like you're in favour of the 2nd part of the proposal (not covered
by the current patch) then.

I haven't yet done the validation required on the epoch logic btw, and I
won't be too surprised if it's a bit off. I'm writing a fast xid burn
function for use in testing now. I doubt it'll be fast enough to use in
routine regression testing since all those clog pages will still take time.
But we'll see.  I'd kind of like to be able to avoid all that - advance the
xid counter and treat all the old xids as frozen. I don't know or if this
is practical within a normal backend though.

Anyway, will follow up with more tests and - probably - a bugfix or three
soon.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


  1   2   >