Re: [HACKERS] [Patch] RBTree iteration interface improvement

2016-09-01 Thread Heikki Linnakangas

On 08/26/2016 04:07 PM, Aleksander Alekseev wrote:

Another unrelated change in this patch is the addition of
rb_rightmost(). It's not used for anything, so I'm not sure what the
point is. Then again, there don't seem to be any callers of
rb_leftmost() either.


It's just something I needed in tests to reach a good percentage of code
coverage. The implementation of rb_rightmost is trivial, so we can probably
do without it.


Looking closer, we don't currently use any of the iterators besides the 
left-right iterator either. Nor rb_delete().



I think we should do something like the attached patch. It seems to pass
your test suite, but I haven't done any other testing on it. Does it look
OK to you?


Looks good to me.


Ok, committed.

- Heikki



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning - another take

2016-09-01 Thread Ashutosh Bapat
Here's something I observed with your set of patches posted in June. I have
not checked the latest set of patches, so if this has already been fixed,
please ignore this mail, and sorry for being lazy.

prt1 is a partitioned table, and it shows the following information with \d+:

regression=# \d+ prt1
                          Partitioned table "public.prt1"
 Column |       Type        | Modifiers | Storage  | Stats target | Description
--------+-------------------+-----------+----------+--------------+-------------
 a      | integer           |           | plain    |              |
 b      | integer           |           | plain    |              |
 c      | character varying |           | extended |              |
Partition Key: PARTITION BY RANGE (a)
Indexes:
"iprt1_a" btree (a)

Shouldn't we show all the partitions of this table, and maybe their ranges
or lists?

I found the partitions from the EXPLAIN plan:

regression=# explain verbose select * from prt1;
                              QUERY PLAN
------------------------------------------------------------------------
 Append  (cost=0.00..6.00 rows=301 width=13)
   ->  Seq Scan on public.prt1  (cost=0.00..0.00 rows=1 width=40)
 Output: prt1.a, prt1.b, prt1.c
   ->  Seq Scan on public.prt1_p1  (cost=0.00..2.25 rows=125 width=13)
 Output: prt1_p1.a, prt1_p1.b, prt1_p1.c
   ->  Seq Scan on public.prt1_p3  (cost=0.00..1.50 rows=50 width=13)
 Output: prt1_p3.a, prt1_p3.b, prt1_p3.c
   ->  Seq Scan on public.prt1_p2  (cost=0.00..2.25 rows=125 width=13)
 Output: prt1_p2.a, prt1_p2.b, prt1_p2.c
(9 rows)

Then I did \d+ on each of those to find their ranges:

regression=# \d+ prt1_p1
                               Table "public.prt1_p1"
 Column |       Type        | Modifiers | Storage  | Stats target | Description
--------+-------------------+-----------+----------+--------------+-------------
 a      | integer           |           | plain    |              |
 b      | integer           |           | plain    |              |
 c      | character varying |           | extended |              |
Partition Of: prt1 FOR VALUES START (0) END (250)
Indexes:
"iprt1_p1_a" btree (a)

regression=# \d+ prt1_p2
                               Table "public.prt1_p2"
 Column |       Type        | Modifiers | Storage  | Stats target | Description
--------+-------------------+-----------+----------+--------------+-------------
 a      | integer           |           | plain    |              |
 b      | integer           |           | plain    |              |
 c      | character varying |           | extended |              |
Partition Of: prt1 FOR VALUES START (250) END (500)
Indexes:
"iprt1_p2_a" btree (a)

regression=# \d+ prt1_p3
                               Table "public.prt1_p3"
 Column |       Type        | Modifiers | Storage  | Stats target | Description
--------+-------------------+-----------+----------+--------------+-------------
 a      | integer           |           | plain    |              |
 b      | integer           |           | plain    |              |
 c      | character varying |           | extended |              |
Partition Of: prt1 FOR VALUES START (500) END (600)
Indexes:
"iprt1_p3_a" btree (a)

As you can observe, the table prt1 cannot have any row with a < 0 or
a > 600. But when I execute

regression=# explain verbose select * from prt1 where a > 100;
                         QUERY PLAN
--------------------------------------------------------------
 Append  (cost=0.00..0.00 rows=1 width=40)
   ->  Seq Scan on public.prt1  (cost=0.00..0.00 rows=1 width=40)
 Output: prt1.a, prt1.b, prt1.c
 Filter: (prt1.a > 100)
(4 rows)

it correctly excluded all the partitions, but did not exclude the parent
relation. I guess we have enough information to exclude it. Probably we
should add a check constraint on the parent which is the OR of the check
constraints on all the partitions. So there are two problems here:

1. \d+ doesn't show the partitions - this was probably reported earlier; I
don't remember.
2. A combination of the constraints on the partitions should be applicable
to the parent. We aren't doing that.
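To illustrate point 2, here is a sketch (my own hypothetical DDL, not part of the patch set) of the kind of parent-level constraint that would be equivalent to the OR of the partition constraints, given the bounds shown above:

```sql
-- Hypothetical: a check constraint on the parent equivalent to the OR of
-- the partitions' implicit range constraints:
--   (0 <= a < 250) OR (250 <= a < 500) OR (500 <= a < 600)
-- which simplifies to 0 <= a < 600.  With such a constraint in place,
-- constraint exclusion could drop the parent relation for predicates that
-- fall entirely outside that range, e.g. WHERE a >= 600.
ALTER TABLE prt1 ADD CONSTRAINT prt1_bounds
    CHECK (a >= 0 AND a < 600);
```

Whether such a constraint could be derived and maintained automatically is exactly what would need to be worked out.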


On Wed, Aug 10, 2016 at 4:39 PM, Amit Langote  wrote:

> Hi,
>
> Attached is the latest set of patches to implement declarative
> partitioning.  There is already a commitfest entry for the same:
> https://commitfest.postgresql.org/10/611/
>
> The old discussion is here:
> http://www.postgresql.org/message-id/55d3093c.5010...@lab.ntt.co.jp/
>
> Attached patches are described below:
>
> 0001-Catalog-and-DDL-for-partition-key.patch
> 0002-psql-and-pg_dump-support-for-partitioned-tables.patch
>
> These patches create the infrastructure and DDL for partitioned
> tables.
>
> In addition to a catalog for storing the partition key information, this
> adds a new relkind to pg_class.h. PARTITION BY clause is added to CREATE
> TABLE. Tables so created are RELKIND_PARTITIONED_REL 

Re: [HACKERS] Hash Indexes

2016-09-01 Thread Amit Kapila
On Thu, Sep 1, 2016 at 11:33 PM, Jesper Pedersen
 wrote:
> On 08/05/2016 07:36 AM, Amit Kapila wrote:

>
> Needs a rebase.
>

Done.

>
> +   if (blkno == P_NEW)
> +   elog(ERROR, "hash AM does not use P_NEW");
>
> Left over ?
>

No.  We need this check, similar to all the other _hash_*buf APIs, as we
never expect callers of those APIs to pass P_NEW.  The new buckets
(blocks) are created during a split, which uses a different mechanism to
allocate blocks in bulk.

I have fixed all the other issues you raised.  The updated patch is
attached to this mail.

>
> Ran some tests on a CHAR() based column which showed good results. Will have
> to compare with a run with the WAL patch applied.
>

Okay, thanks for testing.  I think the WAL patch is still not ready for
performance testing; I am fixing a few issues in that patch, but you can
do a design- or code-level review of it at this stage.  I think it is fine
even if you share the performance numbers with this and/or Mithun's
patch [1].


[1] - https://commitfest.postgresql.org/10/715/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


concurrent_hash_index_v5.patch
Description: Binary data



Re: [HACKERS] less expensive pg_buffercache on big shmem

2016-09-01 Thread Peter Geoghegan
On Thu, Sep 1, 2016 at 8:19 PM, Andres Freund  wrote:
> On 2016-09-02 08:31:42 +0530, Robert Haas wrote:
>> I wonder whether we ought to just switch from the consistent method to
>> the semiconsistent method and call it good.
>
> +1. I think, before long, we're going to have to switch away from having
> locks & partitions in the first place. So I don't see a problem relaxing
> this. It's not like that consistency really buys you anything...  I'd
> even consider not using any locks.

Right. ISTM that the consistency guarantee was added on the off chance
that it mattered, without actually being justified. I would like to be
able to run pg_buffercache in production from time to time.


-- 
Peter Geoghegan




Re: [HACKERS] less expensive pg_buffercache on big shmem

2016-09-01 Thread Andres Freund
On 2016-09-02 08:31:42 +0530, Robert Haas wrote:
> I wonder whether we ought to just switch from the consistent method to
> the semiconsistent method and call it good.

+1. I think, before long, we're going to have to switch away from having
locks & partitions in the first place. So I don't see a problem relaxing
this. It's not like that consistency really buys you anything...  I'd
even consider not using any locks.

Andres




Re: [HACKERS] less expensive pg_buffercache on big shmem

2016-09-01 Thread Robert Haas
On Wed, Aug 31, 2016 at 8:27 PM, Ivan Kartyshov
 wrote:
> Recently I have finished my work on a patch for pg_buffercache contrib, I
> think it's time to share my results.

Thanks for sharing your results.

> The v1.2 implementation contains a flexible loop which can collect shared
> memory statistics using three different methods:
> 1) holding an LWLock on only one partition of shared memory at a time
> (semiconsistent method),
> 2) without LWLocks (nonconsistent method),
> 3) or in the vanilla way (consistent method).
>
> The aforementioned allows us to launch pg_buffercache in three different
> ways.

I am a little skeptical about the idea of offering the user three
different choices about how to do this.  That seems like it is
exposing what ought to be an internal implementation detail and I'm
not sure users will really know how to make an intelligent choice
between the three options.  But of course others may have a different
view on that.

I wonder whether we ought to just switch from the consistent method to
the semiconsistent method and call it good.  I agree with you that
taking every buffer partition lock simultaneously seems like too much
locking.  And in the future if we replace the buffer mapping hash with
something that is lock-free or close to it, then we wouldn't even have
buffer partition locks any more, and probably no way at all to get an
entirely consistent snapshot.

What do you think of this?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pgbench - allow to store select results into variables

2016-09-01 Thread Amit Langote

Hi Fabien,

On 2016/07/16 1:33, Fabien COELHO wrote:
> Here is a v2 with more or less this approach, although \into does not end
> the query, but applies to the current or last sql command. A query is
> still terminated with a ";".

This patch needs to be rebased because of commit 64710452 (on 2016-08-19).

Thanks,
Amit






Re: [HACKERS] Confusing docs about GetForeignUpperPaths in fdwhandler.sgml

2016-09-01 Thread Robert Haas
On Mon, Aug 1, 2016 at 5:44 PM, Etsuro Fujita
 wrote:
> I noticed that the following note about direct modification via
> GetForeignUpperPaths in fdwhandler.sgml is a bit confusing.  We have another
> approach using PlanDirectModify, so that should be reflected in the note as
> well.  Please find attached a patch.
>
>  PlanForeignModify and the other callbacks described in
>   are designed around the
> assumption
>  that the foreign relation will be scanned in the usual way and then
>  individual row updates will be driven by a local
> ModifyTable
>  plan node.  This approach is necessary for the general case where an
>  update requires reading local tables as well as foreign tables.
>  However, if the operation could be executed entirely by the foreign
>  server, the FDW could generate a path representing that and insert it
>  into the UPPERREL_FINAL upper relation, where it would
>  compete against the ModifyTable approach.

I suppose this is factually correct, but I don't think it's very
illuminating.  I think that if we're going to document both
approaches, there should be some discussion of the pros and cons of
PlanDirectModify vs. PlanForeignModify.   Of course either should be
better than an iterative ModifyTable, but how should the FDW author
decide between the two of them?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] COPY vs \copy HINT

2016-09-01 Thread Craig Ringer
On 2 September 2016 at 04:28, Tom Lane  wrote:
> Craig Ringer  writes:
>> On 12 August 2016 at 16:34, Christoph Berg  wrote:
>>> Also, I vaguely get what you wanted to say with "a driver ...
>>> wrapper", but it's pretty nonsensical if one doesn't know about the
>>> protocol details. I don't have a better suggestion now, but I think it
>>> needs rephrasing.
>
>> I don't like it either, but didn't come up with anything better. The
>> problem is that every driver calls it something different.
>
> A few thoughts on this patch:

Thanks for the review. I'll leave the first change pending your comments;
the others are made, though I'm not completely sure we should restrict it
to ENOENT.

> 1. I don't really think the HINT is appropriate for the not-absolute-path
> case.

Why? If the user runs

# COPY sometable FROM 'localfile.csv' WITH (FORMAT CSV);
ERROR:  could not open file "localfile.csv" for reading: No such file
or directory

they're going to be just as confused by this error as by:

# COPY batch_demo FROM '/tmp/localfile.csv' WITH (FORMAT CSV);
ERROR:  could not open file "/tmp/localfile.csv" for reading: No such
file or directory

so I don't really see the rationale for this change.

> 2. I don't think it's appropriate for all possible cases of AllocateFile
> failure either, eg surely not for EACCES or similar situations where we
> did find a file.  Maybe print it only for ENOENT?  (See for example
> report_newlocale_failure() for technique.)

I thought about that but figured it didn't really matter too much,
when thinking about examples like

# COPY batch_demo FROM '/root/secret.csv' WITH (FORMAT CSV);
ERROR:  could not open file "/root/secret.csv" for reading: Permission denied

or whatever, where the user doesn't understand why they can't read the
file given that their local client has permission to do so.

I don't feel strongly about this and think that the error on ENOENT is
by far the most important, so I'll adjust it per your recommendation.

> 3. As for the wording, maybe you could do it like this:
>
> HINT:  COPY copies to[from] a file on the PostgreSQL server, not on the
> client.  You may want a client-side facility such as psql's \copy.
>
> That avoids trying to invent a name for other implementations.

I like that wording a lot more, thanks. Adopted.
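For reference, the confusion the HINT is aimed at looks like this in practice (a sketch of a stock psql session; the file exists only on the client machine):

```sql
-- COPY runs on the server: the backend tries to open the path on the
-- *server's* filesystem, so a client-side path fails.
COPY batch_demo FROM '/tmp/localfile.csv' WITH (FORMAT CSV);
-- ERROR:  could not open file "/tmp/localfile.csv" for reading: ...

-- \copy is handled by psql: it reads the file locally and streams it to
-- the server via the COPY ... FROM STDIN protocol, so a client-side path
-- works.
\copy batch_demo FROM '/tmp/localfile.csv' WITH (FORMAT CSV)
```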

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
From aa4f1fc810c8617d78dec75adad9951faf929909 Mon Sep 17 00:00:00 2001
From: Craig Ringer 
Date: Fri, 12 Aug 2016 15:42:12 +0800
Subject: [PATCH] Emit a HINT when COPY can't find a file

Users often get confused between COPY and \copy and try to use client-side
paths with COPY. The server of course cannot find the file.

Emit a HINT in the most common cases to help users out.

Craig Ringer, review by Tom Lane and Christoph Berg
---
 src/backend/commands/copy.c | 22 +++---
 1 file changed, 19 insertions(+), 3 deletions(-)

diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index f45b330..fe44bd9 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -1753,6 +1753,7 @@ BeginCopyTo(Relation rel,
 		{
 			mode_t		oumask; /* Pre-existing umask value */
 			struct stat st;
+			int			save_errno;
 
 			/*
 			 * Prevent write to relative path ... too easy to shoot oneself in
@@ -1761,16 +1762,23 @@ BeginCopyTo(Relation rel,
 			if (!is_absolute_path(filename))
 ereport(ERROR,
 		(errcode(ERRCODE_INVALID_NAME),
-	  errmsg("relative path not allowed for COPY to file")));
+		 errmsg("relative path not allowed for COPY to file"),
+		 errhint("COPY copies to a file on the PostgreSQL server, not on the client. "
+ "You may want a client-side facility such as psql's \\copy.")));
 
 			oumask = umask(S_IWGRP | S_IWOTH);
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_W);
+			save_errno = errno;
 			umask(oumask);
+
 			if (cstate->copy_file == NULL)
 ereport(ERROR,
 		(errcode_for_file_access(),
 		 errmsg("could not open file \"%s\" for writing: %m",
-cstate->filename)));
+cstate->filename),
+		 (save_errno == ENOENT ?
+		  errhint("COPY copies to a file on the PostgreSQL server, not on the client. "
+  "You may want a client-side facility such as psql's \\copy.") : 0)));
 
 			if (fstat(fileno(cstate->copy_file), &st))
 ereport(ERROR,
@@ -2790,13 +2798,21 @@ BeginCopyFrom(Relation rel,
 		else
 		{
 			struct stat st;
+			int			save_errno;
 
 			cstate->copy_file = AllocateFile(cstate->filename, PG_BINARY_R);
+
+			/* in case something in ereport changes errno: */
+			save_errno = errno;
+
 			if (cstate->copy_file == NULL)
 ereport(ERROR,
 		(errcode_for_file_access(),
 		 errmsg("could not open file \"%s\" for reading: %m",
-cstate->filename)));
+cstate->filename),
+		 

Re: [HACKERS] Exclude schema during pg_restore

2016-09-01 Thread Peter Eisentraut
On 8/31/16 4:10 AM, Michael Banck wrote:
> attached is a small patch that adds an -N option to pg_restore, in order
> to exclude a schema, in addition to -n for the restriction to a schema.

I think this is a good idea, and the approach looks sound.  However,
something doesn't work right.  If I take an empty database and dump it,
it will dump the plpgsql extension.  If I run pg_dump in plain-text mode
with -N, then the plpgsql extension is also dumped (since it is not in
the excluded schema).  But if I use the new pg_restore -N option, the
plpgsql extension is not dumped.  Maybe this is because it doesn't have
a schema, but I haven't checked.

pg_dump does not apply --strict-names to -N, but your patch for
pg_restore does that.  I think that should be made the same as pg_dump.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] restoration after crash slowness, any way to improve?

2016-09-01 Thread Jeff Janes
On Wed, Aug 31, 2016 at 6:26 PM, Joshua D. Drake 
wrote:

> -hackers,
>
> So this is more of a spit balling thread than anything. As I understand
> it, if we have a long running transaction or a large number of wal logs and
> we crash, we then have to restore those logs on restart to the last known
> good transaction. No problem.
>

It only has to replay from the start of the last successful checkpoint.  It
doesn't matter whether there was a long-running transaction or not.  If a
long transaction spans many checkpoints, replay still only has to go back
to the start of the last successful checkpoint.  Maybe you just had
checkpoint_segments or max_wal_size set way too high, assuming
checkpoint_timeout would always kick in instead and be the limiting factor.
But then your long-running transaction invalidated that assumption?


> I recently ran a very long transaction. I was building up a large number
> of rows into a two column table to test index performance. I ended up
> having to kill the connection and thus the transaction after I realized I
> had an extra zero in my generate_series(). (Side note: Amazing the
> difference a single zero can make ;)). When coming back up, I watched the
> machine and I was averaging anywhere from 60MBs to 97MBs on writes.


Was it IO limited?

Killing a session/connection/transaction should not take down the entire
server, so there should be no recovery taking place in the first place.
Are you sure you are seeing recovery, and not just the vacuuming of the
aborted tuples?



> However, since I know this machine can get well over 400MBs when using
> multiple connections I can't help but wonder if there is anything we can do
> to make restoration more efficient without sacrificing the purpose of what
> it is doing?
>
> Can we have multiple readers pull transaction logs into shared_buffers (on
> recovery only), sort the good transactions and then push them back to the
> walwriter or bgwriter to the pages?
>

I don't see how that could work.  Whether a page is consistent or not is
orthogonal to whether the transactions on that page have committed or
aborted.

There are two possibilities that I've considered though for long-running
PITR, which could also apply to crash recovery, and which I think have been
discussed here before.  One is to have a leading recovery process which
would identify pages which will be recovered from a FPI, and send word back
to the lagging process not to bother applying incremental WAL to those
pages.  The other would be for a leading process to asynchronously read
into memory (either FS cache or shared_buffers) pages which it sees the
lagging process will need to write to.

In the first case, you would want the leading process to be leading by a
lot, so that it has the broadest scope to detect FPI.  Basically you would
want it to read all the way to the end of the replay, provided it had
enough memory to store the list of FPI pages.  For the second one, you
would not want it to run so far ahead that it the pages it read in would
get pushed out again before the lagging process got to them.  Controlling
how far ahead that would be seems like it would be hard.

Cheers,

Jeff


Re: [HACKERS] [PATCH] OpenSSL 1.1.0 support

2016-09-01 Thread Tom Lane
Kurt Roeckx  writes:
> Attached is a patch to make it build with OpenSSL 1.1.0.

Hi Kurt,

There's already been some work on this topic:
https://www.postgresql.org/message-id/flat/20160627151604.gd1...@msg.df7cb.de

Maybe you should join forces with Andreas to get it finished.

regards, tom lane




Re: [HACKERS] Missing checks when malloc returns NULL...

2016-09-01 Thread Michael Paquier
On Thu, Sep 1, 2016 at 11:16 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> On Thu, Sep 1, 2016 at 10:03 PM, Tom Lane  wrote:
>>> Don't see the point really ... it's just more API churn without any
>>> very compelling reason.
>
>> OK, then no objection to your approach. At least I tried.
>
> OK, pushed my version then.  I think we have now dealt with everything
> mentioned in this thread except for the issues in pg_locale.c.  Are
> you planning to tackle that?

Yes, not for this CF though. So I have switched the CF entry as committed.
-- 
Michael




Re: [HACKERS] CommitFest 2016-09

2016-09-01 Thread Fabrízio de Royes Mello
On Thu, Sep 1, 2016 at 5:11 PM, Fabrízio de Royes Mello <
fabriziome...@gmail.com> wrote:
>
> Please check the "needs reviewer" list
> (https://commitfest.postgresql.org/9/?reviewer=-2) for patches to
> review.  The committers need our help to work.
>

Just a correction to the link sent in my previous e-mail:

https://commitfest.postgresql.org/10/?reviewer=-2

Regards,

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


[HACKERS] [PATCH] OpenSSL 1.1.0 support

2016-09-01 Thread Kurt Roeckx
Hi,

Attached is a patch to make it build with OpenSSL 1.1.0.

There is probably a minor problem on Windows, where the names of the DLLs
have changed.  Someone should probably look into that.


Kurt

>From efd7aa3499b2b4eedd4c4d4164b75175f3c10d2f Mon Sep 17 00:00:00 2001
From: Kurt Roeckx 
Date: Thu, 1 Sep 2016 23:24:07 +0200
Subject: [PATCH] Add support for OpenSSL 1.1.0

---
 configure| 68 +++-
 configure.in |  4 +-
 src/backend/libpq/be-secure-openssl.c| 49 ++-
 src/interfaces/libpq/fe-secure-openssl.c | 54 -
 4 files changed, 127 insertions(+), 48 deletions(-)

diff --git a/configure b/configure
index 45c8eef..930da6e 100755
--- a/configure
+++ b/configure
@@ -774,6 +774,7 @@ infodir
 docdir
 oldincludedir
 includedir
+runstatedir
 localstatedir
 sharedstatedir
 sysconfdir
@@ -896,6 +897,7 @@ datadir='${datarootdir}'
 sysconfdir='${prefix}/etc'
 sharedstatedir='${prefix}/com'
 localstatedir='${prefix}/var'
+runstatedir='${localstatedir}/run'
 includedir='${prefix}/include'
 oldincludedir='/usr/include'
 docdir='${datarootdir}/doc/${PACKAGE_TARNAME}'
@@ -1148,6 +1150,15 @@ do
   | -silent | --silent | --silen | --sile | --sil)
 silent=yes ;;
 
+  -runstatedir | --runstatedir | --runstatedi | --runstated \
+  | --runstate | --runstat | --runsta | --runst | --runs \
+  | --run | --ru | --r)
+ac_prev=runstatedir ;;
+  -runstatedir=* | --runstatedir=* | --runstatedi=* | --runstated=* \
+  | --runstate=* | --runstat=* | --runsta=* | --runst=* | --runs=* \
+  | --run=* | --ru=* | --r=*)
+runstatedir=$ac_optarg ;;
+
   -sbindir | --sbindir | --sbindi | --sbind | --sbin | --sbi | --sb)
 ac_prev=sbindir ;;
   -sbindir=* | --sbindir=* | --sbindi=* | --sbind=* | --sbin=* \
@@ -1285,7 +1296,7 @@ fi
 for ac_var in	exec_prefix prefix bindir sbindir libexecdir datarootdir \
 		datadir sysconfdir sharedstatedir localstatedir includedir \
 		oldincludedir docdir infodir htmldir dvidir pdfdir psdir \
-		libdir localedir mandir
+		libdir localedir mandir runstatedir
 do
   eval ac_val=\$$ac_var
   # Remove trailing slashes.
@@ -1438,6 +1449,7 @@ Fine tuning of the installation directories:
   --sysconfdir=DIRread-only single-machine data [PREFIX/etc]
   --sharedstatedir=DIRmodifiable architecture-independent data [PREFIX/com]
   --localstatedir=DIR modifiable single-machine data [PREFIX/var]
+  --runstatedir=DIR   modifiable per-process data [LOCALSTATEDIR/run]
   --libdir=DIRobject code libraries [EPREFIX/lib]
   --includedir=DIRC header files [PREFIX/include]
   --oldincludedir=DIR C header files for non-gcc [/usr/include]
@@ -9538,9 +9550,9 @@ else
   as_fn_error $? "library 'crypto' is required for OpenSSL" "$LINENO" 5
 fi
 
- { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_library_init in -lssl" >&5
-$as_echo_n "checking for SSL_library_init in -lssl... " >&6; }
-if ${ac_cv_lib_ssl_SSL_library_init+:} false; then :
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for SSL_CTX_new in -lssl" >&5
+$as_echo_n "checking for SSL_CTX_new in -lssl... " >&6; }
+if ${ac_cv_lib_ssl_SSL_CTX_new+:} false; then :
   $as_echo_n "(cached) " >&6
 else
   ac_check_lib_save_LIBS=$LIBS
@@ -9554,27 +9566,27 @@ cat confdefs.h - <<_ACEOF >conftest.$ac_ext
 #ifdef __cplusplus
 extern "C"
 #endif
-char SSL_library_init ();
+char SSL_CTX_new ();
 int
 main ()
 {
-return SSL_library_init ();
+return SSL_CTX_new ();
   ;
   return 0;
 }
 _ACEOF
 if ac_fn_c_try_link "$LINENO"; then :
-  ac_cv_lib_ssl_SSL_library_init=yes
+  ac_cv_lib_ssl_SSL_CTX_new=yes
 else
-  ac_cv_lib_ssl_SSL_library_init=no
+  ac_cv_lib_ssl_SSL_CTX_new=no
 fi
 rm -f core conftest.err conftest.$ac_objext \
 conftest$ac_exeext conftest.$ac_ext
 LIBS=$ac_check_lib_save_LIBS
 fi
-{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_ssl_SSL_library_init" >&5
-$as_echo "$ac_cv_lib_ssl_SSL_library_init" >&6; }
-if test "x$ac_cv_lib_ssl_SSL_library_init" = xyes; then :
+{ $as_echo "$as_me:${as_lineno-$LINENO}: result: $ac_cv_lib_ssl_SSL_CTX_new" >&5
+$as_echo "$ac_cv_lib_ssl_SSL_CTX_new" >&6; }
+if test "x$ac_cv_lib_ssl_SSL_CTX_new" = xyes; then :
   cat >>confdefs.h <<_ACEOF
 #define HAVE_LIBSSL 1
 _ACEOF
@@ -9644,9 +9656,9 @@ else
   as_fn_error $? "library 'eay32' or 'crypto' is required for OpenSSL" "$LINENO" 5
 fi
 
- { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing SSL_library_init" >&5
-$as_echo_n "checking for library containing SSL_library_init... " >&6; }
-if ${ac_cv_search_SSL_library_init+:} false; then :
+ { $as_echo "$as_me:${as_lineno-$LINENO}: checking for library containing SSL_CTX_new" >&5
+$as_echo_n "checking for library containing SSL_CTX_new... " >&6; }
+if ${ac_cv_search_SSL_CTX_new+:} false; then :
   $as_echo_n "(cached) " >&6
 else
   ac_func_search_save_LIBS=$LIBS
@@ -9659,11 +9671,11 @@ 

Re: [HACKERS] Improve BEGIN tab completion

2016-09-01 Thread Kevin Grittner
On Wed, May 18, 2016 at 8:59 AM, Andreas Karlsson  wrote:

> I noticed that the tab completion was not aware of that TRANSACTION/WORK is
> optional in BEGIN, and that we do not complete [NOT] DEFERRABLE.
>
> While fixing it I also improved the completion support for SET SESSION
> slightly.

Pushed, with an adjustment to handle SET TRANSACTION SNAPSHOT, too.

Thanks!

-- 
Kevin Grittner
EDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] incomplete removal of not referenced CTEs

2016-09-01 Thread Tomas Vondra


On 09/01/2016 09:58 PM, Andres Freund wrote:
> On 2016-09-01 15:46:45 -0400, Tom Lane wrote:
>> Tomas Vondra  writes:
>>> While investigating a CTE-related query, I've noticed that we don't
>>> really remove all unreachable CTEs.
>>
>> We expend a grand total of three lines of code on making that happen.
>> I'm pretty much -1 on adding a great deal more code or complexity
>> to make it happen recursively;
> 
> Agreed. And the consequences are pretty much harmless.
> 

I'm not sure I agree with that. The trouble is that we have customers,
and they don't always write "reasonably well written queries",
particularly when those queries are a bit more complex. And I have to
debug issues with those queries from time to time.

I'd actually be much happier if we hadn't removed any CTEs at all,
instead of removing just some of them.

This incomplete CTE removal actually confused the hell out of me today,
which is why I started this thread. I only realized some of the CTEs were
useless after the EXPLAIN ANALYZE completed (as this was a performance
issue, it took a very long time).

> 
>> the case simply doesn't arise in reasonably well written queries.
> 
> Well, it might, when the CTE reference can be removed due to some other
> part of the query (e.g. plan time evaluation of immutable function).
> 

We also never remove CTEs with INSERT/UPDATE/DELETE commands, which
makes it a bit more complicated (but those are easy to spot).
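A minimal sketch of the cases being discussed (my own reconstruction, not taken from the original report; table t is hypothetical):

```sql
-- Removed: a top-level CTE that is never referenced.
WITH unused AS (SELECT 1) SELECT 2;

-- Not removed: "inner_unused" is never referenced either, but it is
-- nested inside a referenced CTE, and the removal is not applied
-- recursively.
WITH outer_cte AS (
    WITH inner_unused AS (SELECT 1) SELECT 2
) SELECT * FROM outer_cte;

-- Never removed: a CTE containing a data-modifying statement, since it
-- may have side effects.
WITH w AS (INSERT INTO t VALUES (1)) SELECT 3;
```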

FWIW, I'm not planning to hack on this, but I was thinking that this
might be a nice topic for a first patch for new PostgreSQL hackers
(something like EasyHacks from LibreOffice).

regards


-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] LOCK TABLE .. DEFERRABLE

2016-09-01 Thread Simon Riggs
On 5 April 2016 at 18:34, Rod Taylor  wrote:
>
>
> On Tue, Apr 5, 2016 at 1:10 PM, Simon Riggs  wrote:
>>>
>>> If a lock is successfully obtained on one table, but not on all tables,
>>> it releases that lock and will retry to get them as a group in the future.
>>> Since inheritance acts as a group of tables (top + recursive cascade to
>>> children), this implementation is necessary even if only a single table is
>>> specified in the command.
>>
>>
>> I'd prefer to see this as a lock wait mode where it sits in the normal
>> lock queue BUT other lock requestors are allowed to queue jump past it. That
>> should be just a few lines changed in the lock conflict checker and some
>> sleight of hand in the lock queue code.
>>
>> That way we avoid the busy-wait loop and multiple DEFERRABLE lock waiters
>> queue up normally.
>
>
> Yeah, that would be better. I can see how to handle a single structure in
> that way, but I'm not at all certain how to handle multiple tables, and
> inheritance means multiple tables even with a single command.

Agreed; neither can I.

> X1 inherits from X
>
> There is a long-running task on X1.
>
> Someone requests LOCK TABLE X IN ACCESS EXCLUSIVE MODE WAIT PATIENTLY.
> Internally this also grabs X1.
>
> The lock on X might be granted immediately and now blocks all other access
> to that table.
>
> There would need be a Locking Group kind of thing so various LockTags are
> treated as a single entity to grant them simultaneously. That seems pretty
> invasive; at least I don't see anything like that today.

Multiple locktags would likely be behind different LWLocks anyway, so
I don't see a way to make that work.

So the only way to handle multiple locks is to do this roughly the way
Rod suggests.

The only thing I would add at this stage is that if one lock is
unavailable, unlocking all previous locks is unnecessary. We only need
to unlock if there are lock waiters for the locks we already hold.

The use cases I am thinking of require only one table at a time, so
I'm still inclined towards the non-looping approach.

Thoughts?
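For reference, the try-all-or-back-off pattern Rod described upthread — attempt every lock in the group without blocking, release whatever was taken if any member is busy, and retry later — can be sketched like this (Python threading locks standing in for heavyweight table locks; the function name and shape are illustrative, not PostgreSQL APIs):

```python
import threading
import time

def lock_all_deferrable(locks, retry_delay=0.01, timeout=1.0):
    """Try to acquire every lock as a group; if any one is busy,
    release those already taken and retry later, so we never sit in
    a lock queue blocking other requestors."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        acquired = []
        for lock in locks:
            if lock.acquire(blocking=False):
                acquired.append(lock)
            else:
                # One member of the group is busy: back off entirely.
                for held in acquired:
                    held.release()
                break
        else:
            return True          # got the whole group atomically-ish
        time.sleep(retry_delay)  # the busy-wait loop the thread criticizes
    return False

parent, child = threading.Lock(), threading.Lock()
print(lock_all_deferrable([parent, child]))  # both free -> True
```

Simon's refinement above would skip the release step when no other backend is actually queued behind the locks already held.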

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] COPY vs \copy HINT

2016-09-01 Thread Tom Lane
Craig Ringer  writes:
> On 12 August 2016 at 16:34, Christoph Berg  wrote:
>> Also, I vaguely get what you wanted to say with "a driver ...
>> wrapper", but it's pretty nonsensical if one doesn't know about the
>> protocol details. I don't have a better suggestion now, but I think it
>> needs rephrasing.

> I don't like it either, but didn't come up with anything better. The
> problem is that every driver calls it something different.

A few thoughts on this patch:

1. I don't really think the HINT is appropriate for the not-absolute-path
case.

2. I don't think it's appropriate for all possible cases of AllocateFile
failure either, eg surely not for EACCES or similar situations where we
did find a file.  Maybe print it only for ENOENT?  (See for example
report_newlocale_failure() for technique.)

3. As for the wording, maybe you could do it like this:

HINT:  COPY copies to[from] a file on the PostgreSQL server, not on the
client.  You may want a client-side facility such as psql's \copy.

That avoids trying to invent a name for other implementations.
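Point 2 — emitting the hint only when the file is genuinely missing rather than on every open failure — can be illustrated outside the backend (a sketch; the hint text paraphrases the suggestion above and is not committed wording):

```python
import errno

def copy_open_error_hint(exc):
    """Return a client-side hint only for ENOENT, i.e. when the server
    found no such file at all; EACCES and similar mean a file was found,
    so suggesting \\copy there would be misleading."""
    if isinstance(exc, OSError) and exc.errno == errno.ENOENT:
        return ("COPY copies to/from a file on the PostgreSQL server. "
                "You may want a client-side facility such as psql's \\copy.")
    return None

try:
    open("/no/such/dir/data.csv")
except OSError as e:
    print(copy_open_error_hint(e) is not None)  # True: ENOENT gets the hint

print(copy_open_error_hint(PermissionError(errno.EACCES, "denied")))  # None
```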

regards, tom lane




Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Tom Lane
Gavin Flower  writes:
> On 02/09/16 04:44, Peter Eisentraut wrote:
>> You can try this out by building PostgreSQL this way.  Please save your
>> work first, because you might have to hard-reboot your system.

> Hmm...  I've built several versions of pg this way, without any obvious 
> problems!

I'm a little skeptical of that too.  However, I'd note that with a "make"
you're not likely to care, or possibly even notice, if the thing does
something like go completely to sleep for a little while, or if some
sub-jobs proceed well while others do not.  The fact that "-l 8" works
okay for make doesn't necessarily translate to more-interactive use cases.

regards, tom lane




[HACKERS] CommitFest 2016-09

2016-09-01 Thread Fabrízio de Royes Mello
Hi all,

The 2016-09 commitfest is officially in progress, and I'm the manager.

The current status summary is

Needs review: 119
Needs *reviewer*: 63

Please check the "needs reviewer" list
(https://commitfest.postgresql.org/10/?reviewer=-2) for patches to
review.  The committers need our help to keep up.

If this is you:

Waiting on Author: 15

Then please post an updated patch as soon as possible so it can be
reviewed.  Some of these patches have not seen any activity from the
author in a long time.

The good news is we are already 29% done with the CF:

Committed: 57
Rejected: 6
Returned with Feedback: 4

TOTAL: 219

I'll post a status update on this thread at least once a week and more
often as needed.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Gavin Flower

On 02/09/16 05:01, Peter Eisentraut wrote:

> On 8/16/16 3:39 AM, Haribabu Kommi wrote:
>> [...]
>
> All of this seems very platform specific, too.  You have
> Windows-specific code, but the rest seems very Linux-specific.  The
> dstat tool I had never heard of before.  There is stuff with cgroups,
> which I don't know how portable they are across different Linux
> installations.  Something about Solaris was mentioned.  What about the
> rest?  How can we maintain this in the long term?  How do we know that
> these facilities actually work correctly and not cause mysterious problems?
>
> [...]
I think that we should not hobble pg in Linux, because of limitations of 
other O/S's like those from Microsoft!


On the safe side, if a feature has insufficient evidence of working in a 
particular O/S, then it should not be default enabled for that O/S.


If a feature is useful in Linux, but not elsewhere: then pg should still 
run in the other O/S's but the documentation should reflect that.



Cheers,
Gavin




Re: [HACKERS] Speedup twophase transactions

2016-09-01 Thread Simon Riggs
On 13 April 2016 at 15:31, Stas Kelvich  wrote:

> Fixed patch attached. There already was infrastructure to skip currently
> held locks during replay of standby_redo() and I’ve extended that with
> check for prepared xids.

Please confirm that everything still works on current HEAD for the new
CF, so review can start.

Thanks

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] incomplete removal of not referenced CTEs

2016-09-01 Thread Andres Freund
On 2016-09-01 15:46:45 -0400, Tom Lane wrote:
> Tomas Vondra  writes:
> > While investigating a CTE-related query, I've noticed that we don't
> > really remove all unreachable CTEs.
> 
> We expend a grand total of three lines of code on making that happen.
> I'm pretty much -1 on adding a great deal more code or complexity
> to make it happen recursively;

Agreed. And the consequences are pretty much harmless.


> the case simply doesn't arise in reasonably well written queries.

Well, it might, when the CTE reference can be removed due to some other
part of the query (e.g. plan time evaluation of immutable function).


Andres




Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Gavin Flower

On 02/09/16 04:44, Peter Eisentraut wrote:

> On 8/1/16 2:17 AM, Gavin Flower wrote:
>> Possibly look how make does it with the '-l' flag?
>>
>> '-l 8' don't start more process when load is 8 or greater, works on
>> Linux at least...
>
> The problem with that approach is that it takes about a minute for the
> load averages figures to be updated, by which time you have already
> thrashed your system.
>
> You can try this out by building PostgreSQL this way.  Please save your
> work first, because you might have to hard-reboot your system.

Hmm...  I've built several versions of pg this way, without any obvious 
problems!


Looking at top suggests that the load averages never go much above 8,
and are usually less.


This is the bash script I use:


#!/bin/bash
# postgresql-build.sh


VERSION='9.5.0'

TAR_FILE="postgresql-$VERSION.tar.bz2"
echo 'TAR_FILE['$TAR_FILE']'
tar xvf $TAR_FILE

PORT='--with-pgport=5433'   # std is 5432

BASE_DIR="postgresql-$VERSION"
echo 'BASE_DIR['$BASE_DIR']'
cd $BASE_DIR

PREFIX="--prefix=/usr/local/lib/postgres-$VERSION"
echo 'PREFIX['$PREFIX']'

LANGUAGES='--with-python'
echo 'LANGUAGES['$LANGUAGES']'

SECURITY='--with-openssl --with-pam --with-ldap'
echo 'SECURITY['$SECURITY']'

XML='--with-libxml --with-libxslt'
echo 'XML['$XML']'

TZDATA='--with-system-tzdata=/usr/share/zoneinfo'
echo 'TZDATA['$TZDATA']'

##DEBUG='--enable-debug'
##echo 'DEBUG['$DEBUG']'


./configure $PREFIX $LANGUAGES $SECURITY $XML $TZDATA $DEBUG

time make -j7 -l8 && time make -j7 -l8 check
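The -l8 gate in the make line above relies on the kernel's load averages, which can be probed directly; a minimal sketch of the same check (os.getloadavg reads the 1/5/15-minute averages, the 1-minute figure being the laggy one Peter warns about):

```python
import os

def may_spawn_worker(max_load=8.0):
    """Mimic make -l: refuse to start more work while the 1-minute
    load average is at or above the threshold.  Because that average
    trails real load by up to a minute, a burst can overshoot before
    this check ever says no -- Peter's point upthread."""
    one_min = os.getloadavg()[0]
    return one_min < max_load

print(may_spawn_worker(max_load=1e9))  # absurdly high threshold -> True
```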


Cheers,
Gavin





Re: [HACKERS] incomplete removal of not referenced CTEs

2016-09-01 Thread Tom Lane
Tomas Vondra  writes:
> While investigating a CTE-related query, I've noticed that we don't
> really remove all unreachable CTEs.

We expend a grand total of three lines of code on making that happen.
I'm pretty much -1 on adding a great deal more code or complexity
to make it happen recursively; the case simply doesn't arise in
reasonably well written queries.

regards, tom lane




Re: [HACKERS] incomplete removal of not referenced CTEs

2016-09-01 Thread Andres Freund
On 2016-09-01 21:36:13 +0200, Tomas Vondra wrote:
> Of course, it's harmless as none of those CTEs gets actually executed,
> but is this intentional, or do we want/need to fix it? I don't see
> anything about this in the docs, but it seems a bit awkward and
> confusing to remove only some of the CTEs - I think we should either
> remove all or none of them.
> 
> I don't think that should be particularly difficult - ISTM we need to
> make SS_process_ctes a bit smarter, essentially by adding a loop to
> remove the CTEs recursively (and decrease the refcount).

I don't really see a lot of benefit in expending energy on
this. Skipping the CTE in the easy case saves some plan cycles. Making more
effort to remove CTEs recursively probably doesn't...

Greetings,

Andres Freund




Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
Peter van Hardenberg  writes:
> Some kind of units on the parenthetical format would be helpful.

I was really hoping to not re-open that can of worms :-(

regards, tom lane




[HACKERS] incomplete removal of not referenced CTEs

2016-09-01 Thread Tomas Vondra
Hi,

While investigating a CTE-related query, I've noticed that we don't
really remove all unreachable CTEs. For example, for this query

with a as (select 1),
 b as (select * from a),
 c as (select * from b)
select 2;

where none of the CTEs if (directly or indirectly) referenced from the
query, we get a plan like this:

   QUERY PLAN
----------------------------------------------------------
 Result  (cost=0.03..0.04 rows=1 width=4)
   CTE a
 ->  Result  (cost=0.00..0.01 rows=1 width=4)
   CTE b
 ->  CTE Scan on a  (cost=0.00..0.02 rows=1 width=4)
(5 rows)

So we only remove the top-level CTE, but we fail to remove the other
CTEs because we don't tweak the refcount in SS_process_ctes().

Of course, it's harmless as none of those CTEs gets actually executed,
but is this intentional, or do we want/need to fix it? I don't see
anything about this in the docs, but it seems a bit awkward and
confusing to remove only some of the CTEs - I think we should either
remove all or none of them.

I don't think that should be particularly difficult - ISTM we need to
make SS_process_ctes a bit smarter, essentially by adding a loop to
remove the CTEs recursively (and decrease the refcount).
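The loop described above amounts to repeatedly dropping zero-refcount CTEs and decrementing the counts of the CTEs their bodies referenced, until a fixpoint; a toy sketch of that (plain Python standing in for what SS_process_ctes might do, using the a/b/c example from the query above):

```python
def prune_unreferenced_ctes(refcount, refs):
    """refcount: references to each CTE from live parts of the query;
    refs: which other CTEs each CTE's own body references.
    Drop zero-ref CTEs repeatedly until none remain droppable."""
    removed = True
    while removed:
        removed = False
        for cte in [c for c, n in refcount.items() if n == 0]:
            for dep in refs[cte]:      # its body's references go away too
                refcount[dep] -= 1
            del refcount[cte]
            removed = True
    return set(refcount)               # the CTEs that must be kept

# with a as (...), b as (select * from a), c as (select * from b) select 2;
# b and c reference a and b respectively; the outer query references none.
refcount = {"a": 1, "b": 1, "c": 0}
refs = {"a": [], "b": ["a"], "c": ["b"]}
print(prune_unreferenced_ctes(refcount, refs))  # set() -- all removable
```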

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] \timing interval

2016-09-01 Thread Peter van Hardenberg
On Thu, Sep 1, 2016 at 12:14 PM, Tom Lane  wrote:

> Corey Huinker  writes:
> > On Thu, Sep 1, 2016 at 3:01 PM, Tom Lane  wrote:
> >> Well, that code's on the backend side so we're not going to just call it
> >> in any case.  And I think we don't want to be quite so verbose as to go
> up
> >> to hh:mm:ss.fff as soon as we get past 1 second.  However, comparing
> that
> >> output to what I had suggests that maybe it's better to keep a leading
> >> zero in two-digit fields, that is render times like "00:01.234",
> >> "01:23.456", or "01:23:45.678" rather than suppressing the initial zero
> as
> >> I had in my examples.  It's an extra character but I think it reinforces
> >> the meaning.
>
> > +1
> > The larger jump in widths from no MM:SS to HH:MM:SS is a good visual cue.
> > Jumping from MM:SS to H:MM:SS to HH:MM:SS would be more subtle and
> possibly
> > confusing.
>
> Attached is an updated patch that does it like that.  Sample output
> (generated by forcing specific arguments to PrintTiming):
>
> Time: 0.100 ms
> Time: 1.200 ms
> Time: 1001.200 ms (00:01.001)
> Time: 12001.200 ms (00:12.001)
> Time: 60001.200 ms (01:00.001)
> Time: 720001.200 ms (12:00.001)
> Time: 3660001.200 ms (01:01:00.001)
> Time: 43920001.200 ms (12:12:00.001)
> Time: 176460001.200 ms (2 01:01:00.001)
> Time: 216720001.200 ms (2 12:12:00.001)
> Time: 8816460001.200 ms (102 01:01:00.001)
> Time: 8856720001.200 ms (102 12:12:00.001)
>
> Barring objections I'll commit this soon.
>
> regards, tom lane
>


Some kind of units on the parenthetical format would be helpful. Glancing
at several of these values it takes me a couple of seconds to decide what
I'm reading.

-- 
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut


Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
Corey Huinker  writes:
> On Thu, Sep 1, 2016 at 3:01 PM, Tom Lane  wrote:
>> Well, that code's on the backend side so we're not going to just call it
>> in any case.  And I think we don't want to be quite so verbose as to go up
>> to hh:mm:ss.fff as soon as we get past 1 second.  However, comparing that
>> output to what I had suggests that maybe it's better to keep a leading
>> zero in two-digit fields, that is render times like "00:01.234",
>> "01:23.456", or "01:23:45.678" rather than suppressing the initial zero as
>> I had in my examples.  It's an extra character but I think it reinforces
>> the meaning.

> +1
> The larger jump in widths from no MM:SS to HH:MM:SS is a good visual cue.
> Jumping from MM:SS to H:MM:SS to HH:MM:SS would be more subtle and possibly
> confusing.

Attached is an updated patch that does it like that.  Sample output
(generated by forcing specific arguments to PrintTiming):

Time: 0.100 ms
Time: 1.200 ms
Time: 1001.200 ms (00:01.001)
Time: 12001.200 ms (00:12.001)
Time: 60001.200 ms (01:00.001)
Time: 720001.200 ms (12:00.001)
Time: 3660001.200 ms (01:01:00.001)
Time: 43920001.200 ms (12:12:00.001)
Time: 176460001.200 ms (2 01:01:00.001)
Time: 216720001.200 ms (2 12:12:00.001)
Time: 8816460001.200 ms (102 01:01:00.001)
Time: 8856720001.200 ms (102 12:12:00.001)

Barring objections I'll commit this soon.

regards, tom lane
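For anyone who wants to sanity-check the progression without building psql, the logic of the attached patch transcribes directly to Python (a sketch of the same breakdown, not the committed code):

```python
import math

def format_timing(elapsed_msec):
    """Mirror the proposed PrintTiming: raw milliseconds always, plus a
    days/hours/minutes/seconds rendering once past 1 second."""
    if elapsed_msec < 1000.0:
        return "Time: %.3f ms" % elapsed_msec
    seconds = elapsed_msec / 1000.0
    minutes = math.floor(seconds / 60.0)
    seconds -= 60.0 * minutes
    if minutes < 60.0:
        return "Time: %.3f ms (%02d:%06.3f)" % (elapsed_msec, minutes, seconds)
    hours = math.floor(minutes / 60.0)
    minutes -= 60.0 * hours
    if hours < 24.0:
        return "Time: %.3f ms (%02d:%02d:%06.3f)" % (
            elapsed_msec, hours, minutes, seconds)
    days = math.floor(hours / 24.0)
    hours -= 24.0 * days
    return "Time: %.3f ms (%.0f %02d:%02d:%06.3f)" % (
        elapsed_msec, days, hours, minutes, seconds)

print(format_timing(3660001.2))  # Time: 3660001.200 ms (01:01:00.001)
```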

diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 8a66ce7..88e2f66 100644
*** a/doc/src/sgml/ref/psql-ref.sgml
--- b/doc/src/sgml/ref/psql-ref.sgml
*************** testdb=> \setenv LESS -imx
*** 2789,2796 ****
 \timing [ on | off ]
  
  
!  Without parameter, toggles a display of how long each SQL statement
!  takes, in milliseconds.  With parameter, sets same.
  
 

--- 2789,2798 ----
 \timing [ on | off ]
  
  
!  With a parameter, turns displaying of how long each SQL statement
!  takes on or off.  Without a parameter, toggles the display between
!  on and off.  The display is in milliseconds; intervals longer than
!  1 second are also shown in days/hours/minutes/seconds format.
  
 

diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 7399950..a17f1de 100644
*** a/src/bin/psql/common.c
--- b/src/bin/psql/common.c
***************
*** 10,15 ****
--- 10,16 ----
  
  #include <ctype.h>
  #include <limits.h>
+ #include <math.h>
  #include <signal.h>
  #ifndef WIN32
  #include <unistd.h>		/* for write() */
*** ClearOrSaveResult(PGresult *result)
*** 530,535 
--- 531,586 
  	}
  }
  
+ /*
+  * Print microtiming output.  Always print raw milliseconds; if the interval
+  * is >= 1 second, also break it down into days/hours/minutes/seconds.
+  */
+ static void
+ PrintTiming(double elapsed_msec)
+ {
+ 	double		seconds;
+ 	double		minutes;
+ 	double		hours;
+ 	double		days;
+ 
+ 	if (elapsed_msec < 1000.0)
+ 	{
+ 		/* This is the traditional (pre-v10) output format */
+ 		printf(_("Time: %.3f ms\n"), elapsed_msec);
+ 		return;
+ 	}
+ 
+ 	/*
+ 	 * Note: we could print just seconds, in a format like %06.3f, when the
+ 	 * total is less than 1min.  But that's hard to interpret unless we tack
+ 	 * on "s" or otherwise annotate it.  Forcing the display to include
+ 	 * minutes seems like a better solution.
+ 	 */
+ 	seconds = elapsed_msec / 1000.0;
+ 	minutes = floor(seconds / 60.0);
+ 	seconds -= 60.0 * minutes;
+ 	if (minutes < 60.0)
+ 	{
+ 		printf(_("Time: %.3f ms (%02d:%06.3f)\n"),
+ 			   elapsed_msec, (int) minutes, seconds);
+ 		return;
+ 	}
+ 
+ 	hours = floor(minutes / 60.0);
+ 	minutes -= 60.0 * hours;
+ 	if (hours < 24.0)
+ 	{
+ 		printf(_("Time: %.3f ms (%02d:%02d:%06.3f)\n"),
+ 			   elapsed_msec, (int) hours, (int) minutes, seconds);
+ 		return;
+ 	}
+ 
+ 	days = floor(hours / 24.0);
+ 	hours -= 24.0 * days;
+ 	printf(_("Time: %.3f ms (%.0f %02d:%02d:%06.3f)\n"),
+ 		   elapsed_msec, days, (int) hours, (int) minutes, seconds);
+ }
+ 
  
  /*
   * PSQLexec
*** PSQLexecWatch(const char *query, const p
*** 679,685 
  
  	/* Possible microtiming output */
  	if (pset.timing)
! 		printf(_("Time: %.3f ms\n"), elapsed_msec);
  
  	return 1;
  }
--- 730,736 
  
  	/* Possible microtiming output */
  	if (pset.timing)
! 		PrintTiming(elapsed_msec);
  
  	return 1;
  }
*** SendQuery(const char *query)
*** 1332,1338 
  
  	/* Possible microtiming output */
  	if (pset.timing)
! 		printf(_("Time: %.3f ms\n"), elapsed_msec);
  
  	/* check for events that may occur during query execution */
  
--- 1383,1389 
  
  	/* Possible microtiming output */
  	if (pset.timing)
! 		PrintTiming(elapsed_msec);
  
  	/* check for events that may occur during query execution */
  


Re: [HACKERS] \timing interval

2016-09-01 Thread Corey Huinker
On Thu, Sep 1, 2016 at 3:01 PM, Tom Lane  wrote:

> Corey Huinker  writes:
> > On Thu, Sep 1, 2016 at 2:43 PM, Tom Lane  wrote:
> >> Note that times from 1 second to 1 hour all get the nn:nn.nnn
> >> treatment.  I experimented with these variants for sub-minute times:
> >> ...
> >> but it seems like the first variant is not terribly intelligible and
> >> the second variant is inconsistent with what happens for longer times.
>
> > Well, if we're looking to be consistent, here's what interval does now:
> > ...
> > Should we just plug into whatever code that uses?
>
> Well, that code's on the backend side so we're not going to just call it
> in any case.  And I think we don't want to be quite so verbose as to go up
> to hh:mm:ss.fff as soon as we get past 1 second.  However, comparing that
> output to what I had suggests that maybe it's better to keep a leading
> zero in two-digit fields, that is render times like "00:01.234",
> "01:23.456", or "01:23:45.678" rather than suppressing the initial zero as
> I had in my examples.  It's an extra character but I think it reinforces
> the meaning.
>
> regards, tom lane
>

+1
The larger jump in widths from no MM:SS to HH:MM:SS is a good visual cue.
Jumping from MM:SS to H:MM:SS to HH:MM:SS would be more subtle and possibly
confusing.


Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
Corey Huinker  writes:
> On Thu, Sep 1, 2016 at 2:43 PM, Tom Lane  wrote:
>> Note that times from 1 second to 1 hour all get the nn:nn.nnn
>> treatment.  I experimented with these variants for sub-minute times:
>> ...
>> but it seems like the first variant is not terribly intelligible and
>> the second variant is inconsistent with what happens for longer times.

> Well, if we're looking to be consistent, here's what interval does now:
> ...
> Should we just plug into whatever code that uses?

Well, that code's on the backend side so we're not going to just call it
in any case.  And I think we don't want to be quite so verbose as to go up
to hh:mm:ss.fff as soon as we get past 1 second.  However, comparing that
output to what I had suggests that maybe it's better to keep a leading
zero in two-digit fields, that is render times like "00:01.234",
"01:23.456", or "01:23:45.678" rather than suppressing the initial zero as
I had in my examples.  It's an extra character but I think it reinforces
the meaning.

regards, tom lane




Re: [HACKERS] \timing interval

2016-09-01 Thread Corey Huinker
On Thu, Sep 1, 2016 at 2:43 PM, Tom Lane  wrote:

> I wrote:
> > Sorry, that probably added no clarity at all, I was confusing
> > seconds with milliseconds in the example values :-(
>
> After a bit of further fooling with sample values, I propose this
> progression:
>
> Time: 0.100 ms
> Time: 1.200 ms
> Time: 1001.200 ms (0:01.001)
> Time: 12001.200 ms (0:12.001)
> Time: 60001.200 ms (1:00.001)
> Time: 720001.200 ms (12:00.001)
> Time: 3660001.200 ms (1:01:00.001)
> Time: 43920001.200 ms (12:12:00.001)
> Time: 176460001.200 ms (2 01:01:00.001)
> Time: 216720001.200 ms (2 12:12:00.001)
> Time: 10000000000000000000.000 ms (115740740740 17:46:40.000)
>
> Note that times from 1 second to 1 hour all get the nn:nn.nnn
> treatment.  I experimented with these variants for sub-minute times:
>
> Time: 1001.200 ms (1.001)
> Time: 12001.200 ms (12.001)
> Time: 1001.200 ms (1.001 s)
> Time: 12001.200 ms (12.001 s)
>
> but it seems like the first variant is not terribly intelligible and
> the second variant is inconsistent with what happens for longer times.
> Adding a zero minutes field is a subtler way of cueing the reader that
> it's mm:ss.
>
> regards, tom lane
>

Well, if we're looking to be consistent, here's what interval does now:

# select '1 second 1 millisecond'::interval, '1 minute 2
milliseconds'::interval, '1 hour 30 milliseconds'::interval, '1 day 1 hour
999 milliseconds'::interval, '1 week 1 day 1 hour'::interval;
   interval   |   interval   |  interval   |      interval      |    interval
--------------+--------------+-------------+--------------------+----------------
 00:00:01.001 | 00:01:00.002 | 01:00:00.03 | 1 day 01:00:00.999 | 8 days 01:00:00
(1 row)


Should we just plug into whatever code that uses? It's slightly different:

# select interval '1000.001 milliseconds'::interval;
ERROR:  interval field value out of range: "1000.001 milliseconds"
LINE 1: select interval '1000.001 milliseconds'::int...
^
# select interval '216720001.200 milliseconds';
   interval
---
 60:12:00.0012
(1 row)

# select interval '176460001.200 ms';
   interval
---
 49:01:00.0012
(1 row)


Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
Corey Huinker  writes:
> I'm going to hold off a bit to see if anybody else chimes in, and if not
> I'm going to deliver a patch.

I've already been updating yours, no need for another.

regards, tom lane




Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
I wrote:
> Sorry, that probably added no clarity at all, I was confusing
> seconds with milliseconds in the example values :-(

After a bit of further fooling with sample values, I propose this
progression:

Time: 0.100 ms
Time: 1.200 ms
Time: 1001.200 ms (0:01.001)
Time: 12001.200 ms (0:12.001)
Time: 60001.200 ms (1:00.001)
Time: 720001.200 ms (12:00.001)
Time: 3660001.200 ms (1:01:00.001)
Time: 43920001.200 ms (12:12:00.001)
Time: 176460001.200 ms (2 01:01:00.001)
Time: 216720001.200 ms (2 12:12:00.001)
Time: 10000000000000000000.000 ms (115740740740 17:46:40.000)

Note that times from 1 second to 1 hour all get the nn:nn.nnn
treatment.  I experimented with these variants for sub-minute times:

Time: 1001.200 ms (1.001)
Time: 12001.200 ms (12.001)
Time: 1001.200 ms (1.001 s)
Time: 12001.200 ms (12.001 s)

but it seems like the first variant is not terribly intelligible and
the second variant is inconsistent with what happens for longer times.
Adding a zero minutes field is a subtler way of cueing the reader that
it's mm:ss.

regards, tom lane




Re: [HACKERS] \timing interval

2016-09-01 Thread Corey Huinker
On Thu, Sep 1, 2016 at 2:17 PM, Tom Lane  wrote:

> I wrote:
> > So for clarity's sake: first suitable format among these:
>
> > Time: 59.999 ms
> > Time: 121.999 ms (2:01.999)
> > Time: 10921.999 ms (3:02:01.999)
> > Time: 356521.999 ms (4 3:02:01.999)
>
> Sorry, that probably added no clarity at all, I was confusing
> seconds with milliseconds in the example values :-(
>
> regards, tom lane
>

I didn't scan your examples for correctness, but the parentheticals matched
my own idea for what "left-trimmed" meant.

I'm going to hold off a bit to see if anybody else chimes in, and if not
I'm going to deliver a patch.


Re: [HACKERS] autonomous transactions

2016-09-01 Thread Andres Freund
On 2016-08-31 06:10:31 +0200, Joel Jacobson wrote:
> This is important if you as a caller function want to be sure none of
> the work made by anything called down the stack gets committed.
> That is, if you as a caller decide to rollback, e.g. by raising an
> exception, and you want to be sure *everything* gets rollbacked,
> including all work by functions you've called.

> If the caller can't control this, then the author of the caller
> function would need to inspect the source code of all function being
> called, to be sure there are no code using autonomous transactions.

I'm not convinced this makes much sense. All FDWs, dblink etc. already
allow you to do stuff outside of a transaction.

Andres




Re: [HACKERS] autonomous transactions

2016-09-01 Thread Joel Jacobson
On Wed, Aug 31, 2016 at 6:22 PM, Simon Riggs  wrote:
> On 31 August 2016 at 14:09, Joel Jacobson  wrote:
>> My idea on how to deal with this would be to mark the function to be
>> "AUTONOMOUS" similar to how a function is marked to be "PARALLEL
>> SAFE",
>> and to throw an error if a caller that has blocked autonomous
>> transactions tries to call a function that is marked to be autonomous.
>>
>> That way none of the code that needs to be audited would ever get executed.
>
> Not sure I see why you would want to turn off execution for only some 
> functions.
>
> What happens if your function calls some other function with
> side-effects?

I'm not sure I understand your questions. All volatile functions modifying data
have side-effects. What I meant was whether they are allowed to commit that
work even if the caller doesn't want them to.

However, I'll try to clarify the two scenarios I envision:

1. If a function is declared AUTONOMOUS and it gets called,
then that means nothing in the txn has blocked autonomous yet
and the function and any other function will be able to do autonomous txns
from that here on, so if some function would try to block autonomous that
would throw an error.

2. If a function has blocked autonomous and something later on
tries to call a function declared AUTONOMOUS then that would throw an error.

Basically, we start with a NULL state where autonomous is neither blocked
or explicitly allowed. Whatever happens first decides if autonomous transactions
will explicitly be blocked or allowed during the txn.

So we can go from NULL -> AUTONOMOUS ALLOWED
or NULL -> AUTONOMOUS BLOCKED,
but those are the only two state transitions possible.

Once set, it cannot be changed.
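The one-way transition described above is a three-state latch; as a toy sketch (the class and method names are invented for illustration, nothing like this exists in the patch):

```python
class AutonomousPolicy:
    """Transaction-lifetime latch: starts undecided (NULL); whichever
    happens first -- blocking autonomous transactions or using one --
    fixes the policy for the rest of the transaction."""
    def __init__(self):
        self.state = None            # undecided

    def call_autonomous_function(self):
        if self.state == "blocked":
            raise RuntimeError("autonomous transactions are blocked")
        self.state = "allowed"       # NULL -> AUTONOMOUS ALLOWED

    def block_autonomous(self):
        if self.state == "allowed":
            raise RuntimeError("autonomous transactions already in use")
        self.state = "blocked"       # NULL -> AUTONOMOUS BLOCKED

txn = AutonomousPolicy()
txn.block_autonomous()
try:
    txn.call_autonomous_function()
except RuntimeError as e:
    print(e)  # autonomous transactions are blocked
```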

If nothing in an application cares about autonomous transactions,
they don't have to do anything special, they don't need to modify any
of their code.

But if it for some reason is important to block autonomous transactions
because the application is written in a way where it is expected
a RAISE EXCEPTION always rollbacks everything,
then the author of such an application (e.g. me) can just block
autonomous transactions
and continue to live happily ever after without nightmares about
developers misusing the feature, and only use it when appropriate.




Re: [HACKERS] autonomous transactions

2016-09-01 Thread Joel Jacobson
On Thu, Sep 1, 2016 at 12:12 AM, Vik Fearing  wrote:
> Part of what people want this for is to audit what people *try* to do.
> We can already audit what they've actually done.
>
> With your solution, we still wouldn't know when an unauthorized attempt
> to do something happened.

The unauthorized attempt to execute the function will still be logged
to the PostgreSQL log file
since it would throw an error, just like trying to connect with e.g.
an invalid username would be logged to the log files.
I think that's enough for that use-case, since it's arguably not an
application layer problem,
since no part of the code was ever executed.

But if someone tries to execute a function where one of the input params
is a password and the function raises an exception if the password
is incorrect and wants to log the unauthorized attempt, then that
would be a good example of when you could use and would need to use
autonomous transactions to log the invalid password attempt.




Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
I wrote:
> So for clarity's sake: first suitable format among these:

> Time: 59.999 ms
> Time: 121.999 ms (2:01.999)
> Time: 10921.999 ms (3:02:01.999)
> Time: 356521.999 ms (4 3:02:01.999)

Sorry, that probably added no clarity at all, I was confusing
seconds with milliseconds in the example values :-(

regards, tom lane




Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Peter Geoghegan
On Thu, Sep 1, 2016 at 9:23 AM, Robert Haas  wrote:
> Indeed, it had occurred to me that we might not even want to compile
> this code into the server unless WAL_DEBUG is defined; after all, how
> does it help a regular user to detect that the server has a bug?  Bug
> or no bug, that's the code they've got.  But on further reflection, it
> seems like it could be useful: if we suspect a bug in the redo code
> but we can't reproduce it here, we could ask the customer to turn this
> option on to see whether it produces logging indicating the nature of
> the problem.  However, because of the likely expense of enabling the
> feature, it seems like it would be quite desirable to limit the
> expense of generating many extra FPWs to the affected rmgr.  For
> example, if a user has a table with a btree index and a gin index, and
> we suspect a bug in GIN, it would be nice for the user to be able to
> enable the feature *only for GIN* rather than paying the cost of
> enabling it for btree and heap as well.[2]

Yes, that would be rather a large advantage.

I think that there really is no hard distinction between users and
hackers. Some people will want to run this in production, and it would
be a lot better if performance was at least not atrocious. If amcheck
couldn't do the majority of its verification with only an
AccessShareLock, then users probably just couldn't use it. Heroku
wouldn't have been able to use it on all production databases. It
wouldn't have mattered that the verification was no less effective,
since the bugs it found would simply never have been observed in
practice.

-- 
Peter Geoghegan




Re: [HACKERS] Hash Indexes

2016-09-01 Thread Jesper Pedersen

On 08/05/2016 07:36 AM, Amit Kapila wrote:

On Thu, Aug 4, 2016 at 8:02 PM, Mithun Cy  wrote:

I did some basic testing of same. In that I found one issue with cursor.



Thanks for the testing.  The reason for failure was that the patch
didn't take into account the fact that for scrolling cursors, scan can
reacquire the lock and pin on bucket buffer multiple times.   I have
fixed it such that we release the pin on bucket buffers after we scan
the last overflow page in bucket. Attached patch fixes the issue for
me, let me know if you still see the issue.



Needs a rebase.

hashinsert.c

+* reuse the space.  There is no such apparent benefit from finsihing the

-> finishing

hashpage.c

+ * retrun the buffer, else return InvalidBuffer.

-> return

+   if (blkno == P_NEW)
+   elog(ERROR, "hash AM does not use P_NEW");

Left over ?

+ * for unlocking it.

-> for unlocking them.

hashsearch.c

+* bucket, but not pin, then acuire the lock on new bucket and again

-> acquire

hashutil.c

+ * half.  It is mainly required to finsh the incomplete splits where we are

-> finish

Ran some tests on a CHAR() based column which showed good results. Will 
have to compare with a run with the WAL patch applied.


make check-world passes.

Best regards,
 Jesper





Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Here is a new version of the patch.
The pack_float function is now commented better.
All inline keywords are removed; I haven't found any performance loss from that.
Removing the statics showed a 1%-7% performance loss in SELECT computation
(3 test runs), so I left the statics where they are.

Actually, to avoid this kind of hack, it would probably be better to
fork GiST to a GiSTv2 and implement many API changes there:
1. Bulk load API
2. Richer choose-subtree API
3. Allow key tests to return not just NO\MAYBE answers, but a SURE answer
that skips the key test for the underlying tree
And some other improvements require breaking changes for extensions.
The GiST idea is almost 25 years old; modern spatial indices vary a lot
from what was there during the '90s.

Best regards, Andrey Borodin, Octonica & Ural Federal University.
diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index 3feddef..9ce3cd6 100644
--- a/contrib/cube/cube.c
+++ b/contrib/cube/cube.c
@@ -91,14 +91,16 @@ PG_FUNCTION_INFO_V1(cube_enlarge);
 /*
 ** For internal use only
 */
-int32  cube_cmp_v0(NDBOX *a, NDBOX *b);
-bool   cube_contains_v0(NDBOX *a, NDBOX *b);
-bool   cube_overlap_v0(NDBOX *a, NDBOX *b);
-NDBOX *cube_union_v0(NDBOX *a, NDBOX *b);
-void   rt_cube_size(NDBOX *a, double *sz);
-NDBOX *g_cube_binary_union(NDBOX *r1, NDBOX *r2, int *sizep);
-bool   g_cube_leaf_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
-bool   g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
+static int32   cube_cmp_v0(NDBOX *a, NDBOX *b);
+static bool    cube_contains_v0(NDBOX *a, NDBOX *b);
+static bool    cube_overlap_v0(NDBOX *a, NDBOX *b);
+static NDBOX  *cube_union_v0(NDBOX *a, NDBOX *b);
+static float   pack_float(float actualValue, int realm);
+static void    rt_cube_size(NDBOX *a, double *sz);
+static void    rt_cube_edge(NDBOX *a, double *sz);
+static NDBOX  *g_cube_binary_union(NDBOX *r1, NDBOX *r2, int *sizep);
+static bool    g_cube_leaf_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
+static bool    g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
 
 /*
 ** Auxiliary funxtions
@@ -419,7 +421,27 @@ g_cube_decompress(PG_FUNCTION_ARGS)
}
PG_RETURN_POINTER(entry);
 }
-
+/*
+ * Function to pack bit flags inside float type
+ * The resulting value can be from four different "realms".
+ * Every value from realm 3 is greater than any value from realms 2, 1 and 0.
+ * Every value from realm 2 is less than every value from realm 3 and greater
+ * than any value from realm 1 and 0, and so on. Values from the same realm
+ * lose two bits of precision. This technique is possible due to the IEEE 754
+ * floating point specification: the exponent bits are the highest bits
+ * (excluding the sign bit, but here the penalty is always positive). If float a
+ * is greater than float b, then integer A with the same bit representation as a
+ * is greater than integer B with the same bits as b.
+ */
+static float
+pack_float(float actualValue, int realm)
+{
+   /* two bits for realm, others for value */
+   /* we have 4 realms */
+   int realmAjustment = *((int*)&actualValue)/4;
+   int realCode = realm * (INT32_MAX/4) + realmAjustment;
+   return *((float*)&realCode);
+}
 
 /*
 ** The GiST Penalty method for boxes
@@ -428,6 +450,11 @@ g_cube_decompress(PG_FUNCTION_ARGS)
 Datum
 g_cube_penalty(PG_FUNCTION_ARGS)
 {
+   /* REALM 0: No extension is required, volume is zero, return edge */
+   /* REALM 1: No extension is required, return nonzero volume */
+   /* REALM 2: Volume extension is zero, return nonzero edge extension */
+   /* REALM 3: Volume extension is nonzero, return it */
+
GISTENTRY  *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
GISTENTRY  *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
float  *result = (float *) PG_GETARG_POINTER(2);
@@ -441,9 +468,33 @@ g_cube_penalty(PG_FUNCTION_ARGS)
	rt_cube_size(DatumGetNDBOX(origentry->key), &tmp2);
*result = (float) (tmp1 - tmp2);
 
-   /*
-* fprintf(stderr, "penalty\n"); fprintf(stderr, "\t%g\n", *result);
-*/
+   if( *result == 0 )
+   {
+   double tmp3 = tmp1; /* remember entry volume */
+   rt_cube_edge(ud, &tmp1);
+   rt_cube_edge(DatumGetNDBOX(origentry->key), &tmp2);
+   *result = (float) (tmp1 - tmp2);
+   if( *result == 0 )
+   {
+   if( tmp3 != 0 )
+   {
+   *result = pack_float(tmp3, 1); /* REALM 1 */
+   }
+   else
+   {
+   *result = pack_float(tmp1, 0); /* REALM 0 */
+   }
+   }
+  

Re: [HACKERS] \timing interval

2016-09-01 Thread Tom Lane
[ This patch is marked Ready For Committer, and discussion seems to have
died off, so let's get on with committing something ... ]

Corey Huinker  writes:
> Generally speaking, people disliked the third mode for \timing, and were
> generally fine with AndrewG's idea of printing the timing in both raw
> milliseconds and a more human-digestible format, which means that we can:

Yeah, there seemed to be general agreement on just appending a more human
readable format to the existing printout.

> 3. ignore locales and fall back to a left-trimmed DDD HH:MM:SS.mmm format
>  + Easy to revert to that code
>  + My original format and one PeterE advocated
>  - others disliked

I think this is the approach to go with as a starting point, since it
largely avoids both localization and units-naming concerns.  If someone
feels the desire to build a customizable output format, that can be dealt
with as a separate patch on top of this one ... but I really question that
it'd ever be worth the trouble.

So for clarity's sake: first suitable format among these:

Time: 59.999 ms
Time: 121.999 ms (2:01.999)
Time: 10921.999 ms (3:02:01.999)
Time: 356521.999 ms (4 3:02:01.999)

In an NLS-enabled build, the translator would be able to fool with the
punctuation, though I dunno whether any translators would need to.

regards, tom lane




Re: [HACKERS] Forbid use of LF and CR characters in database and role names

2016-09-01 Thread Peter Eisentraut
On 8/11/16 9:12 PM, Michael Paquier wrote:
> Note that pg_dump[all] and pg_upgrade already have safeguards against
> those things per the same routines putting quotes for execution as
> commands into psql and shell. So attached is a patch to implement this
> restriction in the backend,

How about some documentation?  I think the CREATE ROLE and CREATE
DATABASE man pages might be suitable places.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] pg_basebackup, pg_receivexlog and data durability (was: silent data loss with ext4 / all current versions)

2016-09-01 Thread Peter Eisentraut
On 5/13/16 2:39 AM, Michael Paquier wrote:
> So, attached are two patches that apply on HEAD to address the problem
> of pg_basebackup that does not sync the data it writes. As
> pg_basebackup cannot use directly initdb -S because, as a client-side
> utility, it may be installed while initdb is not (see Fedora and
> RHEL), I have refactored the code so as the routines in initdb.c doing
> the fsync of PGDATA and other fsync stuff are in src/fe_utils/, and
> this is 0001.

Why fe_utils?  initdb is not a front-end program.

> Patch 0002 is a set of fixes for pg_basebackup:
> - In plain mode, fsync_pgdata is used so as all the tablespaces are
> fsync'd at once. This takes care as well of the case where pg_xlog is
> a symlink.
> - In tar mode (no stdout), each tar file is synced individually, and
> the base directory is synced once at the end.
> In both cases, failures are not considered fatal.

Maybe there should be --nosync options like initdb has?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Simon Riggs
On 1 September 2016 at 17:23, Robert Haas  wrote:

> The primary audience of this feature is PostgreSQL developers

I have spoken to users who are waiting for this feature to run in
production, which is why I suggested it.

Some people care more about correctness than they do about loss of performance.

Obviously, this would be expensive and those with a super high
performance requirement may not be able to take advantage of this. I'm
sure many people will turn it off if they hit a performance
issue, but running it in production for the first few months will give
people a very safe feeling.

I think the primary use for an rmgr filter might well be PostgreSQL developers.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Tom Lane
Peter Eisentraut  writes:
> As I just wrote in another message in this thread, I don't trust system
> load metrics very much as a gatekeeper.  They are reasonable for
> long-term charting to discover trends, but there are numerous potential
> problems for using them for this kind of resource control thing.

As a note in support of that, sendmail has a "feature" to suppress service
if system load gets above X, which I have never found to do anything
except result in self-DOSing.  The load spike might not have anything to
do with the service that is trying to un-spike things.  Even if it does,
Peter is correct to note that the response delay is much too long to form
part of a useful feedback loop.  It could be all right for scheduling
activities whose length is comparable to the load average measurement
interval, but not for short-term decisions.

regards, tom lane




[HACKERS] What is the posix_memalign() equivalent for the PostgreSQL?

2016-09-01 Thread Anderson Carniel
Dear all,

I am developing an extension for PostgreSQL that writes and reads some
external files. In order to write/read, I am using the
O_DIRECT flag and posix_memalign to allocate memory. I would like
to know if the PostgreSQL internal library provides an equivalent for
posix_memalign, since I am getting unexpected errors. All my
allocations are in the TopMemoryContext since I am working with several
buffers that must stay alive while the PostgreSQL server is running.

Thanks in advance,
Anderson


Re: [HACKERS] Remove superuser() checks from pgstattuple

2016-09-01 Thread Peter Eisentraut
On 8/23/16 5:22 PM, Stephen Frost wrote:
> Now that we track initial privileges on extension objects and changes to
> those permissions, we can drop the superuser() checks from the various
> functions which are part of the pgstattuple extension.
> 
> Since a pg_upgrade will preserve the version of the extension which
> existed prior to the upgrade, we can't simply modify the existing
> functions but instead need to create new functions which remove the
> checks and update the SQL-level functions to use the new functions

I think this is a good change to pursue, and we'll likely want to do
more similar changes in contrib.  But I'm worried that what is logically
a 10-line change will end up a 20 KiB patch every time.

Have we explored other options for addressing the upgrade problems?
Maybe the function could check that non-default privileges have been
granted?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PostgreSQL 10 kick-off

2016-09-01 Thread Fabrízio de Royes Mello
On Thu, Sep 1, 2016 at 1:02 PM, Magnus Hagander  wrote:
>
> On Sep 1, 2016 17:44, "Fabrízio de Royes Mello" 
wrote:
> >
> >
> >
> > On Thu, Sep 1, 2016 at 8:35 AM, Tom Lane  wrote:
> > >
> > > Magnus Hagander  writes:
> > > >> On Thu, Aug 4, 2016 at 12:09 AM, Fabrízio de Royes Mello
> > > >>  wrote:
> > >>> If there are no complaints about my lack of experience in this field I
> > >>> would like to become the next CFM (am I the first brazilian??)
> > >
> > > > Did we make a decision on this? Cf is starting now, are we
appointing
> > > > Fabrizio as the cf manager?
> > >
> > > I didn't hear any other volunteers, so he's it.
> > >
> >
> > Well... I was waiting for an approval from core team... then now I am
the CF manager...
>
> Yup,  clearly you have it now :-)
>

And here we go!!!


> >
> > I'm getting some help from Alvaro to get the admin grants in CommitFest
App to start this party.
>
> Sounds good and don't hesitate to ask any of us if you are wondering
about something or need some help.
>

Alvaro sent a message by IM that I've been granted admin in the commitfest app,
but something seems wrong... The administration "link/button" doesn't appear
for me yet... What's the correct way to get these rights?

Thanks in advance!

--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Peter Eisentraut
On 8/16/16 3:39 AM, Haribabu Kommi wrote:
> Yes, we need to consider many parameters as a system load, not just only
> the CPU. Here I attached a POC patch that implements the CPU load
> calculation and decide the number of workers based on the available CPU
> load. The load calculation code is not an optimized one, there are many ways
> that can used to calculate the system load. This is just for an example.

I see a number of discussion points here:

We don't yet have enough field experience with the parallel query
facilities to know what kind of use patterns there are and what systems
for load management we need.  So I think building a highly specific
system like this seems premature.  We have settings to limit process
numbers, which seem OK as a start, and those knobs have worked
reasonably well in other areas (e.g., max connections, autovacuum).  We
might well want to enhance this area, but we'll need more experience and
information.

If we think that checking the CPU load is a useful way to manage process
resources, why not apply this to more kinds of processes?  I could
imagine that limiting connections by load could be useful.  Parallel
workers is only one specific niche of this problem.

As I just wrote in another message in this thread, I don't trust system
load metrics very much as a gatekeeper.  They are reasonable for
long-term charting to discover trends, but there are numerous potential
problems for using them for this kind of resource control thing.

All of this seems very platform specific, too.  You have
Windows-specific code, but the rest seems very Linux-specific.  The
dstat tool I had never heard of before.  There is stuff with cgroups,
which I don't know how portable they are across different Linux
installations.  Something about Solaris was mentioned.  What about the
rest?  How can we maintain this in the long term?  How do we know that
these facilities actually work correctly and not cause mysterious problems?

There is a bunch of math in there that is not documented much.  I can't
tell without reverse engineering the code what any of this is supposed
to do.

My suggestion is that we focus on refining the process control numbers
that we already have.  We had extensive discussions about that during
9.6 beta.  We have related patches in the commit fest right now.  Many
ideas have been posted.  System admins are generally able to count their
CPUs and match that to the number of sessions and jobs they need to run.
 Everything beyond that could be great but seems premature before we
have the basics figured out.

Maybe a couple of hooks could be useful to allow people to experiment
with this.  But the hooks should be more general, as described above.
But I think a few GUC settings that can be adjusted at run time could be
sufficient as well.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: [HACKERS] new gcc warning

2016-09-01 Thread Pavel Stehule
2016-09-01 18:40 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > 2016-09-01 14:31 GMT+02:00 Tom Lane :
> >> That should have gone away in commit a2fd62dd5.  What version are
> >> you looking at?
>
> > I am checking 9.5 branch and I cannot to find this commit there
>
> Hmm ... it wasn't back-patched, evidently.  We don't have a clear
> project policy on whether to back-patch changes that are only meant
> to silence useless warnings, but it seems reasonable to push this
> one, since gcc 6 is getting more widespread.
>

Thank you

Pavel


>
> regards, tom lane
>


Re: [HACKERS] System load consideration before spawning parallel workers

2016-09-01 Thread Peter Eisentraut
On 8/1/16 2:17 AM, Gavin Flower wrote:
> Possibly look how make does it with the '-l' flag?
> 
> '-l 8' don't start more process when load is 8 or greater, works on 
> Linux at least...

The problem with that approach is that it takes about a minute for the
load averages figures to be updated, by which time you have already
thrashed your system.

You can try this out by building PostgreSQL this way.  Please save your
work first, because you might have to hard-reboot your system.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] new gcc warning

2016-09-01 Thread Tom Lane
Pavel Stehule  writes:
> 2016-09-01 14:31 GMT+02:00 Tom Lane :
>> That should have gone away in commit a2fd62dd5.  What version are
>> you looking at?

> I am checking 9.5 branch and I cannot to find this commit there

Hmm ... it wasn't back-patched, evidently.  We don't have a clear
project policy on whether to back-patch changes that are only meant
to silence useless warnings, but it seems reasonable to push this
one, since gcc 6 is getting more widespread.

regards, tom lane




Re: [HACKERS] new gcc warning

2016-09-01 Thread Pavel Stehule
2016-09-01 14:31 GMT+02:00 Tom Lane :

> Pavel Stehule  writes:
> > I see a new warning in upstream
>
> > r/include/libxml2   -c -o path.o path.c
> > path.c: In function ‘has_drive_prefix’:
> > path.c:89:26: warning: self-comparison always evaluates to false
> > [-Wtautological-compare]
> >   return skip_drive(path) != path;
> >   ^~
>
> That should have gone away in commit a2fd62dd5.  What version are
> you looking at?
>
>
I am checking 9.5 branch and I cannot to find this commit there

Regards

Pavel


> regards, tom lane
>


Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Robert Haas
On Thu, Sep 1, 2016 at 4:12 PM, Simon Riggs  wrote:
> So in my current understanding, a hinted change has by definition no
> WAL record, so we just ship a FPW.

Hmm.  An FPW would have to be contained in a WAL record, so it can't
be right to say that we ship an FPW for lack of a WAL record.  I think
what we ship is nothing at all when wal_log_hints is disabled, and
when wal_log_hints is enabled we log an FPW once per checkpoint.

> A non-hint change has a WAL record
> and it is my (possibly naive) hope that all changes to a page are
> reflected in the WAL record/replay,

I hope for this, too.

> so we can just make a simple
> comparison without caring what is the rmgr of the WAL record.

Sure, that is 100% possible, and likely a good idea as far as the
behavior on the standby is concerned.  What's not so clear is whether
a simple on/off switch is a wise plan on the master.

The purpose of this code, as I understand it, is to check for
discrepancies between "do" and "redo"; that is, to verify that the
changes made to the buffer at the time the WAL record is generated
produce the same result as replay of that WAL record on the standby.
To accomplish this purpose, a post-image of the affected buffers is
included in each and every WAL record.  On replay, that post-image can
be compared with the result of replay.  If they differ, PostgreSQL has
a bug.  I would not expect many users to run this in production,
because it will presumably be wicked expensive.  If I recall
correctly, doing full page writes once per buffer per checkpoint, the
space taken up by FPWs is >75% of WAL volume.  Doing it for every
record will be exponentially more expensive.  The primary audience of
this feature is PostgreSQL developers, who might want to use it to try
to verify that, for example, Amit's patch to add write-ahead logging
for hash indexes does not have bugs.[1]

Indeed, it had occurred to me that we might not even want to compile
this code into the server unless WAL_DEBUG is defined; after all, how
does it help a regular user to detect that the server has a bug?  Bug
or no bug, that's the code they've got.  But on further reflection, it
seems like it could be useful: if we suspect a bug in the redo code
but we can't reproduce it here, we could ask the customer to turn this
option on to see whether it produces logging indicating the nature of
the problem.  However, because of the likely expense of enabling the
feature, it seems like it would be quite desirable to limit the
expense of generating many extra FPWs to the affected rmgr.  For
example, if a user has a table with a btree index and a gin index, and
we suspect a bug in GIN, it would be nice for the user to be able to
enable the feature *only for GIN* rather than paying the cost of
enabling it for btree and heap as well.[2]

Similarly, when we imagine a developer using this feature to test for
bugs, it may at times be useful to enable it across-the-board to look
for bugs in any aspect of the write-ahead logging system. However, at
other times, when the goal is to find bugs in a particular AM, it
might be useful to enable it only for the corresponding rmgr.  It is
altogether likely that this feature will slow the system down quite a
lot.  If enabling this feature for hash indexes also means enabling it
for the heap, the incremental performance hit might be sufficient to
mask concurrency-related bugs in the hash index code that would
otherwise have been found.  So, I think having at least some basic
filtering is probably a pretty smart idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

[1] It probably has bugs.
[2] One could of course add filtering considerably more complex than
per-rmgr - e.g. enabling it for only one particular relfilenode on a
busy production system might be rather desirable.  But I'm not sure we
really want to go there.  It adds a fair amount of complexity to a
feature that many people are obviously hoping will be quite simple to
use.




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2016-09-01 Thread Bruce Momjian
On Thu, Sep  1, 2016 at 02:37:40PM +0530, Pavan Deolasee wrote:
> I like the simplified approach, as long as it doesn't block further
> improvements.
>
> 
> 
> Yes, the proposed approach is simple yet does not stop us from improving 
> things
> further. Moreover it has shown good performance characteristics and I believe
> it's a good first step.

Agreed.  This is BIG.  Do you think it can be done for PG 10?

> Thanks. What's also interesting and something that headline numbers don't show
> is that WARM TPS is as much as 3 times of master TPS when the percentage of
> WARM updates is very high. Notice the spike in TPS in the comparison graph.
> 
> Results with non-default heap fill factor are even better. In both cases, the
> improvement in TPS stays constant over long periods. 

Yes, I expect the benefits of this to show up in better long-term
performance.

> > During first heap scan of VACUUM, we look for tuples with HEAP_WARM_TUPLE set.
> > If all live tuples in the chain are either marked with Blue flag or Red flag
> > (but no mix of Red and Blue), then the chain is a candidate for HOT conversion.
> 
> Uh, if the chain is all blue, then there are no WARM entries, so it is
> already a HOT chain, so there is nothing to do, right?
> 
> 
> For aborted WARM updates, the heap chain may be all blue, but there may still
> be a red index pointer which must be cleared before we allow further WARM
> updates to the chain.

Ah, understood now.  Thanks.

> Why not just remember the tid of chains converted from WARM to HOT, then
> use "amrecheck" on an index entry matching that tid to see if the index
> matches one of the entries in the chain. 
> 
> 
> That will require random access to heap during index vacuum phase, something I
> would like to avoid. But we can have that as a fall back solution for handling
> aborted vacuums. 

Yes, that is true.  So the challenge is figuring how which of the index
entries pointing to the same tid is valid, and coloring helps with that?

> (It will match all of them or
> none of them, because they are all red.)  I don't see a point in
> coloring the index entries as reds as later you would have to convert to
> blue in the WARM-to-HOT conversion, and a vacuum crash could lead to
> inconsistencies. 
> 
> 
> Yes, that's a concern since the conversion of red to blue will also need to be
> WAL-logged to ensure that a crash doesn't leave us in an inconsistent state. I
> still think that this will be an overall improvement as compared to allowing
> one WARM update per chain.

OK.  I will think some more on this to see if I can come with another
approach.

>  
> 
> Consider that you can just call "amrecheck" on the few
> chains that have converted from WARM to HOT.  I believe this is more
> crash-safe too.  However, if you have converted WARM to HOT in the heap,
> but crash during the index entry removal, you could potentially have
> duplicates in the index later, which is bad.
>
> As you probably already noted, we clear heap flags only after all indexes are
> cleared of duplicate entries and hence a crash in between should not cause any
> correctness issue. As long as heap tuples are marked as warm, amrecheck will
> ensure that only valid tuples are returned to the caller.

OK, got it.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I. As I am, so you will be. +
+ Ancient Roman grave inscription +




Re: [HACKERS] PostgreSQL 10 kick-off

2016-09-01 Thread Magnus Hagander
On Sep 1, 2016 17:44, "Fabrízio de Royes Mello" 
wrote:
>
>
>
> On Thu, Sep 1, 2016 at 8:35 AM, Tom Lane  wrote:
> >
> > Magnus Hagander  writes:
> > >> On Thu, Aug 4, 2016 at 12:09 AM, Fabrízio de Royes Mello
> > >>  wrote:
> > >>> If there are no complaints about my lack of experience in this field I
> > >>> would like to become the next CFM (am I the first brazilian??)
> >
> > > Did we make a decision on this? Cf is starting now, are we appointing
> > > Fabrizio as the cf manager?
> >
> > I didn't hear any other volunteers, so he's it.
> >
>
> Well... I was waiting for an approval from core team... then now I am the
CF manager...

Yup,  clearly you have it now :-)

>
> I'm getting some help from Alvaro to get the admin grants in CommitFest
App to start this party.

Sounds good and don't hesitate to ask any of us if you are wondering about
something or need some help.

>
> In a few hours I'll send the emails and officially start the commitfest.

Excellent!

/Magnus


Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-09-01 Thread Tom Lane
Maksim Milyutin  writes:
>> On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin wrote:
>> Yes, but the problem is that nothing gives you the guarantee that at the
>> moment you decide to handle the interrupt, the QueryDesc structures
>> you're looking at are in a good shape for Explain* functions to run on
>> them.  Even if that appears to be the case in your testing now, there's
>> no way to tell if that will be the case at any future point in time.

> CHECK_FOR_INTERRUPTS calls are located in places where the query state
> (the QueryDesc structure) is more or less consistent.

Really?  Even if that's 100% true today, which I wouldn't bet very much
on, it seems like a really dangerous property to insist on system-wide.
The only restriction we have ever placed on CHECK_FOR_INTERRUPTS is that
it occur at places where it'd be safe to throw elog(ERROR), and in general
we don't assume that the active query is still in a usable state after
an error.  What you propose would amount to a new restriction that nothing
can ever call any nontrivial subroutine while the active query tree is
less than fully valid (because the subroutine might contain a
CHECK_FOR_INTERRUPTS somewhere).  That sounds impractical and
unenforceable.

regards, tom lane




Re: [HACKERS] PostgreSQL 10 kick-off

2016-09-01 Thread Fabrízio de Royes Mello
On Thu, Sep 1, 2016 at 8:35 AM, Tom Lane  wrote:
>
> Magnus Hagander  writes:
> >> On Thu, Aug 4, 2016 at 12:09 AM, Fabrízio de Royes Mello wrote:
> >>> If there are no complaints about my lack of experience in this field I
> >>> would like to become the next CFM (am I the first brazilian??)
>
> > Did we make a decision on this? Cf is starting now, are we appointing
> > Fabrizio as the cf manager?
>
> I didn't hear any other volunteers, so he's it.
>

Well... I was waiting for an approval from core team... then now I am the
CF manager...

I'm getting some help from Alvaro to get the admin grants in CommitFest App
to start this party.

In a few hours I'll send the emails and officially start the commitfest.

Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello


Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-09-01 Thread Maksim Milyutin



On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin wrote:

On Mon, Aug 29, 2016 at 5:22 PM, maksim wrote:

Hi, hackers!

Now I have completed an extension that provides a facility to see the
current state of query execution running in an external session, in
the form of EXPLAIN ANALYZE output. This extension works on version
9.5; for 9.6 and later it doesn't support detailed statistics for
parallel nodes yet.

I want to present patches to the latest version of the PostgreSQL
core to enable this extension.

Hello,

Did you publish the extension itself yet?


Hello, the extension for version 9.5 is available in the repository
https://github.com/postgrespro/pg_query_state/tree/master.

Last year (actually, exactly one year ago) I was trying to do
something very similar, and it quickly turned out that signals are
not the best way to make this sort of inspection.  You can find the
discussion here:

https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM=xx_hq7qq5ndr4jpo+h-o5vp6i...@mail.gmail.com




Thanks for link!

My patch *custom_signal.patch* resolves the problem of «heavy»
signal handlers. In essence, I follow the course offered in the
*procsignal.c* file: *ProcSignalReason* values define the reasons
onto which SIGUSR1 is multiplexed. Receipt of the signal only sets
flags for *ProcessInterrupt* to act on, i.e.
procsignal_sigusr1_handler() merely sets specific flags. When
CHECK_FOR_INTERRUPTS is reached later during query execution,
*ProcessInterrupt* is called, and only then is the user-defined
signal handler executed.
As a result, we have deferred signal handling.
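The deferred scheme described above can be sketched in plain C as follows (all names here are invented for illustration; this is not the actual procsignal.c code):

```c
#include <signal.h>
#include <stdbool.h>

/* Illustrative sketch of deferred signal handling: the handler only sets
 * a flag; the real work happens later, at a safe point analogous to
 * CHECK_FOR_INTERRUPTS().  Names are made up, not procsignal.c code. */
static volatile sig_atomic_t query_state_requested = false;
static int	states_sent = 0;

static void
sigusr1_sketch_handler(int signo)
{
	(void) signo;
	query_state_requested = true;	/* defer: just set a flag */
}

static void
check_for_interrupts_sketch(void)
{
	if (query_state_requested)
	{
		query_state_requested = false;
		states_sent++;			/* stand-in for sending EXPLAIN output */
	}
}

static int
run_sketch(void)
{
	signal(SIGUSR1, sigusr1_sketch_handler);
	raise(SIGUSR1);				/* nothing happens yet... */
	check_for_interrupts_sketch();	/* ...handled here, at a safe point */
	return states_sent;
}
```

The handler itself stays async-signal-safe because it touches only a `volatile sig_atomic_t` flag.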


Yes, but the problem is that nothing gives you the guarantee that at the
moment you decide to handle the interrupt, the QueryDesc structures
you're looking at are in a good shape for Explain* functions to run on
them.  Even if that appears to be the case in your testing now, there's
no way to tell if that will be the case at any future point in time.


CHECK_FOR_INTERRUPTS calls are located in places where the query state
(the QueryDesc structure) is more or less consistent. At these macro
calls I pass the QueryDesc to the Explain* functions. I know that the
elementary statistics-updating functions (e.g. InstrStartNode,
InstrStopNode, etc.) don't contain CHECK_FOR_INTERRUPTS within
themselves, therefore statistics at least at the node level are
consistent when the backend is ready to transfer its state.
The problem may be in the interpretation of collected statistics in the
Explain* functions. In my practice there was a case where the wrong
number of inserted rows was shown for an INSERT ON CONFLICT request.
That request consisted of two parts: a SELECT from a table and an
INSERT with a check on a predicate, and I was interrupted between
these parts. The formula for inserted rows was the number of rows
extracted by the SELECT minus the rows rejected by the INSERT, so I got
an imaginary inserted row. I removed the printing of the number of
inserted rows from the explain of a running query, because I don't know
whether the INSERT node has processed that last row. But the remaining
number of rejected rows was deterministic, so I showed it.



Another problem is the use of the shm_mq facility, because it
manipulates the state of the process latch.  This is not supposed to
happen to a backend happily performing its tasks, at random due to
external factors, and this is what the proposed approach introduces


In the Postgres source code, most WaitLatch() calls on the process
latch are surrounded by a loop and form a pattern like this:


  for (;;)
  {
      if (desired_state_has_occurred)
          break;

      WaitLatch(MyLatch);
      CHECK_FOR_INTERRUPTS();
      ResetLatch(MyLatch);
  }

The motivation for this pattern is clearly illustrated by an extract
from a comment in the Postgres core: usage of "the generic process
latch has to be robust against unrelated wakeups: Always check that
the desired state has occurred, and wait again if not"[1].


I mean that a random setting of the process latch at query-execution
time doesn't affect other usages of that latch later in the code.



1. src/backend/storage/lmgr/proc.c:1724 for ProcWaitForSignal function

--
Maksim Milyutin
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company




Re: [HACKERS] Missing checks when malloc returns NULL...

2016-09-01 Thread Tom Lane
Michael Paquier  writes:
> On Thu, Sep 1, 2016 at 10:03 PM, Tom Lane  wrote:
>> Don't see the point really ... it's just more API churn without any
>> very compelling reason.

> OK, then no objection to your approach. At least I tried.

OK, pushed my version then.  I think we have now dealt with everything
mentioned in this thread except for the issues in pg_locale.c.  Are
you planning to tackle that?

regards, tom lane




Re: [HACKERS] PATCH: Exclude additional directories in pg_basebackup

2016-09-01 Thread Peter Eisentraut
On 8/15/16 3:39 PM, David Steele wrote:
> That patch got me thinking about what else could be excluded and after
> some investigation I found the following: pg_notify, pg_serial,
> pg_snapshots, pg_subtrans.  These directories are all cleaned, zeroed,
> or rebuilt on server start.
> 
> The attached patch adds these directories to the pg_basebackup
> exclusions and takes an array-based approach to excluding directories
> and files during backup.

We do support other backup methods besides using pg_basebackup.  So I
think we need to document the required or recommended handling of each
of these directories.  And then pg_basebackup should become a consumer
of that documentation.

The current documentation on this is at
,
which covers pg_xlog and pg_replslot.  I think that documentation should
be expanded, maybe with a simple list that is easy to copy into an
exclude file, followed by more detail on each directory.
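For illustration, the array-based approach could look roughly like the sketch below (the directory list follows the mail above; the function name and the per-entry comments are illustrative, not the actual patch):

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Sketch of an array-driven exclusion check.  Whether the directory's
 * contents or the directory itself is skipped is a per-entry detail the
 * real patch has to decide; this only shows the lookup. */
static const char *const exclude_dir_contents[] = {
	"pg_notify",				/* rebuilt on server start */
	"pg_serial",				/* cleaned on server start */
	"pg_snapshots",				/* cleaned on server start */
	"pg_subtrans",				/* zeroed on server start */
	"pg_xlog",					/* handled separately / streamed */
	"pg_replslot",				/* not useful on a fresh standby */
	NULL
};

static bool
dir_contents_excluded(const char *dirname)
{
	for (size_t i = 0; exclude_dir_contents[i] != NULL; i++)
	{
		if (strcmp(dirname, exclude_dir_contents[i]) == 0)
			return true;
	}
	return false;
}
```

The same table could then be rendered into the documentation list Peter suggests, so the backup tools and the docs stay in sync.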

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Postgres abort found in 9.3.11

2016-09-01 Thread Tom Lane
"K S, Sandhya (Nokia - IN/Bangalore)"  writes:
> Our setup is a hot-standby architecture. This crash is occurring only on 
> stand-by node. Postgres continues to run without any issues on active node.
> Postmaster is waiting for a start and is throwing this message.

> Aug 22 11:44:21.462555 info node-0 postgres[8222]: [1-2] HINT:  Is another 
> postmaster already running on port 5433? If not, wait a few seconds and 
> retry.  
> Aug 22 11:44:52.065760 crit node-1 postgres[8629]: [18-1] err-3:
> btree_xlog_delete_get_latestRemovedXid: cannot operate with inconsistent data
> Aug 22 11:44:52.065971 crit CFPU-1 postgres[8629]: [18-2] CONTEXT:  xlog
> redo delete: index 1663/16386/17378; iblk 1, heap 1663/16386/16518;

Hmm, that HINT seems to be the tail end of a message indicating that the
postmaster is refusing to start because of an existing postmaster.  Why
is that appearing?  If you've got some script that's overeagerly launching
and killing postmasters, maybe that's the ultimate cause of problems.

The only method I've heard of for getting that get_latestRemovedXid
error is to try to launch a standalone backend (postgres --single)
in a standby server directory.  We don't support that, cf
https://www.postgresql.org/message-id/flat/00F0B2CEF6D0CEF8A90119D4%40eje.credativ.lan

BTW, I'm curious about the "err-3:" part.  That would not be expected
in any standard build of Postgres ... is this something custom modified?

regards, tom lane




Re: [HACKERS] autonomous transactions

2016-09-01 Thread Constantin S. Pan
On Wed, 31 Aug 2016 14:46:30 +0100
Greg Stark  wrote:

> On Wed, Aug 31, 2016 at 2:50 AM, Peter Eisentraut
>  wrote:
> > - A API interface to open a "connection" to a background worker, run
> > queries, get results: AutonomousSessionStart(),
> > AutonomousSessionEnd(), AutonomousSessionExecute(), etc.  The
> > communication happens using the client/server protocol.
> 
> I'm surprised by the background worker. I had envisioned autonomous
> transactions being implemented by allowing a process to reserve a
> second entry in PGPROC with the same pid. Or perhaps save its existing
> information in a separate pgproc slot (or stack of slots) and
> restoring it after the autonomous transaction commits.
> 
> Using a background worker mean that the autonomous transaction can't
> access any state from the process memory. Parameters in plpgsql are a
> symptom of this but I suspect there will be others. What happens if a
> statement timeout occurs during an autonomous transaction? What
> happens if you use a pl language in the autonomous transaction and if
> it tries to use non-transactional information such as prepared
> statements?

I am trying to implement autonomous transactions that way. I
have already implemented suspending and restoring the parent
transaction state, at least some of it. The next thing on
the plan is the procarray/snapshot stuff. I think we should
reuse the same PGPROC for the autonomous transaction, and
allocate a stack of PGXACTs for the case of nested
autonomous transactions.
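As a rough illustration of the suspend/restore discipline described above (all types, fields, and function names here are invented, not actual PGPROC/PGXACT code):

```c
#include <stdbool.h>

#define MAX_AUTOXACT_DEPTH 8

/* Purely illustrative sketch: suspend the parent transaction's state on
 * entry to an autonomous transaction, restore it on exit.  A real
 * implementation would save far more state than this. */
typedef struct XactStateSketch
{
	unsigned	xid;
	int			nest_level;
} XactStateSketch;

static XactStateSketch saved[MAX_AUTOXACT_DEPTH];
static int	depth = 0;

static bool
suspend_parent_xact(XactStateSketch cur)
{
	if (depth >= MAX_AUTOXACT_DEPTH)
		return false;			/* autonomous transactions nested too deeply */
	saved[depth++] = cur;
	return true;
}

static XactStateSketch
resume_parent_xact(void)
{
	return saved[--depth];		/* caller must ensure depth > 0 */
}
```

The stack depth here stands in for the "stack of PGXACTs" idea: each nested autonomous transaction pushes one more saved state.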

Solving the more general problem, running multiple
concurrent transactions with a single backend, may also be
interesting for some users. Autonomous transactions would
then be just a use case for that feature.

Regards,
Constantin Pan
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [HACKERS] Choosing parallel_degree

2016-09-01 Thread Simon Riggs
On 12 April 2016 at 14:11, Simon Riggs  wrote:
> On 12 April 2016 at 13:53, Robert Haas  wrote:
>>
>> > On Mon, Apr 11, 2016 at 5:45 PM, Simon Riggs wrote:
>> > On 8 April 2016 at 17:49, Robert Haas  wrote:
>> >> With the patch, you can - if you wish - substitute
>> >> some other number for the one the planner comes up with.
>> >
>> > I saw you're using AccessExclusiveLock, the reason being it affects
>> > SELECTs.
>> >
>> > That is supposed to apply when things might change the answer from a
>> > SELECT,
>> > whereas this affects only the default for a plan.
>> >
>> > Can I change this to a lower setting? I would have done this before
>> > applying
>> > the patch, but you beat me to it.
>>
>> I don't have a problem with reducing the lock level there, if we're
>> convinced that it's safe.
>
>
> I'll run up a patch, with appropriate comments.

Attached

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


reduce_lock_levels_incl_comments.patch
Description: Binary data



Re: [HACKERS] Missing checks when malloc returns NULL...

2016-09-01 Thread Michael Paquier
On Thu, Sep 1, 2016 at 10:03 PM, Tom Lane  wrote:
> Michael Paquier  writes:
>> Also, we could take one extra step forward then, and just introduce
>> ShmemAllocExtended that holds two flags as per the attached:
>> - SHMEM_ALLOC_ZERO that zeros all the fields
>> - SHMEM_ALLOC_NO_OOM that does not fail
>
> Don't see the point really ... it's just more API churn without any
> very compelling reason.

OK, then no objection to your approach. At least I tried.
-- 
Michael




Re: [HACKERS] GIN logging GIN_SEGMENT_UNMODIFIED actions?

2016-09-01 Thread Tom Lane
Fujii Masao  writes:
> I applied your suggested changes to the patch. Patch attached.

That looks pretty sane to me (but I just eyeballed it, didn't test).

One further minor improvement would be to rearrange the
XLOG_GIN_VACUUM_DATA_LEAF_PAGE case so that we don't bother calling
XLogRecGetBlockData() if there's a full-page image.

regards, tom lane




Re: [HACKERS] Logical decoding slots can go backwards when used from SQL, docs are wrong

2016-09-01 Thread Simon Riggs
On 20 August 2016 at 15:04, Petr Jelinek  wrote:
> On 20/08/16 16:01, Craig Ringer wrote:
>>
>> On 5 June 2016 at 09:54, David G. Johnston wrote:
>>
>> On Thursday, March 17, 2016, Craig Ringer wrote:
>>
>> The first patch was incorrectly created on top of failover slots
>> not HEAD. Attached patch applies on HEAD.
>>
>>
>> Lots of logical decoding work ongoing but this one shows as active
>> in the September cf and the comments by Craig indicate potential
>> relevance to 9.6.  Is this still live as-is or has it been subsumed
>> by other threads?
>>
>>
>>
>> Yes. Both those patches are still pending and should be considered for
>> commit in the next CF. (Of course, I think they should just be
>> committed, but I would, wouldn't I?)
>>
>>
>
> I think the doc one should definitely go in and possibly be back-patched all
> the way to 9.4. I didn't look at the other one.

I agree the doc patch should go in, though I suggest rewording it
slightly, as in the attached patch.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Doc-correction-each-change-once.v2.patch
Description: Binary data



Re: [HACKERS] GIN logging GIN_SEGMENT_UNMODIFIED actions?

2016-09-01 Thread Fujii Masao
On Wed, Aug 31, 2016 at 8:32 PM, Tom Lane  wrote:
> Fujii Masao  writes:
>> On Wed, Aug 31, 2016 at 12:10 AM, Tom Lane  wrote:
>>> Hmm, comparing gin_desc() to ginRedoInsert() makes me think there are more
>>> problems there than that one.  Aren't the references to "payload" wrong
>>> in all three branches of that "if" construct, not just the middle one?
>
>> If we do this, the extra information like ginxlogInsertEntry->isDelete will
>> never be reported when the record has FPW.
>
> I'm happy to have it print whatever is there, but are you sure that the
> information is even there?  I thought that this chunk of the WAL record
> would get optimized away if we write an FPW image instead.

I was thinking that this optimization logic was changed for logical decoding.
That understanding is right for heap, but not for GIN; i.e., you're right:
such a data chunk for GIN would be removed from the WAL record if an FPW is taken.
I applied your suggested changes to the patch. Patch attached.

Regards,

-- 
Fujii Masao
*** a/src/backend/access/rmgrdesc/gindesc.c
--- b/src/backend/access/rmgrdesc/gindesc.c
***
*** 87,99  gin_desc(StringInfo buf, XLogReaderState *record)
  		case XLOG_GIN_INSERT:
  			{
  ginxlogInsert *xlrec = (ginxlogInsert *) rec;
- char	   *payload = rec + sizeof(ginxlogInsert);
  
  appendStringInfo(buf, "isdata: %c isleaf: %c",
  			  (xlrec->flags & GIN_INSERT_ISDATA) ? 'T' : 'F',
  			 (xlrec->flags & GIN_INSERT_ISLEAF) ? 'T' : 'F');
  if (!(xlrec->flags & GIN_INSERT_ISLEAF))
  {
  	BlockNumber leftChildBlkno;
  	BlockNumber rightChildBlkno;
  
--- 87,99 
  		case XLOG_GIN_INSERT:
  			{
  ginxlogInsert *xlrec = (ginxlogInsert *) rec;
  
  appendStringInfo(buf, "isdata: %c isleaf: %c",
  			  (xlrec->flags & GIN_INSERT_ISDATA) ? 'T' : 'F',
  			 (xlrec->flags & GIN_INSERT_ISLEAF) ? 'T' : 'F');
  if (!(xlrec->flags & GIN_INSERT_ISLEAF))
  {
+ 	char	   *payload = rec + sizeof(ginxlogInsert);
  	BlockNumber leftChildBlkno;
  	BlockNumber rightChildBlkno;
  
***
*** 104,130  gin_desc(StringInfo buf, XLogReaderState *record)
  	appendStringInfo(buf, " children: %u/%u",
  	 leftChildBlkno, rightChildBlkno);
  }
! if (!(xlrec->flags & GIN_INSERT_ISDATA))
! 	appendStringInfo(buf, " isdelete: %c",
! 	(((ginxlogInsertEntry *) payload)->isDelete) ? 'T' : 'F');
! else if (xlrec->flags & GIN_INSERT_ISLEAF)
! {
! 	ginxlogRecompressDataLeaf *insertData =
! 	(ginxlogRecompressDataLeaf *) payload;
! 
! 	if (XLogRecHasBlockImage(record, 0))
! 		appendStringInfoString(buf, " (full page image)");
! 	else
! 		desc_recompress_leaf(buf, insertData);
! }
  else
  {
! 	ginxlogInsertDataInternal *insertData = (ginxlogInsertDataInternal *) payload;
  
! 	appendStringInfo(buf, " pitem: %u-%u/%u",
! 			 PostingItemGetBlockNumber(&insertData->newitem),
! 		 ItemPointerGetBlockNumber(&insertData->newitem.key),
! 	   ItemPointerGetOffsetNumber(&insertData->newitem.key));
  }
  			}
  			break;
--- 104,130 
  	appendStringInfo(buf, " children: %u/%u",
  	 leftChildBlkno, rightChildBlkno);
  }
! if (XLogRecHasBlockImage(record, 0))
! 	appendStringInfoString(buf, " (full page image)");
  else
  {
! 	char	   *payload = XLogRecGetBlockData(record, 0, NULL);
  
! 	if (!(xlrec->flags & GIN_INSERT_ISDATA))
! 		appendStringInfo(buf, " isdelete: %c",
! 		 (((ginxlogInsertEntry *) payload)->isDelete) ? 'T' : 'F');
! 	else if (xlrec->flags & GIN_INSERT_ISLEAF)
! 		desc_recompress_leaf(buf, (ginxlogRecompressDataLeaf *) payload);
! 	else
! 	{
! 		ginxlogInsertDataInternal *insertData =
! 			(ginxlogInsertDataInternal *) payload;
! 
! 		appendStringInfo(buf, " pitem: %u-%u/%u",
! 		 PostingItemGetBlockNumber(&insertData->newitem),
! 		 ItemPointerGetBlockNumber(&insertData->newitem.key),
! 		 ItemPointerGetOffsetNumber(&insertData->newitem.key));
! 	}
  }
  			}
  			break;
***
*** 144,150  gin_desc(StringInfo buf, XLogReaderState *record)
  			break;
  		case XLOG_GIN_VACUUM_DATA_LEAF_PAGE:
  			{
! ginxlogVacuumDataLeafPage *xlrec = (ginxlogVacuumDataLeafPage *) rec;
  
  if (XLogRecHasBlockImage(record, 0))
  	appendStringInfoString(buf, " (full page image)");
--- 144,151 
  			break;
  		case XLOG_GIN_VACUUM_DATA_LEAF_PAGE:
  			{
! ginxlogVacuumDataLeafPage *xlrec =
! 	(ginxlogVacuumDataLeafPage *) XLogRecGetBlockData(record, 0, NULL);
  
  if (XLogRecHasBlockImage(record, 0))
  	appendStringInfoString(buf, " (full page image)");



Re: [HACKERS] Missing checks when malloc returns NULL...

2016-09-01 Thread Tom Lane
Michael Paquier  writes:
> Also, we could take one extra step forward then, and just introduce
> ShmemAllocExtended that holds two flags as per the attached:
> - SHMEM_ALLOC_ZERO that zeros all the fields
> - SHMEM_ALLOC_NO_OOM that does not fail

Don't see the point really ... it's just more API churn without any
very compelling reason.

(It doesn't look like you correctly implemented the case of both
flags being set, either.)

regards, tom lane




Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Thank you for your comments, Tom.

> Modern compilers are generally able to make their own decisions about it, and 
> trying to put your thumb on the scales this heavily is not likely to improve 
> the code.
Well, I have tested that the "static inline" combination affects
index-build performance on the order of 5%, though I didn't test
with "static" only.
AFAIK the compiler cannot prove that the function's input and output
arrays do not intersect, so it emits lots of writes to the output
address inside the loop body.

>That pack_float function is absolutely not acceptable
Well, possibly there are ways to achieve penalty optimization without
such hacks, but it failed for pg_sphere and in the PostGIS code. They
reverted the plain arithmetic optimization without bit hacks because it
didn't work. This one works.
There is another way: extending the GiST API. But I'm not sure it is
possible to do so without breaking compatibility with many existing
extensions.
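For context, the kind of bit trick under discussion looks roughly like the sketch below; it is an illustration, not the submitted function, and it assumes IEEE 754 single precision and non-negative inputs. The idea: for non-negative IEEE floats the bit pattern, read as an unsigned integer, is monotone, so shifting the value bits right and placing a small "realm" tag in the freed high bits keeps any value in a higher realm sorting above any value in a lower realm.

```c
#include <stdint.h>
#include <string.h>

/* Illustrative sketch only: pack a 2-bit "realm" tag into a float's bit
 * pattern, assuming IEEE 754 binary32 and value >= 0.  The two lowest
 * mantissa bits are sacrificed for the tag. */
static float
pack_float_sketch(float value, unsigned realm)	/* realm in 0..3 */
{
	uint32_t	bits;

	memcpy(&bits, &value, sizeof(bits));	/* type-pun without UB */
	/* keep the sign bit, shift the 31 value bits right by two,
	 * and put the realm tag into bits 29..30 */
	bits = (bits & 0x80000000u) | ((bits & 0x7FFFFFFFu) >> 2) |
		(realm << 29);
	memcpy(&value, &bits, sizeof(bits));
	return value;
}
```

Within one realm, ordering is preserved up to the two dropped low bits; across realms, the tag dominates, which is the whole point of the trick — and also why it is inherently tied to the IEEE 754 representation.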

Best regards, Andrey Borodin, Octonica & Ural Federal University.




Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Tom Lane
Andrew Borodin  writes:
> Does every supported Postgres platform conform to the IEEE 754
> floating-point specification?

Possibly, but it is against project policy to allow code to assume so.
That pack_float function is absolutely not acceptable IMV, and would
still be if it were adequately documented, which it's far away from
being.

On general grounds, I would forget all the "inline"s.  Modern compilers
are generally able to make their own decisions about it, and trying to put
your thumb on the scales this heavily is not likely to improve the code.

Haven't really read the patch, just responding to a couple points you
mentioned in the intro.

regards, tom lane




Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-01 Thread Michael Paquier
On Thu, Sep 1, 2016 at 5:13 PM, Magnus Hagander  wrote:
> We don't seem to check for similar issues as the one just found in the
> existing tests though, do we? As in, we don't actually verify that the xlog
> files being streamed are 16Mb? (Or for that matter that the tarfile emitted
> by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)

No. There are no checks on the WAL file size (you should use the output
of pg_controldata to see how large the segments should be). As for the
tar file, the complication is in untarring it... Perl provides some ways
to untar things, though the oldest version that we support in the TAP
tests does not offer that :(
-- 
Michael




Re: [HACKERS] new gcc warning

2016-09-01 Thread Tom Lane
Pavel Stehule  writes:
> I see a new warning in upstream

> r/include/libxml2   -c -o path.o path.c
> path.c: In function ‘has_drive_prefix’:
> path.c:89:26: warning: self-comparison always evaluates to false
> [-Wtautological-compare]
>   return skip_drive(path) != path;
>   ^~

That should have gone away in commit a2fd62dd5.  What version are
you looking at?

regards, tom lane




Re: [HACKERS] [WIP] Patches to enable extraction state of query execution from external session

2016-09-01 Thread Oleksandr Shulgin
On Tue, Aug 30, 2016 at 9:34 AM, Maksim Milyutin 
wrote:

> On Mon, Aug 29, 2016 at 5:22 PM, maksim wrote:
>>
>> Hi, hackers!
>>
>> Now I have completed an extension that provides a facility to see the
>> current state of query execution running in an external session, in the
>> form of EXPLAIN ANALYZE output. This extension works on version 9.5;
>> for 9.6 and later it doesn't support detailed statistics for parallel
>> nodes yet.
>>
>> I want to present patches to the latest version of PostgreSQL core
>> to enable this extension.
>>
>> Hello,
>>
>> Did you publish the extension itself yet?
>>
>>
> Hello, extension for version 9.5 is available in repository
> https://github.com/postgrespro/pg_query_state/tree/master.
>
> Last year (actually, exactly one year ago) I was trying to do something
>> very similar, and it quickly turned out that signals are not the best
>> way to make this sort of inspection.  You can find the discussion
>> here: https://www.postgresql.org/message-id/CACACo5Sz7G0MFauC082iM
>> =xx_hq7qq5ndr4jpo+h-o5vp6i...@mail.gmail.com
>>
>
> Thanks for link!
>
> My patch *custom_signal.patch* resolves the problem of «heavy» signal
> handlers. In essence, I follow the course offered in the *procsignal.c* file:
> *ProcSignalReason* values define the reasons onto which SIGUSR1 is
> multiplexed. Receipt of the signal only sets flags for *ProcessInterrupt*
> to act on, i.e. procsignal_sigusr1_handler() merely sets specific flags.
> When CHECK_FOR_INTERRUPTS is reached later during query execution,
> *ProcessInterrupt* is called, and only then is the user-defined signal
> handler executed. As a result, we have deferred signal handling.
>

Yes, but the problem is that nothing gives you the guarantee that at the
moment you decide to handle the interrupt, the QueryDesc structures you're
looking at are in a good shape for Explain* functions to run on them.  Even
if that appears to be the case in your testing now, there's no way to tell
if that will be the case at any future point in time.

Another problem is the use of the shm_mq facility, because it manipulates
the state of the process latch.  This is not supposed to happen to a backend
happily performing its tasks, at random due to external factors, and this is
what the proposed approach introduces.

--
Alex


Re: [HACKERS] PostgreSQL 10 kick-off

2016-09-01 Thread Tom Lane
Magnus Hagander  writes:
>> On Thu, Aug 4, 2016 at 12:09 AM, Fabrízio de Royes Mello wrote:
>>> If there are no complaints about my lack of experience in this field I
>>> would like to become the next CFM (am I the first brazilian??)

> Did we make a decision on this? Cf is starting now, are we appointing
> Fabrizio as the cf manager?

I didn't hear any other volunteers, so he's it.

regards, tom lane




Re: [HACKERS] Surprising behaviour of \set AUTOCOMMIT ON

2016-09-01 Thread Rahila Syed
Ok. Please find attached a patch which introduces a psql error when
autocommit is turned on inside a transaction. It also adds the relevant
documentation in psql-ref.sgml. Following is the output.

bash-4.2$ psql -d postgres
psql (10devel)
Type "help" for help.

postgres=# \set AUTOCOMMIT OFF
postgres=# create table test(i int);
CREATE TABLE
postgres=# \set AUTOCOMMIT ON
\set: Cannot set AUTOCOMMIT to ON inside a transaction, either COMMIT or
ROLLBACK and retry
postgres=#


Thank you,
Rahila Syed

On Wed, Aug 17, 2016 at 6:31 PM, Robert Haas  wrote:

> On Tue, Aug 16, 2016 at 5:25 PM, Rahila Syed 
> wrote:
> >>I think I like the option of having psql issue an error.  On the
> >>server side, the transaction would still be open, but the user would
> >>receive a psql error message and the autocommit setting would not be
> >>changed.  So the user could type COMMIT or ROLLBACK manually and then
> >>retry changing the value of the setting.
> >
> > Throwing psql error comes out to be most accepted outcome on this
> thread. I
> > agree it is safer than guessing user intention.
> >
> > Although according to the default behaviour of psql, error will abort the
> > current transaction and roll back all the previous commands.
>
> A server error would do that, but a psql error won't.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


psql_error_on_autocommit.patch
Description: application/download



Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Simon Riggs
On 1 September 2016 at 11:16, Robert Haas  wrote:
> On Wed, Aug 31, 2016 at 7:02 PM, Simon Riggs  wrote:
>> I'd prefer a solution that was not dependent upon RmgrID at all.
>>
>> If there are various special cases that we need to cater for, ISTM
>> they would be flaws in the existing WAL implementation rather than
>> anything we would want to perpetuate. I hope we'll spend time fixing
>> them rather than add loads of weird code to work around the
>> imperfections.
>>
>> Underdocumented special case code is going to be unbelievably
>> difficult to get right in the long term.
>
> It seems to me that you may be conflating the issue of which changes
> should be masked out as hints (which is, indeed, special case code,
> whether underdocumented or not) with the issue of which rmgrs the user
> may want to verify (which is just a case of matching the rmgr ID in
> the WAL record against a list provided by the user, and is not special
> case code at all).

Yep, it seems entirely likely that I am misunderstanding what is
happening here. I'd like to see an analysis/discussion before we write
code. As you might expect, I'm excited by this feature and the
discoveries it appears likely to bring.

We've got wal_log_hints and that causes lots of extra traffic. I'm
happy with assuming that is switched on in this case also. (Perhaps we
might have a wal_log_level with various levels of logging.)

So in my current understanding, a hinted change has by definition no
WAL record, so we just ship a FPW. A non-hint change has a WAL record
and it is my (possibly naive) hope that all changes to a page are
reflected in the WAL record/replay, so we can just make a simple
comparison without caring what is the rmgr of the WAL record.

If we can start by discussing which special cases we know about that
require extra code, that will help. We can then decide whether to fix
the WAL record/replay or fix the comparison logic, possibly on a case
by case basis. My current preference is to generate lots of small
fixes to existing WAL code and then have a very, very simple patch for
this actual feature, but am willing to discuss alternate approaches.

IMV this would be a feature certain users would want turned on all the
time for everything. So I'm not bothered much about making this
feature settable by rmgr. I might question why this particular feature
would be settable by rmgr, when features like wal_log_hints and
wal_compression are not, but such discussion is a minor point in
comparison to discussing the main feature.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] WAL consistency check facility

2016-09-01 Thread Robert Haas
On Wed, Aug 31, 2016 at 7:02 PM, Simon Riggs  wrote:
> I'd prefer a solution that was not dependent upon RmgrID at all.
>
> If there are various special cases that we need to cater for, ISTM
> they would be flaws in the existing WAL implementation rather than
> anything we would want to perpetuate. I hope we'll spend time fixing
> them rather than add loads of weird code to work around the
> imperfections.
>
> Underdocumented special case code is going to be unbelievably
> difficult to get right in the long term.

It seems to me that you may be conflating the issue of which changes
should be masked out as hints (which is, indeed, special case code,
whether underdocumented or not) with the issue of which rmgrs the user
may want to verify (which is just a case of matching the rmgr ID in
the WAL record against a list provided by the user, and is not special
case code at all).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PostgreSQL 10 kick-off

2016-09-01 Thread Magnus Hagander
On Aug 4, 2016 2:25 AM, "Michael Paquier"  wrote:
>
> On Thu, Aug 4, 2016 at 12:09 AM, Fabrízio de Royes Mello
>  wrote:
> > If there are no complaints about my lack of experience in this field, I
> > would like to become the next CFM (am I the first Brazilian??)
>
> That would be great. I am not sending more patches for the 1st CF, and
> as long as there are no issues for 9.6 to deal with, I am switching to
> patch review mode pretty soon, so I'll help out with things.

Did we make a decision on this? The CF is starting now; are we appointing
Fabrízio as the CF manager?

/Magnus


Re: [HACKERS] CREATE POLICY bug ?

2016-09-01 Thread Dean Rasheed
[Please reply to the list, not just to me, so that others can benefit
from and contribute to the discussion]

On 31 August 2016 at 11:52, Andrea Adami  wrote:
> Thanks Dean, I did further investigations:
> I set the owner of the view to "mana...@scuola247.it" with:
> ALTER TABLE public.policy_view OWNER TO "mana...@scuola247.it";
> and I expected the select:
> select * from policy_view
> to show rows 1, 2, 3.
> Then
> set role 'mana...@scuola247.it';
> select * from policy_view;
> returns rows 1, 2, 3 as expected, but:
> set role 'teac...@scuola247.it';
> select * from policy_view;
> returns rows 4, 5, and
> set role 'postgres'
> select * from policy_view
> returns nothing ...
> What do you think about this?
>
> Andrea

That's correct. With the table owned by postgres and the view owned by
"mana...@scuola247.it", access to the table via the view is subject to
the policies that apply to "mana...@scuola247.it". So regardless of
who the current user is, when selecting from the view, the policy
"standard" will be applied, and that will limit the visible rows to
those for which usr = current_user.

Regards,
Dean




Re: [HACKERS] Proposal for changes to recovery.conf API

2016-09-01 Thread Simon Riggs
On 31 August 2016 at 20:01, Abhijit Menon-Sen  wrote:

> Unfortunately, some parts conflict with the patch that Simon just posted
> (e.g., his patch removes trigger_file altogether, whereas mine converts
> it into a GUC along the lines of the original patch). Rather than trying
> to untangle that right now, I'm posting what I have as-is, and I'll post
> an updated version tomorrow.

Thanks.

archive_cleanup_command no longer needs to be in shmem. Checkpointer
will have its own copy of the value.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Proposal for changes to recovery.conf API

2016-09-01 Thread Simon Riggs
On 1 September 2016 at 06:34, Michael Paquier  wrote:
> On Thu, Sep 1, 2016 at 1:15 AM, Simon Riggs  wrote:
>> This is a summary of proposed changes to the recovery.conf API for
>> v10. These are based in part on earlier discussions, and represent a
>> minimal modification to current usage.
>>
>> Proposed changes (with reference to patches from Abhijit Menon-Sen and 
>> myself)
>>
>> * pg_ctl start -M (normal|recover|continue)
...
> That looks like a sound design.

Thanks

>> * Recovery parameters would now be part of the main postgresql.conf
>> infrastructure
>> Any parameters set in $DATADIR/recovery.conf will be read after the
>> main parameter file, similar to the way that postgresql.conf.auto is
>> read.
>> (Abhijit)
>> * pg_basebackup -R will continue to generate a parameter file called
>> recovery.conf as it does now, but will also create a file named
>> recovery.trigger.
>> (This part is WIP; patch doesn't include implementation for tar format yet)
>
> Or we could just throw away this dependency with recovery.conf,
> simply. I see no need to keep it if recovery is triggered depending on
> recovery.trigger, nor recommend its use. We could instead add
> include_if_exists 'recovery.conf' at the bottom of postgresql.conf and
> let the infrastructure do the rest to simplify the patch.

It works exactly the same as ALTER SYSTEM and adds only one small hunk of code.
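
The include mechanism Michael suggests would amount to a single hypothetical line at the bottom of postgresql.conf, for example:

```
# hypothetical: pick up recovery settings when the file is present
include_if_exists 'recovery.conf'
```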

>> * Parameters
>> All of the parameters formerly set in recovery.conf can be set in
>> postgresql.conf using RELOAD
>> These parameters will have no defaults in postgresql.conf.sample
>> Setting them has no effect during normal running, or once recovery ends.
>>  https://www.postgresql.org/docs/devel/static/archive-recovery-settings.html
>>  https://www.postgresql.org/docs/devel/static/recovery-target-settings.html
>>  https://www.postgresql.org/docs/devel/static/standby-settings.html
>> (Abhijit)
>
> Hm. I think that what would make sense here is a new GUC category,
> meaning that once recovery is launched the new parameters are not
> taken into account once again. Even updating primary_conninfo would
> need a restart of the WAL receiver, so we could just make them
> GUC_POSTMASTER and be done with it.

We definitely want most of them set at RELOAD, especially recovery targets.

Almost all of them will have no effect when normal mode starts, so I
don't see any other special handling needed.

>> Related cleanup
>> * Promotion signal file is now called "promote.trigger" rather than
>> just "promote"
>
> If that's not strictly necessary this renaming is not mandatory.

I think it makes sense to keep both files with same suffix, for clarity.

>> * Remove user configurable "trigger_file" mechanism - use
>> "promote.trigger" for all cases
>
> Ugh. I am -1 on that. There are likely backup tools and users that
> rely on this option, particularly to be able to trigger promotion
> using a file that is on a different partition than PGDATA.

OK, I wasn't thinking of that. Perhaps we should have a trigger
directory parameter?

>> * Remove Fallback promote mechanism, so all promotion is now "fast" in xlog.c
>
> No problem with that. Now others have surely other opinions. That
> could be addressed as a separate patch.

I'll post separate patch.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Add support for restrictive RLS policies

2016-09-01 Thread Thom Brown
On 1 September 2016 at 10:02, Robert Haas  wrote:
> On Thu, Sep 1, 2016 at 12:04 PM, Stephen Frost  wrote:
>> As outlined in the commit message, this adds support for restrictive RLS
>> policies.  We've had this in the backend since 9.5, but they were only
>> available via hooks and therefore extensions.  This adds support for
>> them to be configured through regular DDL commands.  These policies are,
>> essentially "AND"d instead of "OR"d.
>>
>> Includes updates to the catalog, grammar, psql, pg_dump, and regression
>> tests.  Documentation will be added soon, but until then, it would be great
>> to get feedback on the grammar, catalog and code changes.
>
> I don't like CREATE RESTRICT POLICY much.  It's not very good grammar,
> for one thing.  I think putting the word RESTRICT, or maybe AS
> RESTRICT, somewhere later in the command would be better.
>
> I also think that it is very strange to have the grammar keyword be
> "restrict" but the internal flag be called "permissive".  It would be
> better to have the sense of those flags match.
>
> (This is not intended as a full review, just a quick comment.)

I had proposed this sort of functionality a couple years back:
https://www.depesz.com/2014/10/02/waiting-for-9-5-row-level-security-policies-rls/#comment-187800

And I suggested CREATE RESTRICTIVE POLICY, but looking back at that,
perhaps you're right, and it would be better to add it later in the
command.

Thom




Re: [HACKERS] Patch: Write Amplification Reduction Method (WARM)

2016-09-01 Thread Pavan Deolasee
On Thu, Sep 1, 2016 at 1:33 AM, Bruce Momjian  wrote:

> On Wed, Aug 31, 2016 at 10:15:33PM +0530, Pavan Deolasee wrote:
> > Instead, what I would like to propose and the patch currently implements
> is to
> > restrict WARM update to once per chain. So the first non-HOT update to a
> tuple
> > or a HOT chain can be a WARM update. The chain can further be HOT
> updated any
> > number of times. But it can no further be WARM updated. This might look
> too
> > restrictive, but it can still bring down the number of regular updates by
> > almost 50%. Further, if we devise a strategy to convert a WARM chain
> back to
> > HOT chain, it can again be WARM updated. (This part is currently not
> > implemented). A good side effect of this simple strategy is that we know
> there
> > can maximum two index entries pointing to any given WARM chain.
>
> I like the simplified approach, as long as it doesn't block further
> improvements.
>
>
Yes, the proposed approach is simple yet does not stop us from improving
things further. Moreover it has shown good performance characteristics and
I believe it's a good first step.


>
> > Master:
> > tps = 1138.072117 (including connections establishing)
> >
> > WARM:
> > tps = 2016.812924 (including connections establishing)
>
> These are very impressive results.
>
>
Thanks. What's also interesting, and something the headline numbers don't
show, is that WARM TPS is as much as 3 times master's TPS when the
percentage of WARM updates is very high. Notice the spike in TPS in the
comparison graph.

Results with non-default heap fill factor are even better. In both cases,
the improvement in TPS stays constant over long periods.


>
> >
> > During first heap scan of VACUUM, we look for tuples with
> HEAP_WARM_TUPLE set.
> > If all live tuples in the chain are either marked with Blue flag or Red
> flag
> > (but no mix of Red and Blue), then the chain is a candidate for HOT
> conversion.
>
> Uh, if the chain is all blue, then there are no WARM entries, so it is
> already a HOT chain and there is nothing to do, right?
>

For aborted WARM updates, the heap chain may be all blue, but there may
still be a red index pointer which must be cleared before we allow further
WARM updates to the chain.


>
> > We remember the root line pointer and Red-Blue flag of the WARM chain in
> a
> > separate array.
> >
> > If we have a Red WARM chain, then our goal is to remove Blue pointers
> and vice
> > versa. But there is a catch. For Index2 above, there is only Blue pointer
> > and that must not be removed. IOW we should remove Blue pointer iff a Red
> > pointer exists. Since index vacuum may visit Red and Blue pointers in any
> > order, I think we will need another index pass to remove dead
> > index pointers. So in the first index pass we check which WARM
> candidates have
> > 2 index pointers. In the second pass, we remove the dead pointer and
> > reset the Red flag if the surviving index pointer is Red.
>
> Why not just remember the tid of chains converted from WARM to HOT, then
> use "amrecheck" on an index entry matching that tid to see if the index
> matches one of the entries in the chain.


That will require random access to heap during index vacuum phase,
something I would like to avoid. But we can have that as a fall back
solution for handling aborted vacuums.



> (It will match all of them or
> none of them, because they are all red.)  I don't see a point in
> coloring the index entries as reds as later you would have to convert to
> blue in the WARM-to-HOT conversion, and a vacuum crash could lead to
> inconsistencies.


Yes, that's a concern since the conversion of red to blue will also need to
be WAL-logged to ensure that a crash doesn't leave us in an inconsistent state. I
still think that this will be an overall improvement as compared to
allowing one WARM update per chain.


> Consider that you can just call "amrecheck" on the few
> chains that have converted from WARM to HOT.  I believe this is more
> crash-safe too.  However, if you have converted WARM to HOT in the heap,
> but crash during the index entry removal, you could potentially have
> duplicates in the index later, which is bad.
>
>
As you probably already noted, we clear heap flags only after all indexes
are cleared of duplicate entries and hence a crash in between should not
cause any correctness issue. As long as heap tuples are marked as warm,
amrecheck will ensure that only valid tuples are returned to the caller.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] pg_basebackup wish list

2016-09-01 Thread Simon Riggs
On 19 August 2016 at 08:46, Michael Paquier  wrote:
> On Fri, Aug 19, 2016 at 2:04 PM, Masahiko Sawada  
> wrote:
>> I agree with adding this as an option and removing directory by default.
>> And it looks good to me except for missing new line in usage output.
>>
>> printf(_("  -l, --label=LABEL  set backup label\n"));
>> +   printf(_("  -n, --noclean  do not clean up after errors"));
>> printf(_("  -P, --progress show progress information\n"));
>>
>> Registered this patch to CF1.
>
> +1 for the idea. Looking at the patch it is taking a sane approach.

Apart from this one-liner change, we look good to go.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Add support for restrictive RLS policies

2016-09-01 Thread Robert Haas
On Thu, Sep 1, 2016 at 12:04 PM, Stephen Frost  wrote:
> As outlined in the commit message, this adds support for restrictive RLS
> policies.  We've had this in the backend since 9.5, but they were only
> available via hooks and therefore extensions.  This adds support for
> them to be configured through regular DDL commands.  These policies are,
> essentially "AND"d instead of "OR"d.
>
> Includes updates to the catalog, grammar, psql, pg_dump, and regression
> tests.  Documentation will be added soon, but until then, it would be great
> to get feedback on the grammar, catalog and code changes.

I don't like CREATE RESTRICT POLICY much.  It's not very good grammar,
for one thing.  I think putting the word RESTRICT, or maybe AS
RESTRICT, somewhere later in the command would be better.

I also think that it is very strange to have the grammar keyword be
"restrict" but the internal flag be called "permissive".  It would be
better to have the sense of those flags match.

(This is not intended as a full review, just a quick comment.)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] GiST penalty functions [PoC]

2016-09-01 Thread Andrew Borodin
Hi hackers!

Here is the new patch version.
With the help of Mikhail Bakhterev I've optimized subroutines of the
penalty function. Index build time now is roughly equivalent to build
time before patch (test attached to thread start).
Time of SELECT statement execution is reduced by 40%.
Changes in the patch:
1. Wrong usage of realms is fixed
2. Cube size and margin (edge) functions are optimized to reduce the
count of memory write instructions (results of these functions were
written on every cycle of a loop)
3. All private functions are marked as static inline
4. Comments are formatted per project style

I'm going to put this to commitfest queue, because performance of gist
queries is improved significantly and I do not see any serious
drawbacks.

Any ideas about this patch are welcome. Especially, I'm concerned about
the portability of the pack_float function.
Does every supported Postgres platform conform to the IEEE 754
floating-point specification?

Also, I'm not sure whether float NaNs could cause any problems during
packing.

Best regards, Andrey Borodin, Octonica & Ural Federal University.
diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index 3feddef..dec314f 100644
--- a/contrib/cube/cube.c
+++ b/contrib/cube/cube.c
@@ -91,20 +91,22 @@ PG_FUNCTION_INFO_V1(cube_enlarge);
 /*
 ** For internal use only
 */
-int32  cube_cmp_v0(NDBOX *a, NDBOX *b);
-bool   cube_contains_v0(NDBOX *a, NDBOX *b);
-bool   cube_overlap_v0(NDBOX *a, NDBOX *b);
-NDBOX *cube_union_v0(NDBOX *a, NDBOX *b);
-void   rt_cube_size(NDBOX *a, double *sz);
-NDBOX *g_cube_binary_union(NDBOX *r1, NDBOX *r2, int *sizep);
-bool   g_cube_leaf_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
-bool   g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
+static inline int32    cube_cmp_v0(NDBOX *a, NDBOX *b);
+static inline bool cube_contains_v0(NDBOX *a, NDBOX *b);
+static inline bool cube_overlap_v0(NDBOX *a, NDBOX *b);
+static inline NDBOX   *cube_union_v0(NDBOX *a, NDBOX *b);
+static inline float    pack_float(float actualValue, int realm);
+static inline void rt_cube_size(NDBOX *a, double *sz);
+static inline void rt_cube_edge(NDBOX *a, double *sz);
+static inline NDBOX   *g_cube_binary_union(NDBOX *r1, NDBOX *r2, int *sizep);
+static inline bool     g_cube_leaf_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
+static inline bool     g_cube_internal_consistent(NDBOX *key, NDBOX *query, StrategyNumber strategy);
 
 /*
 ** Auxiliary funxtions
 */
-static double distance_1D(double a1, double a2, double b1, double b2);
-static bool cube_is_point_internal(NDBOX *cube);
+static inline double distance_1D(double a1, double a2, double b1, double b2);
+static inline bool cube_is_point_internal(NDBOX *cube);
 
 
 /*
@@ -420,6 +422,15 @@ g_cube_decompress(PG_FUNCTION_ARGS)
PG_RETURN_POINTER(entry);
 }
 
+static inline float
+pack_float(float actualValue, int realm)
+{
+   /* two bits for realm, other for value  */
+   /* we have 4 realms */
+   int realmAjustment = *((int*)&actualValue)/4;
+   int realCode = realm * (INT32_MAX/4) + realmAjustment;
+   return *((float*)&realCode);
+}
 
 /*
 ** The GiST Penalty method for boxes
@@ -428,6 +439,11 @@ g_cube_decompress(PG_FUNCTION_ARGS)
 Datum
 g_cube_penalty(PG_FUNCTION_ARGS)
 {
+   /* REALM 0: No extension is required, volume is zero, return edge */
+   /* REALM 1: No extension is required, return nonzero volume */
+   /* REALM 2: Volume extension is zero, return nonzero edge extension */
+   /* REALM 3: Volume extension is nonzero, return it */
+
GISTENTRY  *origentry = (GISTENTRY *) PG_GETARG_POINTER(0);
GISTENTRY  *newentry = (GISTENTRY *) PG_GETARG_POINTER(1);
float  *result = (float *) PG_GETARG_POINTER(2);
@@ -441,9 +457,33 @@ g_cube_penalty(PG_FUNCTION_ARGS)
rt_cube_size(DatumGetNDBOX(origentry->key), &tmp2);
*result = (float) (tmp1 - tmp2);
 
-   /*
-* fprintf(stderr, "penalty\n"); fprintf(stderr, "\t%g\n", *result);
-*/
+   if( *result == 0 )
+   {
+   double tmp3 = tmp1; /* remember entry volume */
+   rt_cube_edge(ud, &tmp1);
+   rt_cube_edge(DatumGetNDBOX(origentry->key), &tmp2);
+   *result = (float) (tmp1 - tmp2);
+   if( *result == 0 )
+   {
+   if( tmp3 != 0 )
+   {
+   *result = pack_float(tmp3, 1); /* REALM 1 */
+   }
+   else
+   {
+   *result = pack_float(tmp1, 0); /* REALM 0 */
+   

[HACKERS] new gcc warning

2016-09-01 Thread Pavel Stehule
Hi

I see a new warning in upstream

r/include/libxml2   -c -o path.o path.c
path.c: In function ‘has_drive_prefix’:
path.c:89:26: warning: self-comparison always evaluates to false
[-Wtautological-compare]
  return skip_drive(path) != path;
  ^~
Regards

[pavel@nemesis postgresql]$ gcc --version
gcc (GCC) 6.1.1 20160621 (Red Hat 6.1.1-3)
Copyright (C) 2016 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Regards

Pavel


Re: [HACKERS] Declarative partitioning - another take

2016-09-01 Thread Ashutosh Bapat
>
> > I don't think you need to do anything in the path creation code for this.
> > As is it flattens all AppendPath hierarchies whether for partitioning or
> > inheritance or subqueries. We should leave it as it is.
>
> I thought it would be convenient for pairwise join code to work with the
> hierarchy intact even within the AppendPath tree.  If it turns out to be
> so, maybe that patch can take care of it.
>

Partition-wise join works with RelOptInfos, so it's fine if the AppendPath
hierarchy is flattened out. We need the RelOptInfo hierarchy though.


>
> >> I think I can manage to squeeze in (a) in the next version patch and
> will
> >> also start working on (b), mainly the part about RelOptInfo getting some
> >> partitioning info.
> >
> > I am fine with b, where you would include some partitioning information
> in
> > RelOptInfo. But you don't need to do what you said in (b) above.
> >
> > In a private conversation Robert Haas suggested a way slightly different
> > than what my patch for partition-wise join does. He suggested that the
> > partitioning schemes, i.e. strategy, number of partitions and bounds of the
> > partitioned relations involved in the query, should be stored in
> PlannerInfo
> > in the form of a list. Each partitioning scheme is annotated with the
> > relids of the partitioned relations. RelOptInfo of the partitioned
> relation
> > will point to the partitioning scheme in PlannerInfo. Along-with that
> each
> > RelOptInfo will need to store partition keys for corresponding relation.
> > This simplifies matching the partitioning schemes of the joining
> relations.
> > Also it reduces the number of copies of partition bounds floating around
> as
> > we expect that a query will involve multiple partitioned tables following
> > similar partitioning schemes. May be you want to consider this idea while
> > working on (b).
>
> So IIUC, a partitioned relation's (baserel or joinrel) RelOptInfo has only
> the information about partition keys.  They will be matched with query
> restriction quals pruning away any unneeded partitions which happens
> individually for each such parent baserel (within set_append_rel_size() I
> suppose).  Further, two joining relations are eligible to be considered
> for pairwise joining if they have identical partition keys and query
> equi-join quals match the same.  The resulting joinrel will have the same
> partition key (as either joining relation) and will have as many
> partitions as there are in the intersection of sets of partitions of
> joining rels (intersection proceeds by matching partition bounds).
>
> "Partition scheme" structs go into a PlannerInfo list member, one
> corresponding to each partitioned relation - baserel or joinrel, right?
>

Multiple relations (base or join) can share a partition scheme if they are
partitioned the same way. Each partition scheme also stores the relids of
the base relations partitioned by that scheme.


> As you say, each such struct has the following pieces of information:
> strategy, num_partitions, bounds (and other auxiliary info).  Before
> make_one_rel() starts, the list has one for each partitioned baserel.
>
> After make_one_rel() has formed baserel pathlists and before
> make_rel_from_joinlist() is called, are the partition scheme structs of
> processed baserels marked with some information about the pruning activity
> that occurred so far?


Right now pruned partitions are labelled as dummy rels (empty append
paths). That's enough to detect a pruned partition. I haven't found a need
to label the partitioning scheme with pruned partitions for partition-wise join.


> Then as we build successively higher levels of
> joinrels, new entries will be made for those joinrels for which we added
> pairwise join paths, with relids matching the corresponding joinrels.
> Does that make sense?
>
>
I don't think we will make any new partition scheme entry in PlannerInfo
after all the base relations have been considered. Partition-wise join
will pick the one suitable for the given join. But in case partition-wise
join needs to make new entries, I will take care of that in my patch.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-01 Thread Magnus Hagander
On Thu, Sep 1, 2016 at 9:43 AM, Michael Paquier 
wrote:

> On Thu, Sep 1, 2016 at 4:41 PM, Magnus Hagander 
> wrote:
> > That's definitely not intended - it's supposed to be 16Mb. And it
> actually
> > writes 16Mb to the tarfile, it's the extraction that doesn't see them.
> That
> > also means that if you get more than one member of the tarfile at this
> > point, it will be corrupt. (It's not corrupt in the .tar.gz case,
> clearly my
> > testing of the very last iteration of the patch forgot to doublecheck
> this
> > with both).
> >
> > Oops. Will fix.
>
> If at the same time you could add some tests in pg_basebackup/t, that
> would be great :)
>

That's definitely on my slightly-longer-term plan. But I've successfully
managed to avoid perl long enough now that looking at the code in those
tests is mighty confusing :) So I need a bunch of readup before I can
figure that out. (yes, that means I've managed to avoid our own discussions
about that style tests on this list quite successfully too :P)

We don't seem to check for similar issues as the one just found in the
existing tests though, do we? As in, we don't actually verify that the xlog
files being streamed are 16Mb? (Or for that matter that the tarfile emitted
by -Ft is actually a tarfile?) Or am I missing some magic somewhere? :)

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-01 Thread Michael Paquier
On Thu, Sep 1, 2016 at 4:41 PM, Magnus Hagander  wrote:
> That's definitely not intended - it's supposed to be 16Mb. And it actually
> writes 16Mb to the tarfile, it's the extraction that doesn't see them. That
> also means that if you get more than one member of the tarfile at this
> point, it will be corrupt. (It's not corrupt in the .tar.gz case, clearly my
> testing of the very last iteration of the patch forgot to doublecheck this
> with both).
>
> Oops. Will fix.

If at the same time you could add some tests in pg_basebackup/t, that
would be great :)
-- 
Michael




Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-01 Thread Magnus Hagander
On Thu, Sep 1, 2016 at 9:19 AM, Michael Paquier 
wrote:

> On Thu, Sep 1, 2016 at 6:58 AM, Magnus Hagander 
> wrote:
> > Attached patch adds support for -X stream to work with .tar and .tar.gz
> file
> > formats.
>
> Nice.
>
> > If tar mode is specified, a separate pg_xlog.tar (or .tar.gz) file is
> > created and the data is streamed into it. Regular mode is (should not)
> see
> > any changes in how it works.
>
> Could you use XLOGDIR from xlog_internal.h at least?
>

Yes, that makes sense.


> > The implementation creates a "walmethod" for directory and one for tar,
> > which is basically a set of function pointers that we pass around as
> part of
> > the StreamCtl structure. All calls to modify the files are sent through
> the
> > current method, using the normal open/read/write calls as it is now for
> > directories, and the more complicated method for tar and targz.
>
> That looks pretty cool by looking at the code.
>
> > The tar method doesn't support all things that are required for
> > pg_receivexlog, but I don't think it makes any real sense to have support
> > for pg_receivexlog in tar mode. But it does support all the things that
> > pg_basebackup needs.
>
> Agreed. Your patch is enough complicated.
>
> > Some smaller pieces of functionality like unlinking files on failure and
> > padding files have been moved into the walmethod because they have to be
> > differently implemented (we cannot pre-pad a compressed file -- the size
> > will depend on the compression ration anyway -- for example).
> >
> > AFAICT we never actually documented that -X stream doesn't work with tar
> in
> > the manpage of current versions. Only in the error message. We might
> want to
> > fix that in backbranches.
>
> +1 for documenting that in back-branches.
>
> > In passing this also fixes an XXX comment about not re-lseeking on the
> WAL
> > file all the time -- the walmethod now tracks the current position in the
> > file in a variable.
> >
> > Finally, to make this work, the print_tar_number() function is now
> exported
> > from port/tar.c along with the other ones already exported from there.
>
> walmethods.c:387:9: warning: comparison of unsigned expression < 0 is
> always false [-Wtautological-compare]
> if (r < 0)
> This patch is generating a warning for me with clang.
>
> I have just tested this feature:
> $ pg_basebackup -X stream -D data -F t
> Which generates that:
> $ ls data
> base.tar pg_xlog.tar
> However after decompressing pg_xlog.tar the segments don't have a correct
> size:
> $ ls -l 0*
> -rw---  1 mpaquier  _guest   3.9M Sep  1 16:12 00010011
>
> Even if that's filling them with zeros during pg_basebackup when a
> segment is done, those should be 16MB to allow users to reuse them
> directly.
>

Huh, that's annoying. I must've broken that when I fixed padding for
compressed files. It forgets the padding when it updates the size of the
tarfile (works fine for compressed files because padding is done at the
end).

That's definitely not intended - it's supposed to be 16MB. And it actually
writes 16MB to the tarfile; it's the extraction that doesn't see them. That
also means that if you get more than one member in the tarfile at this
point, it will be corrupt. (It's not corrupt in the .tar.gz case; clearly
I forgot to double-check both formats when testing the very last iteration
of the patch.)

Oops. Will fix.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Parallel build with MSVC

2016-09-01 Thread Michael Paquier
On Thu, Apr 28, 2016 at 4:16 PM, Christian Ullrich  wrote:
> * Michael Paquier wrote:
>> On Wed, Apr 27, 2016 at 5:15 PM, Christian Ullrich 
>> wrote:
>>> OK then, hopefully last round attached.
>>
>> Thanks. Those are fine in my view. It is hard to tell if a committer
>> is going to have a look at that soon, so I think that it would be
>> wiser to add that to the next CF so as those patches don't fall into
>> oblivion.
>
> Done.

As far as I can see, those patches are still fine, so I switched the
patch as 'ready for committer'.
-- 
Michael




Re: [HACKERS] Proposal for changes to recovery.conf API

2016-09-01 Thread Abhijit Menon-Sen
At 2016-09-01 15:51:11 +0900, michael.paqu...@gmail.com wrote:
>
> -   (errmsg("starting point-in-time recovery to XID %u",
> -   recoveryTargetXid)));
> User loses information if those logs are removed.

Agreed. I'm revising the patch right now, and I decided to leave them.
I'll consider and comment on the remainder of your points after I've
posted an update. Thanks for reading.

-- Abhijit




Re: [HACKERS] pg_basebackup stream xlog to tar

2016-09-01 Thread Michael Paquier
On Thu, Sep 1, 2016 at 6:58 AM, Magnus Hagander  wrote:
> Attached patch adds support for -X stream to work with .tar and .tar.gz file
> formats.

Nice.

> If tar mode is specified, a separate pg_xlog.tar (or .tar.gz) file is
> created and the data is streamed into it. Regular mode does not (should not)
> see any changes in how it works.

Could you use XLOGDIR from xlog_internal.h at least?

> The implementation creates a "walmethod" for directory and one for tar,
> which is basically a set of function pointers that we pass around as part of
> the StreamCtl structure. All calls to modify the files are sent through the
> current method, using the normal open/read/write calls as it is now for
> directories, and the more complicated method for tar and targz.

That looks pretty cool by looking at the code.

> The tar method doesn't support all things that are required for
> pg_receivexlog, but I don't think it makes any real sense to have support
> for pg_receivexlog in tar mode. But it does support all the things that
> pg_basebackup needs.

Agreed. Your patch is complicated enough already.

> Some smaller pieces of functionality like unlinking files on failure and
> padding files have been moved into the walmethod because they have to be
> differently implemented (we cannot pre-pad a compressed file -- the size
> will depend on the compression ratio anyway -- for example).
>
> AFAICT we never actually documented that -X stream doesn't work with tar in
> the manpage of current versions. Only in the error message. We might want to
> fix that in backbranches.

+1 for documenting that in back-branches.

> In passing this also fixes an XXX comment about not re-lseeking on the WAL
> file all the time -- the walmethod now tracks the current position in the
> file in a variable.
>
> Finally, to make this work, the print_tar_number() function is now exported
> from port/tar.c along with the other ones already exported from there.

walmethods.c:387:9: warning: comparison of unsigned expression < 0 is
always false [-Wtautological-compare]
if (r < 0)
This patch is generating a warning for me with clang.

I have just tested this feature:
$ pg_basebackup -X stream -D data -F t
Which generates that:
$ ls data
base.tar pg_xlog.tar
However after decompressing pg_xlog.tar the segments don't have a correct size:
$ ls -l 0*
-rw-------  1 mpaquier  _guest   3.9M Sep  1 16:12 000000010000000000000011

Even if that's filling them with zeros during pg_basebackup when a
segment is done, those should be 16MB to allow users to reuse them
directly.
-- 
Michael




Re: [HACKERS] Comment on GatherPath.single_copy

2016-09-01 Thread Kyotaro HORIGUCHI
At Wed, 31 Aug 2016 07:26:22 -0400, Tom Lane  wrote in 
<5934.1472642...@sss.pgh.pa.us>
> Robert Haas  writes:
> > On Tue, Aug 30, 2016 at 6:38 PM, Tom Lane  wrote:
> >> Robert, could you fix the documentation for that field so it's
> >> intelligible?
> 
> > Uh, maybe.  The trick, as you've already noted, is finding something
> > better.  Maybe this:
> 
> > -boolsingle_copy;/* path must not be executed >1x */
> > +boolsingle_copy;/* don't execute path in multiple 
> > processes */
> 
> OK by me.
> 
>   regards, tom lane

Me too, thanks.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center






  1   2   >