Re: [HACKERS] pg_rewind just doesn't fsync *anything*?

2016-03-09 Thread Michael Paquier
On Thu, Mar 10, 2016 at 8:37 AM, Abhijit Menon-Sen  wrote:
> At 2016-03-10 08:35:43 +0100, michael.paqu...@gmail.com wrote:
>>
>> > I guess the easiest fix would be to shell out to initdb -s?
>>
>> What do you mean? I am not sure what initdb has to do with that as we
>> have no need for it in pg_rewind.
>
> initdb -S/--sync-only fsyncs everything in the data directory and exits.

I missed your point; good to know that initdb -S does nothing other than
fsync everything in PGDATA. Still, I think that we had better fsync only
the entries that pg_rewind modifies and the files that got updated, not
the whole directory: a target data folder must be stopped properly to be
able to rewind, and it is better to avoid dependencies between utilities
if that's not strictly necessary. initdb is likely to be installed
side-by-side with pg_rewind in any distribution, though.
-- 
Michael




Re: [HACKERS] pg_rewind just doesn't fsync *anything*?

2016-03-09 Thread Abhijit Menon-Sen
At 2016-03-10 08:35:43 +0100, michael.paqu...@gmail.com wrote:
>
> > I guess the easiest fix would be to shell out to initdb -s?
> 
> What do you mean? I am not sure what initdb has to do with that as we
> have no need for it in pg_rewind.

initdb -S/--sync-only fsyncs everything in the data directory and exits.

-- Abhijit




Re: [HACKERS] pg_rewind just doesn't fsync *anything*?

2016-03-09 Thread Michael Paquier
On Thu, Mar 10, 2016 at 4:43 AM, Andres Freund  wrote:
> how come that the only comment in pg_rewind about fsyncing is '
> void
> close_target_file(void)
> {
> ...
> /* fsync? */
> }
>
> Isn't that a bit, uh, minimal for a utility that's likely to be used in
> failover scenarios?
>
> I think we might actually be "saved" due to
> http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2ce439f33
> because pg_rewind appears to leave the cluster in
>
> ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
updateControlFile(&ControlFile_new);

Yep, with minimum recovery target changed as well up to the LSN where
pg_rewind began.

> a state that StartupXLOG will treat as needing recovery:
>
> if (ControlFile->state != DB_SHUTDOWNED &&
> ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
> SyncDataDirectory();
>
> but that code went in after pg_rewind, so this certainly can't be an
> intentional save.
> I also don't think it's ok that you need to start the cluster to make it
> safe against a crash?

No, that's not acceptable. One is not obliged to use the rewound data
folder immediately, and my first intuition is that we had better guarantee
that each entry in the file map is made consistent on disk immediately
after we are done operating on it. If we get a power loss after pg_rewind
is done, we may lose data silently.
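
For illustration, the per-file pattern I have in mind looks roughly like
this (a minimal POSIX sketch, not pg_rewind's actual code; error handling
omitted):

#include <fcntl.h>
#include <libgen.h>
#include <string.h>
#include <unistd.h>

/* Flush one rewritten file, then its containing directory, so that both
 * the contents and the directory entry survive a power loss. */
static void
write_durably(const char *path, const void *buf, size_t len)
{
	char	dirbuf[1024];
	int		fd;

	fd = open(path, O_WRONLY | O_CREAT, 0600);
	write(fd, buf, len);
	fsync(fd);					/* contents reach stable storage */
	close(fd);

	/* For a newly created file, the parent directory needs a sync too. */
	strncpy(dirbuf, path, sizeof(dirbuf) - 1);
	dirbuf[sizeof(dirbuf) - 1] = '\0';
	fd = open(dirname(dirbuf), O_RDONLY);
	fsync(fd);
	close(fd);
}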

> I guess the easiest fix would be to shell out to initdb -s?

What do you mean? I am not sure what initdb has to do with that as we
have no need for it in pg_rewind.
-- 
Michael




Re: [HACKERS] POC: Cache data in GetSnapshotData()

2016-03-09 Thread Mithun Cy
On Thu, Mar 3, 2016 at 11:50 PM, Robert Haas wrote:
> What if you apply both this and Amit's clog control log patch(es)?  Maybe
> the combination of the two helps substantially more than either one alone.


I did the above tests along with Amit's clog patch. Machine: 8 sockets, 64
cores, 128 hyperthreads.

clients  BASE           ONLY CLOG      % Increase      ONLY SAVE      % Increase     CLOG + SAVE    % Increase
                        CHANGES                        SNAPSHOT                      SNAPSHOT
64       29247.658034   30855.728835    5.4981181711   29254.532186   0.0235032562   32691.832776   11.7758992463
88       31214.305391   33313.393095    6.7247618606   32109.248609   2.8670931702   35433.655203   13.5173592978
128      30896.673949   34015.362152   10.0939285832   ***            ***            34947.296355   13.110221549
256      27183.780921   31192.895437   14.7481857938   ***            ***            32873.782735   20.9316056164
With the clog changes, the performance of the snapshot-caching patch
improves further.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread pokurev
Hi Amit,

Thank you for updating the patch.

> -Original Message-
> From: Amit Langote [mailto:langote_amit...@lab.ntt.co.jp]
> Sent: Thursday, March 10, 2016 3:36 PM
> To: Robert Haas 
> Cc: Kyotaro HORIGUCHI ; Amit Langote
> ; SPS ポクレ ヴィナヤック(三技術)
> ; pgsql-hackers@postgresql.org; SPS 坂野 昌
> 平(三技術) 
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On 2016/03/10 14:29, Amit Langote wrote:
> > I rebased remainder patches (attached).
> >
> > 0001 is a small patch to fix issues reported by Tomas and Vinayak.
> > 0002 and 0003 are WIP patches to implement progress reporting for
> > vacuum.
> 
> Oops, in 0002, I wrongly joined with pg_class in the definition of
> pg_stat_progress_vacuum to output the schema-qualified name of the table
> being vacuumed.  That means we need to connect to the correct database,
> which is undesirable. Updated version fixes that (shows database name and
> relid).  You may also have noticed that I said pg_stat_progress_vacuum, not
> pg_stat_vacuum_progress (IMHO, the former is a better name).
> 
> Updated patches attached.
In 0002:
+CREATE VIEW pg_stat_progress_vacuum AS
+SELECT
+S.pid AS pid,
+D.datname AS database,
+S.relid AS relid,
.
.
.
.
+FROM pg_database D, pg_stat_get_progress_info('VACUUM') AS S
+WHERE S.datid = D.oid;
I think we need to use datid instead of datname here.
Robert added datid to pg_stat_get_progress_info(), and we are using that
function here.
+values[1] = ObjectIdGetDatum(beentry->st_databaseid);

+DATA(insert OID = 3318 (  pg_stat_get_progress_info   PGNSP PGUID 12 1 
100 0 0 f f f f f t s r 1 0 2249 "25" 
"{25,23,26,26,20,20,20,20,20,20,20,20,20,20}" "{i,o,o,o,o,o,o,o,o,o,o,o,o,o}" 
"{cmdtype,pid,datid,relid,param1,param2,param3,param4,param5,param6,param7,param8,param9,param10}"
 _null_ _null_ pg_stat_get_progress_info _null_ _null_ _null_ ));

So I think it's better to report datid, not datname.
The view definition then simply becomes:
+CREATE VIEW pg_stat_progress_vacuum AS
+SELECT
+S.pid AS pid,
+S.datid AS datid,
+S.relid AS relid,
+CASE S.param1
+WHEN 1 THEN 'scanning heap'
+WHEN 2 THEN 'vacuuming indexes'
+WHEN 3 THEN 'vacuuming heap'
+WHEN 4 THEN 'cleanup'
+ELSE 'unknown phase'
+END AS processing_phase,
+S.param2 AS total_heap_blocks,
+S.param3 AS current_heap_block,
+S.param4 AS total_index_blocks,
+S.param5 AS index_blocks_done,
+S.param6 AS index_vacuum_count,
+CASE S.param2
+WHEN 0 THEN round(100.0, 2)
+   ELSE round((S.param3 + 1) * 100.0 / S.param2, 2)
+END AS percent_done
+FROM pg_stat_get_progress_info('VACUUM') AS S;
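
For example, with total_heap_blocks = 1000 and current_heap_block = 499,
percent_done comes out as round((499 + 1) * 100.0 / 1000, 2) = 50.00; the
CASE on param2 simply avoids a division by zero for empty tables.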

In the pg_stat_activity view, datid and datname are separate columns. So
maybe we could add datname as a separate column in pg_stat_progress_vacuum
too, but I think it's not required; datid alone is sufficient.
Any comments?

Regards,
Vinayak Pokale



Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Michael Paquier
On Thu, Mar 10, 2016 at 12:12 AM, Alvaro Herrera wrote:
> Aleksander Alekseev wrote:
>> pg_receivexlog: could not send replication command "START_REPLICATION":
>> out of memory pg_receivexlog: disconnected; waiting 5 seconds to try
>> again pg_receivexlog: starting log streaming at 0/100 (timeline 1)
>>
>> Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
>> msgLength=3) at fe-protocol3.c:1398 1398  const char
>> *errmsg = NULL;
>> ```
>>
>> Granted this behaviour is a bit better then the current one. But
>> basically it's the same infinite loop only with pauses and warnings. I
>> wonder if this is a behaviour we really want. For instance wouldn't it
>> be better just to terminate an application in out-of-memory case? "Let
>> it crash" as Erlang programmers say.
>
> Hmm.  It would be useful to retry in the case that there is a chance
> that the program releases memory and can continue later.  But if it will
> only stay there doing nothing other than retrying, then that obviously
> will not happen.  One situation where this might help is if the overall
> *system* is short on memory and we expect that situation to resolve
> itself after a while -- after all, if the system is so loaded that it
> can't allocate a few more bytes for the COPY message, then odds are that
> other things are also crashing and eventually enough memory will be
> released that pg_receivexlog can continue.

Yep, that's my assumption: at some point the system may succeed, and I
don't think that we should break the current behavior of pg_receivexlog
and pg_recvlogical in the back-branches. Now, note that without the patch
we actually have the same problem. Say OOMs happen continuously in
getCopyStart: with COPY_BOTH, libpq would attempt to read the next message
continuously and would keep failing, except that in this case the caller
has no idea what is happening, as things keep running inside libpq itself.

> On the other hand, if the system is so loaded, perhaps it's better to
> "let it crash" and have it restart later -- presumably once the admin
> notices the problem and restarts it manually after cleaning up the mess.
>
> If all programs are well behaved and nothing crashes when OOM but they
> all retry instead, then everything will continue to retry infinitely and
> make no progress.  That cannot be good.

That's something we could take care of in those client utilities, I
think, with a new option like --maximum-retries or similar, but anyway I
think that's a different discussion. The patch I am proposing here allows
a client application to be made aware of the OOM errors that happen. If we
don't do something about that first, something like --maximum-retries
would be useless for COPY_BOTH, as the client would never be made aware of
the OOM that happened in libpq and would keep looping inside libpq itself
until some memory is freed.
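
To sketch the --maximum-retries idea (the option name, stream_once() and
STREAM_OOM are made-up names for illustration, not the actual
pg_receivexlog code):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

typedef enum { STREAM_OK, STREAM_OOM } StreamResult;

extern StreamResult stream_once(void);	/* one hypothetical streaming attempt */

static void
stream_with_retries(int max_retries)	/* -1 means retry forever */
{
	int		retries = 0;

	for (;;)
	{
		if (stream_once() != STREAM_OOM)
			break;				/* done, or a non-OOM failure */
		if (max_retries >= 0 && ++retries > max_retries)
		{
			fprintf(stderr, "giving up after %d OOM retries\n", max_retries);
			exit(1);
		}
		sleep(5);				/* wait before retrying, as the tools do today */
	}
}
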
-- 
Michael




Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-09 Thread Kouhei Kaigai
> -Original Message-
> From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Petr Jelinek
> Sent: Thursday, March 10, 2016 11:01 AM
> To: Kaigai Kouhei(海外 浩平); pgsql-hackers@postgresql.org
> Subject: Re: [HACKERS] Reworks of CustomScan serialization/deserialization
> 
> On 10/03/16 02:18, Kouhei Kaigai wrote:
> >
> >> I am not sure I like the fact that we have this EXTNODENAME_MAX_LEN and
> >> now the CUSTOM_NAME_MAX_LEN with the same length, and also they are both
> >> the same length as NAMEDATALEN; I wonder if this shouldn't be somehow
> >> squished into fewer defines.
> >>
> > Hmm. I just followed the manner in extensible.c, because this label was
> > initially NAMEDATALEN, then Robert changed it with EXTNODENAME_MAX_LEN.
> > I guess he avoid to apply same label on different entities - NAMEDATALEN
> > is a limitation for NameData type, but identifier of extensible-node and
> > custom-scan node are not restricted by.
> >
> 
> Makes sense.
> 
> >> Also in RegisterCustomScanMethods
> >> +  Assert(strlen(methods->CustomName) <= CUSTOM_NAME_MAX_LEN);
> >>
> >> Shouldn't this be actually "if" with ereport() considering this is
> >> public API and extensions can pass anything there? (for that matter same
> >> is true for RegisterExtensibleNodeMethods but that's already committed).
> >>
> > Hmm. I don't have a clear answer as to which is better. The reason why
> > I put Assert() here is that only C-binary extensions use this interface,
> > thus the author will fix the problem of a too-long name prior to release.
> > Of course, if-with-ereport() also informs the extension author that the
> > name is too long.
> > One downside of Assert() is that the problem goes unnoticed if
> > --enable-cassert was not specified.
> >
> 
> Well that's exactly my problem, this should IMHO throw an error even
> without --enable-cassert. It's not like it's some performance-sensitive
> API where it would be a big problem; ensuring correctness of the input is
> more important here IMHO.
>
We may need to fix up RegisterExtensibleNodeMethods() first.

Also, the length limitation is (EXTNODENAME_MAX_LEN - 1) because the last
byte is consumed by the '\0' terminator. In fact, the hash, match and
keycopy functions of HTAB for string keys deal with only the first
(keysize - 1) bytes, so strlen(extnodename) == EXTNODENAME_MAX_LEN is not
legal.
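
So the check would look something like this (a sketch of the
if-with-ereport() variant; the errcode is my choice for illustration, not
settled):

#include "postgres.h"

/* Validate that a registered name fits the fixed-size hash key,
 * reserving the last byte for the terminating '\0'. */
static void
check_node_name(const char *name, Size maxlen)
{
	if (name == NULL || strlen(name) >= maxlen)
		ereport(ERROR,
				(errcode(ERRCODE_NAME_TOO_LONG),
				 errmsg("node name is too long (maximum %d characters)",
						(int) (maxlen - 1))));
}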

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 



pgsql-v9.6-extensible-namelen-check-by-ereport.patch
Description: pgsql-v9.6-extensible-namelen-check-by-ereport.patch



Re: [HACKERS] Add generate_series(date,date) and generate_series(date,date,integer)

2016-03-09 Thread Michael Paquier
On Wed, Mar 9, 2016 at 12:13 AM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> I'm pretty meh about the whole idea of this function, though,
>> actually, and I don't see a single clear +1 vote for this
>> functionality upthread.  (Apologies if I've missed one.)  In the
>> absence of a few of those, I recommend we reject this.
>
> +1

I'm meh for this patch.
-- 
Michael




Re: [HACKERS] [GENERAL] Request - repeat value of \pset title during \watch iterations

2016-03-09 Thread David G. Johnston
Adding -hackers for consideration in the Commitfest.

Thanks!

David J.

>>>Original request by me

http://www.postgresql.org/message-id/CAKFQuwZqjz-je3Z=8jdodym3jm-n2ul4cuqy5vh8n75e5v1...@mail.gmail.com

When executing a query using \watch in psql the first execution of the
query includes "Title is [...]" when \pset title is in use.  Subsequent
executions do not.  Once that first display goes off-screen the information
in the title is no longer readily accessible.  If using \watch for a
long-running monitoring query it can be helpful to incorporate some context
information into the title.

-- Forwarded message --
From: Michael Paquier 
Date: Thu, Jan 28, 2016 at 6:01 AM
Subject: Re: [GENERAL] Request - repeat value of \pset title during \watch iterations
To: "David G. Johnston"
Cc: Tom Lane, "pgsql-gene...@postgresql.org" <pgsql-gene...@postgresql.org>


On Thu, Jan 28, 2016 at 1:54 PM, David G. Johnston wrote:
> On Wed, Jan 27, 2016 at 9:13 PM, Michael Paquier <michael.paqu...@gmail.com> wrote:
>>
>> On Thu, Jan 28, 2016 at 9:34 AM, David G. Johnston wrote:
>> > So how about:
>> >
>> > + snprintf(title, strlen(myopt.title) + 50,
>> > + _("Watch every %lds\t%s\t%s"),
>> > +  sleep, head_title, asctime(localtime(&timer)));
>>
>> I would just keep the timestamp and the title separated so what do you
>> think about that instead?
>> Watch every Xs   $timestamp
>> $head_title
>
>
> That works.  I like having the title immediately above the table.
>
> The other option that came to mind would be to place the time information
> after the table display while leaving the title before it.  On an output
> that requires more vertical space than is available in the terminal one
> would no longer have to scroll up to confirm last execution time.  If doing
> this I'd probably get rid of any logic that attempts to center the time
> information on the table and simply leave it left-aligned.

OK, attached is an updated patch. How does that look? And the example:

   Watch every 5s    Fri Jan 29 13:06:31 2016

This is a medium length title
repeat


--
 

(1 row)
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9750a5b..3241d27 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3020,7 +3020,8 @@ static bool
 do_watch(PQExpBuffer query_buf, long sleep)
 {
 	printQueryOpt myopt = pset.popt;
-	char		title[50];
+	char		 *title;
+	char	   *head_title = NULL;
 
 	if (!query_buf || query_buf->len <= 0)
 	{
@@ -3034,6 +3035,18 @@ do_watch(PQExpBuffer query_buf, long sleep)
 	 */
 	myopt.topt.pager = 0;
 
+	/*
+	 * Take into account any title present in the user setup as a part of
+	 * what is printed for each iteration by using it as a header.
+	 */
+	if (myopt.title)
+	{
+		title = pg_malloc(strlen(myopt.title) + 50);
+		head_title = pg_strdup(myopt.title);
+	}
+	else
+		title = pg_malloc(50);
+
 	for (;;)
 	{
 		int			res;
@@ -3045,8 +3058,13 @@ do_watch(PQExpBuffer query_buf, long sleep)
 		 * of completion of the command?
 		 */
 		timer = time(NULL);
-		snprintf(title, sizeof(title), _("Watch every %lds\t%s"),
-				 sleep, asctime(localtime(&timer)));
+		if (head_title)
+			snprintf(title, strlen(myopt.title) + 50,
+					 _("Watch every %lds\t%s\n%s"),
+					 sleep, asctime(localtime(&timer)), head_title);
+		else
+			snprintf(title, 50, _("Watch every %lds\t%s"),
+					 sleep, asctime(localtime(&timer)));
 		myopt.title = title;
 
 		/* Run the query and print out the results */
@@ -3059,7 +3077,11 @@ do_watch(PQExpBuffer query_buf, long sleep)
 		if (res == 0)
 			break;
 		if (res == -1)
+		{
+			pg_free(title);
+			pg_free(head_title);
 			return false;
+		}
 
 		/*
 		 * Set up cancellation of 'watch' via SIGINT.  We redo this each time
@@ -3084,6 +3106,8 @@ do_watch(PQExpBuffer query_buf, long sleep)
 		sigint_interrupt_enabled = false;
 	}
 
+	pg_free(title);
+	pg_free(head_title);
 	return true;
 }
 



Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread Amit Langote
On 2016/03/10 14:29, Amit Langote wrote:
> I rebased remainder patches (attached).
> 
> 0001 is a small patch to fix issues reported by Tomas and Vinayak.  0002
> and 0003 are WIP patches to implement progress reporting for vacuum.

Oops, in 0002, I wrongly joined with pg_class in the definition of
pg_stat_progress_vacuum to output the schema-qualified name of the table
being vacuumed.  That means we need to connect to the correct database,
which is undesirable. Updated version fixes that (shows database name and
relid).  You may also have noticed that I said pg_stat_progress_vacuum,
not pg_stat_vacuum_progress (IMHO, the former is a better name).

Updated patches attached.

Thanks,
Amit
From 7cb702c7fae9fceef3048a82522390844c5a67cc Mon Sep 17 00:00:00 2001
From: amit 
Date: Thu, 10 Mar 2016 11:25:20 +0900
Subject: [PATCH 1/3] Some minor fixes in commit b6fb6471.

---
 src/backend/postmaster/pgstat.c |7 ---
 src/backend/utils/adt/pgstatfuncs.c |2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index ce5da3e..4424cb8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2861,8 +2861,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 /*---
  * pgstat_progress_start_command() -
  *
- * Set st_command in own backend entry.  Also, zero-initialize
- * st_progress_param array.
+ * Set st_progress_command (and st_progress_command_target) in own backend
+ * entry.  Also, zero-initialize st_progress_param array.
  *---
  */
 void
@@ -2904,7 +2904,8 @@ pgstat_progress_update_param(int index, int64 val)
 /*---
  * pgstat_progress_end_command() -
  *
- * Update index'th member in st_progress_param[] of own backend entry.
+ * Reset st_progress_command (and st_progress_command_target) in own backend
+ * entry.  This signals the end of the command.
  *---
  */
 void
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 0c790ff..2fb51fa 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -614,7 +614,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		else
 		{
 			nulls[2] = true;
-			for (i = 1; i < PGSTAT_NUM_PROGRESS_PARAM + 1; i++)
+			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
 				nulls[i+3] = true;
 		}
 
-- 
1.7.1

From 3ca85c000ec2cd6148d0b3adb35aefa7e29ab23d Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 7 Mar 2016 14:38:34 +0900
Subject: [PATCH 2/3] WIP: Implement progress reporting for VACUUM command.

This basically utilizes the pgstat_progress* API to report a handful of
paramters to indicate its progress.  lazy_vacuum_rel() and lazy_scan_heap()
have been altered to report command start, command target table, and the
following parameters: processing phase, number of heap blocks, number of
index blocks (all indexes), current heap block number in the main scan loop
(whenever changes), index blocks vacuumed (once per finished index vacuum),
and number of index vacuum passes (every time when all indexes are vacuumed).
Following processing phases are identified and reported whenever one changes
to another: 'scanning heap', 'vacuuming indexes', 'vacuuming heap', and
'cleanup'.

TODO: find a way to report index pages vacuumed in a more granular manner than
the current report per index vacuumed.

A view named pg_stat_progress_vacuum has been added that shows these values.
---
 doc/src/sgml/monitoring.sgml |  106 ++
 src/backend/catalog/system_views.sql |   24 
 src/backend/commands/vacuumlazy.c|   73 +++-
 src/test/regress/expected/rules.out  |   22 +++
 4 files changed, 224 insertions(+), 1 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 85459d0..082f94c 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -507,6 +507,12 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   yet included in pg_stat_user_functions).
  
 
+ 
+  pg_stat_progress_vacuumpg_stat_progress_vacuum
+  One row for each backend (including autovacuum worker processes) running
+  VACUUM, showing current progress in terms of heap pages it
+  has finished processing.
+ 
 

   
@@ -1822,6 +1828,106 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
controls exactly which functions are tracked.
   
 
+  
+   pg_stat_progress_vacuum View
+   
+
+
+  Column
+  Type
+  Description
+ 
+
+
+   
+
+ pid
+ integer
+ Process ID of backend
+
+
+ database
+ name
+ Name of the database this backend is connected to
+
+
+ relid
+ Oid
+ OID of the table being vacuumed
+
+
+ processing_phase
+ 

Re: [HACKERS] Freeze avoidance of very large table.

2016-03-09 Thread Masahiko Sawada
Thank you for reviewing!
Attached updated patch.


On Thu, Mar 10, 2016 at 3:37 AM, Robert Haas wrote:
> On Wed, Mar 9, 2016 at 9:09 AM, Masahiko Sawada wrote:
>> Attached latest 2 patches.
>> * 000 patch : Incorporated the review comments and made the rewriting
>> logic clearer.
>
> That's better, thanks.  But your comments don't survive pgindent.
> After running pgindent, I get this:
>
> +   /*
> +* These old_* variables point to old visibility map page.
> +*
> +* cur_old: Points to current position on old
> page. blkend_old :
> +* Points to end of old block. break_old  : Points to
> old page break
> +* position for rewriting a new page. After wrote a
> new page, old_end
> +* proceeds rewriteVmBytesPerPgae bytes.
> +*/
>
> You need to either surround this sort of thing with dashes to make
> pgindent ignore it, or, probably better, rewrite it using complete
> sentences that together form a paragraph.

Fixed.

>
> +   Oid pg_database_oid;/* OID of
> pg_database relation */
>
> Not used anywhere?

Fixed.

> Instead of vm_need_rewrite, how about vm_must_add_frozenbit?

Fixed.

> Can you explain the changes to test.sh?

The current regression test scenario is:
1. Do 'make check' on the pre-upgrade cluster
2. Dump relallvisible values of all relations in the pre-upgrade cluster
to vm_test1.txt
3. Do pg_upgrade
4. Do analyze (not vacuum), and dump relallvisible values of all relations
in the post-upgrade cluster to vm_test2.txt
5. Compare vm_test1.txt and vm_test2.txt

That is, the regression test compares relallvisible values between the
pre-upgrade and post-upgrade clusters.
But because test.sh always uses pre/post clusters with the same catalog
version, I realized that we cannot ensure that visibility map rewriting is
processed successfully in the test.sh framework; the visibility map
rewrite is never executed.
We might need another framework to test visibility map page rewriting.

Regards,

--
Masahiko Sawada
diff --git a/src/bin/pg_upgrade/file.c b/src/bin/pg_upgrade/file.c
index 2a99a28..7c5bfa6 100644
--- a/src/bin/pg_upgrade/file.c
+++ b/src/bin/pg_upgrade/file.c
@@ -9,7 +9,11 @@
 
 #include "postgres_fe.h"
 
+#include "access/visibilitymap.h"
 #include "pg_upgrade.h"
+#include "storage/bufpage.h"
+#include "storage/checksum.h"
+#include "storage/checksum_impl.h"
 
 #include 
 
@@ -138,6 +142,130 @@ copy_file(const char *srcfile, const char *dstfile, bool force)
 #endif
 
 
+/*
+ * rewriteVisibilityMap()
+ *
+ * In versions of PostgreSQL prior to catversion 201602181, PostgreSQL's
+ * visibility map included one bit per heap page; it now includes two.
+ * When upgrading a cluster from before that time to a current PostgreSQL
+ * version, we could refuse to copy visibility maps from the old cluster
+ * to the new cluster; the next VACUUM would recreate them, but at the
+ * price of scanning the entire table.  So, instead, we rewrite the old
+ * visibility maps in the new format.  That way, the all-visible bit
+ * remains set for the pages for which it was set previously.  The
+ * all-frozen bit is never set by this conversion; we leave that to
+ * VACUUM.
+ */
+const char *
+rewriteVisibilityMap(const char *fromfile, const char *tofile, bool force)
+{
+#define BITS_PER_HEAPBLOCK_OLD 1
+
+	int			src_fd = 0;
+	int			dst_fd = 0;
+	char		buffer[BLCKSZ];
+	ssize_t 	bytesRead;
+	int			rewriteVmBytesPerPage;
+	BlockNumber	blkno = 0;
+
+	/* Compute how many old page bytes we need to rewrite a new page */
+	rewriteVmBytesPerPage = (BLCKSZ - SizeOfPageHeaderData) / 2;
+
+	/* Reset errno */
+	errno = 0;
+
+	if ((fromfile == NULL) || (tofile == NULL))
+		return getErrorText();
+
+	if ((src_fd = open(fromfile, O_RDONLY, 0)) < 0)
+		goto err;
+
+	if ((dst_fd = open(tofile, O_RDWR | O_CREAT | (force ? 0 : O_EXCL), S_IRUSR | S_IWUSR)) < 0)
+		goto err;
+
+	/*
+	 * Turn each visibility map page into 2 pages, one by one.
+	 * Both rewritten pages get the same page header the old page had.
+	 */
+	while ((bytesRead = read(src_fd, buffer, BLCKSZ)) == BLCKSZ)
+	{
+		char	*cur_old, *break_old, *blkend_old;
+		PageHeaderData	pageheader;
+
+		/* Save the page header data */
+		memcpy(, buffer, SizeOfPageHeaderData);
+
+		/*
+		 * These old_* variables point into the old visibility map page.
+		 * cur_old points to the current position on the old page; blkend_old
+		 * points to the end of the old block; break_old points to the old
+		 * page break position for rewriting a new page. After a new page is
+		 * written, break_old advances by rewriteVmBytesPerPage bytes.
+		 */
+		cur_old = buffer + SizeOfPageHeaderData;
+		blkend_old = buffer + bytesRead;
+		break_old = cur_old + rewriteVmBytesPerPage;
+
+		while (blkend_old >= break_old)
+		{
+			char	vmbuf[BLCKSZ];
+			char	*cur_new = vmbuf;
+
+			/* Copy page header in advance */
+			memcpy(vmbuf, , SizeOfPageHeaderData);
+
+			

Re: [HACKERS] Improving replay of XLOG_BTREE_VACUUM records

2016-03-09 Thread Michael Paquier
On Thu, Mar 10, 2016 at 1:29 AM, David Steele  wrote:
> On 1/8/16 9:34 AM, Alvaro Herrera wrote:
>> Simon Riggs wrote:
>>>
>>> On 8 January 2016 at 13:36, Alvaro Herrera wrote:

 I would agree except for the observation on toast indexes.  I think
 that's an important enough use case that perhaps we should have both.
>>>
>>> The exclusion of toast indexes is something we can remove also, I have
>>> recently discovered. When we access toast data we ignore MVCC, but we
>>> still
>>> have the toast pointer and chunkid to use for rechecking our scan
>>> results.
>>> So a later patch will add some rechecks.
>>
>> Ah, interesting, glad to hear.  I take it you're pushing your patch
>> soon, then?
>
> ISTM that this patch should be "returned with feedback" or "rejected" based
> on the thread.  I'm marking it "waiting for author" for the time being.

I think that we are still waiting for some input from Simon here...
Simon, are you going to finish wrapping up your other patch?
-- 
Michael




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread Amit Langote
On 2016/03/10 2:16, Robert Haas wrote:
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote wrote:
>> On 2016/03/09 10:11, Amit Langote wrote:
>>> The attached revision addresses above and one of Horiguchi-san's comments
>>> in his email yesterday.
>>
>> I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
>> so many versions.
> 
> I've committed 0001 with heavy revisions.  Just because we don't need
> an SQL-visible function to clear the command progress doesn't mean we
> don't need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view
> output, adjusted the comments, and so on.  Please rebase the remainder
> of the series.  Thanks.

Great, thanks a lot for the review and committing in much better shape!

I rebased remainder patches (attached).

0001 is a small patch to fix issues reported by Tomas and Vinayak.  0002
and 0003 are WIP patches to implement progress reporting for vacuum.

Thanks,
Amit
From 7cb702c7fae9fceef3048a82522390844c5a67cc Mon Sep 17 00:00:00 2001
From: amit 
Date: Thu, 10 Mar 2016 11:25:20 +0900
Subject: [PATCH 1/3] Some minor fixes in commit b6fb6471.

---
 src/backend/postmaster/pgstat.c |7 ---
 src/backend/utils/adt/pgstatfuncs.c |2 +-
 2 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/src/backend/postmaster/pgstat.c b/src/backend/postmaster/pgstat.c
index ce5da3e..4424cb8 100644
--- a/src/backend/postmaster/pgstat.c
+++ b/src/backend/postmaster/pgstat.c
@@ -2861,8 +2861,8 @@ pgstat_report_activity(BackendState state, const char *cmd_str)
 /*---
  * pgstat_progress_start_command() -
  *
- * Set st_command in own backend entry.  Also, zero-initialize
- * st_progress_param array.
+ * Set st_progress_command (and st_progress_command_target) in own backend
+ * entry.  Also, zero-initialize st_progress_param array.
  *---
  */
 void
@@ -2904,7 +2904,8 @@ pgstat_progress_update_param(int index, int64 val)
 /*---
  * pgstat_progress_end_command() -
  *
- * Update index'th member in st_progress_param[] of own backend entry.
+ * Reset st_progress_command (and st_progress_command_target) in own backend
+ * entry.  This signals the end of the command.
  *---
  */
 void
diff --git a/src/backend/utils/adt/pgstatfuncs.c b/src/backend/utils/adt/pgstatfuncs.c
index 0c790ff..2fb51fa 100644
--- a/src/backend/utils/adt/pgstatfuncs.c
+++ b/src/backend/utils/adt/pgstatfuncs.c
@@ -614,7 +614,7 @@ pg_stat_get_progress_info(PG_FUNCTION_ARGS)
 		else
 		{
 			nulls[2] = true;
-			for (i = 1; i < PGSTAT_NUM_PROGRESS_PARAM + 1; i++)
+			for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
 				nulls[i+3] = true;
 		}
 
-- 
1.7.1

From 4567da933b25a9e23fe1a72a6994d3a9b7bc1ea4 Mon Sep 17 00:00:00 2001
From: amit 
Date: Mon, 7 Mar 2016 14:38:34 +0900
Subject: [PATCH 2/3] WIP: Implement progress reporting for VACUUM command.

This basically utilizes the pgstat_progress* API to report a handful of
paramters to indicate its progress.  lazy_vacuum_rel() and lazy_scan_heap()
have been altered to report command start, command target table, and the
following parameters: processing phase, number of heap blocks, number of
index blocks (all indexes), current heap block number in the main scan loop
(whenever changes), index blocks vacuumed (once per finished index vacuum),
and number of index vacuum passes (every time when all indexes are vacuumed).
Following processing phases are identified and reported whenever one changes
to another: 'scanning heap', 'vacuuming indexes', 'vacuuming heap', and
'cleanup'.

TODO: find a way to report index pages vacuumed in a more granular manner than
the current report per index vacuumed.

A view named pg_stat_vacuum_progress has been added that shows these values.
---
 doc/src/sgml/monitoring.sgml |  106 ++
 src/backend/catalog/system_views.sql |   24 
 src/backend/commands/vacuumlazy.c|   73 +++-
 src/test/regress/expected/rules.out  |   24 
 4 files changed, 226 insertions(+), 1 deletions(-)

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index 85459d0..544f959 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -507,6 +507,12 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
   yet included in pg_stat_user_functions).
  
 
+ 
+  pg_stat_progress_vacuumpg_stat_progress_vacuum
+  One row for each backend (including autovacuum worker processes) running
+  VACUUM, showing current progress in terms of heap pages it
+  has finished processing.
+ 
 

   
@@ -1822,6 +1828,106 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
controls exactly which functions are tracked.
   
 
+  
+   pg_stat_progress_vacuum View
+   
+

Re: [HACKERS] RFC: replace pg_stat_activity.waiting with something more descriptive

2016-03-09 Thread Amit Kapila
On Wed, Mar 9, 2016 at 7:17 PM, Robert Haas  wrote:
>
> On Wed, Mar 9, 2016 at 8:31 AM, Amit Kapila wrote:
> > Thanks for the suggestion.  I have updated the patch to include
> > wait_event_type information in the wait_event table.
>
> I think we should remove "a server process is" from all of these entries.
>
> Also, I think this kind of thing should be tightened up:
>
> + A server process is waiting on any one of the commit_timestamp
> + buffer locks to read or write the commit_timestamp page in the
> + pg_commit_ts subdirectory.
>
> I'd just write: Waiting to read or write a commit timestamp buffer.
>

Okay, changed as per your suggestion, and fixed the morerows issue pointed
out by Thom.

With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


extend_pg_stat_activity_v14.patch
Description: Binary data



Re: [HACKERS] Pushing down sorted joins

2016-03-09 Thread Ashutosh Bapat
On Wed, Mar 9, 2016 at 9:22 PM, Robert Haas  wrote:

> On Wed, Mar 9, 2016 at 2:23 AM, Ashutosh Bapat wrote:
> > [ new patch ]
>
> This looks OK to me.  Committed!
>
>
Thanks.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


Re: [HACKERS] pstrdup(TextDatumGetCString(foo)) ?

2016-03-09 Thread Tom Lane
Chapman Flack  writes:
> I am encountering, here and there, an idiom like
>   pstrdup(TextDatumGetCString(foo))

> or a pre-8.4 version,

>   pstrdup(DatumGetCString(DirectFunctionCall1(textout, foo)))

> It's leading me to question my sanity because it appears to me
> that both text_to_cstring (underlying TextDatumGetCString) and
> textout already return a string palloc'd in the current context,
> and that pstrdup (without any change of context) can't be accomplishing
> anything. I'm probably missing something crucial, but what?

No, you're right: pstrdup in that context is a useless waste of cycles and
memory.
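
To spell out the equivalence (foo being any text Datum):

#include "postgres.h"
#include "utils/builtins.h"

static char *
get_text_cstring(Datum foo)
{
	/* TextDatumGetCString() already returns a string palloc'd in the
	 * current memory context, so the pstrdup() variant merely allocates
	 * and copies the same bytes a second time. */
	char	   *wasteful = pstrdup(TextDatumGetCString(foo));	/* extra copy */
	char	   *sufficient = TextDatumGetCString(foo);			/* same result */

	pfree(wasteful);
	return sufficient;
}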

A quick grep in the current PG sources shows me only two occurrences of
the former, both in the same function and both doubtless introduced by
the same misinformed author.  I find no occurrences of the latter.

regards, tom lane




[HACKERS] pstrdup(TextDatumGetCString(foo)) ?

2016-03-09 Thread Chapman Flack
I am encountering, here and there, an idiom like

  pstrdup(TextDatumGetCString(foo))

or a pre-8.4 version,

  pstrdup(DatumGetCString(DirectFunctionCall1(textout, foo)))

It's leading me to question my sanity because it appears to me
that both text_to_cstring (underlying TextDatumGetCString) and
textout already return a string palloc'd in the current context,
and that pstrdup (without any change of context) can't be accomplishing
anything. I'm probably missing something crucial, but what?

-Chap




[HACKERS] pg_rewind just doesn't fsync *anything*?

2016-03-09 Thread Andres Freund
Hi,

how come that the only comment in pg_rewind about fsyncing is '
void
close_target_file(void)
{
...
/* fsync? */
}

Isn't that a bit, uh, minimal for a utility that's likely to be used in
failover scenarios?

I think we might actually be "saved" due to
http://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=2ce439f33
because pg_rewind appears to leave the cluster in

ControlFile_new.state = DB_IN_ARCHIVE_RECOVERY;
updateControlFile(&ControlFile_new);

a state that StartupXLOG will treat as needing recovery:

if (ControlFile->state != DB_SHUTDOWNED &&
ControlFile->state != DB_SHUTDOWNED_IN_RECOVERY)
SyncDataDirectory();

but that code went in after pg_rewind, so this certainly can't be an
intentional save.

I also don't think it's ok that you need to start the cluster to make it
safe against a crash?

I guess the easiest fix would be to shell out to initdb -s?

Greetings,

Andres Freund




Re: [HACKERS] auto_explain sample rate

2016-03-09 Thread Petr Jelinek

On 17/02/16 01:17, Julien Rouhaud wrote:
>
> Agreed, it's too obscure. Attached v4 fixes as you said.
>

Seems to be a simple enough patch, and it works. However, I would like the
documentation to say that the range is 0 to 1 and represents the fraction
of queries sampled, because right now both the GUC description and the
documentation say it's a percentage, which isn't really true, as a
percentage runs from 0 to 100.
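
For reference, a 0-to-1 fraction maps directly onto the sampling test; a
minimal sketch of the idea (plain rand(), not necessarily what v4 does):

#include <stdbool.h>
#include <stdlib.h>

static double sample_rate = 0.1;	/* log roughly 10% of queries */

static bool
query_is_sampled(void)
{
	/* rand()/RAND_MAX is uniform over [0, 1]; values below the
	 * configured rate select the query for logging. */
	return ((double) rand() / RAND_MAX) < sample_rate;
}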


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] silent data loss with ext4 / all current versions

2016-03-09 Thread Andres Freund
On 2016-03-07 21:55:52 -0800, Andres Freund wrote:
> Here's my updated version.
> 
> Note that I've split the patch into two. One for the infrastructure, and
> one for the callsites.

I've finally pushed these, after making a number of mostly cosmetic
fixes. The only one of real consequence is that I've removed the durable_*
call from the renames to .deleted in xlog[archive].c - these don't need
to be durable, and are Windows-only. Oh, and there was a typo in
the !HAVE_WORKING_LINK case.

There's a *lot* of version skew here: not-present functionality, moved
files, different APIs - we got it all.  I've tried to check in each
version whether we're missing fsyncs for renames and everything.
Michael, *please* double-check the diffs for the different branches.

Note that we currently have some frontend programs with the equivalent
problem. Most importantly, receivelog.c (pg_basebackup/pg_receivexlog)
is missing pretty much the same directory fsyncs.  And at least for
pg_receivexlog it's critical, especially now that receivexlog supports
syncrep.  I've not done anything about that; there's pretty much no
chance to share backend code with the frontend in the back-branches.
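
For anyone following along, the pattern the new durable_* helpers
implement is roughly this (a POSIX sketch with error handling omitted;
the real durable_rename() in fd.c is considerably more careful):

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* Rename oldpath to newpath so that the result survives a crash: flush
 * the file contents, rename, then flush the directory holding the entry. */
static void
durable_rename_sketch(const char *oldpath, const char *newpath,
					  const char *dirpath)
{
	int		fd;

	fd = open(oldpath, O_RDWR);
	fsync(fd);					/* contents first */
	close(fd);

	rename(oldpath, newpath);	/* atomically replaces newpath */

	fd = open(dirpath, O_RDONLY);
	fsync(fd);					/* then the directory entry */
	close(fd);
}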

Greetings,

Andres Freund




Re: [HACKERS] Relation extension scalability

2016-03-09 Thread Petr Jelinek

On 10/03/16 02:53, Dilip Kumar wrote:
>
> Attaching the latest patch.
>

Hmm, why did you remove the comment above the call to
UnlockRelationForExtension? It still seems relevant, maybe with some
minor modification?

Also, there is a bit of whitespace mess inside the conditional lock block.

--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-09 Thread Petr Jelinek
On 10/03/16 02:18, Kouhei Kaigai wrote:
> 
>> I am not sure I like the fact that we have this EXTNODENAME_MAX_LEN and
>> now the CUSTOM_NAME_MAX_LEN with the same length, and also they are both
>> the same length as NAMEDATALEN; I wonder if this shouldn't be somehow
>> squished into fewer defines.
>>
> Hmm. I just followed the manner in extensible.c, because this label was
> initially NAMEDATALEN, then Robert changed it to EXTNODENAME_MAX_LEN.
> I guess he avoided applying the same label to different entities -
> NAMEDATALEN is a limitation of the NameData type, but the identifiers of
> extensible-node and custom-scan nodes are not restricted by it.
> 

Makes sense.

>> Also in RegisterCustomScanMethods
>> +Assert(strlen(methods->CustomName) <= CUSTOM_NAME_MAX_LEN);
>>
>> Shouldn't this be actually "if" with ereport() considering this is
>> public API and extensions can pass anything there? (for that matter same
>> is true for RegisterExtensibleNodeMethods but that's already committed).
>>
> Hmm. I don't have a clear answer as to which is better. The reason why I
> put Assert() here is that only C-binary extensions use this interface,
> thus the author will fix the problem of a too-long name prior to release.
> Of course, if-with-ereport() also informs the extension author that the
> name is too long.
> One downside of Assert() is that the problem goes unnoticed if
> --enable-cassert was not specified.
> 

Well that's exactly my problem, this should IMHO throw an error even
without --enable-cassert. It's not like it's some performance-sensitive
API where it would be a big problem; ensuring correctness of the input is
more important here IMHO.


-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread pokurev
Hi,
Thank you very much for committing this feature.
> -Original Message-
> From: Robert Haas [mailto:robertmh...@gmail.com]
> Sent: Thursday, March 10, 2016 2:17 AM
> To: Amit Langote 
> Cc: Kyotaro HORIGUCHI ; Amit Langote
> ; SPS ポクレ ヴィナヤック(三技術)
> ; pgsql-hackers@postgresql.org; SPS 坂野 昌
> 平(三技術) 
> Subject: Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.
> 
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
>  wrote:
> > On 2016/03/09 10:11, Amit Langote wrote:
> >> The attached revision addresses above and one of Horiguchi-san's
> >> comments in his email yesterday.
> >
> > I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry
> > about so many versions.
> 
> I've committed 0001 with heavy revisions.  Just because we don't need an
> SQL-visible function to clear the command progress doesn't mean we don't
> need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view output,
> adjusted the comments, and so on.  Please rebase the remainder of the
> series.  Thanks.
Some minor typos need to be fixed.
+/*---
 + * pgstat_progress_start_command() -
 + *
 + * Set st_command in own backend entry.  Also, zero-initialize
 + * st_progress_param array.
 + *---
 + */
In the description we need to use st_progress_command instead of st_command.

+/*---
 + * pgstat_progress_end_command() -
 + *
 + * Update index'th member in st_progress_param[] of own backend entry.
 + *---
 + */
The description needs to be changed here as well.

Regards,
Vinayak Pokale



Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-09 Thread Craig Ringer
On 10 March 2016 at 05:30, Alvaro Herrera  wrote:

> Craig Ringer wrote:
>
> > 004 allows filtering on RecursiveCopy by a predicate function. Needed for
> > filesystem level backups (007). It could probably be squashed with 007 if
> > desired.
>
> I pushed this after some tinkering:
>
> * filtering applies to all directory entries, not just files.  So you
> can filter a subdirectory, for example.  If you have symlinks (which you
> know this module will bomb out on) you can skip them too.
>
> * The filter sub receives the path relative to the initial source dir,
> rather than the absolute path.  That part was just making it difficult
> to use; in your POD example you were testing /pg_log/ as a regex, which
> skipped the *files* but not the directory itself.


That was actually intentional, but I agree that your change is better.

Thanks for pushing.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Relation extension scalability

2016-03-09 Thread Dilip Kumar
On Wed, Mar 9, 2016 at 1:39 AM, Robert Haas  wrote:

> LockWaiterCount() bravely accesses a shared memory data structure that
> is mutable with no locking at all.  That might actually be safe for
> our purposes, but I think it would be better to take the partition
> lock in shared mode if it doesn't cost too much performance.  If
> that's too expensive, then it should at least have a comment
> explaining (1) that it is doing this without the lock and (2) why
> that's safe (sketch: the LOCK can't go away because we hold it, and
> nRequested could change but we don't mind a stale value, and a 32-bit
> read won't be torn).
>

With the LWLock, performance is equally good, so I added the lock.


>
> A few of the other functions in here also lack comments, and perhaps
> should have them.
>
> The changes to RelationGetBufferForTuple need a visit from the
> refactoring police.  Look at the way you are calling
> RelationAddOneBlock.  The logic looks about like this:
>
> if (needLock)
> {
>   if (trylock relation for extension)
> RelationAddOneBlock();
>   else
>   {
> lock relation for extension;
> if (use fsm)
> {
>   complicated;
> }
> else
>   RelationAddOneBlock();
> }
> else
>   RelationAddOneBlock();
>
> So basically you want to do the RelationAddOneBlock() thing if
> !needLock || !use_fsm || can't trylock.  See if you can rearrange the
> code so that there's only one fallthrough call to
> RelationAddOneBlock() instead of three separate ones.
>

Actually, in every case we need at least one block, so I have refactored it
and RelationAddOneBlock() is now outside of any condition.


>
> Also, consider moving the code that adds multiple blocks at a time to
> its own function instead of including it all in line.
>

Done.

Attaching the latest patch.



-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com
diff --git a/src/backend/access/heap/hio.c b/src/backend/access/heap/hio.c
index 8140418..eb4ee0c 100644
--- a/src/backend/access/heap/hio.c
+++ b/src/backend/access/heap/hio.c
@@ -169,6 +169,86 @@ GetVisibilityMapPins(Relation relation, Buffer buffer1, Buffer buffer2,
 }
 
 /*
+ * RelationAddOneBlock
+ *
+ * Extend relation by one block and lock the buffer
+ */
+static Buffer
+RelationAddOneBlock(Relation relation, Buffer otherBuffer, BulkInsertState bistate)
+{
+	Buffer buffer;
+	/*
+	 * XXX This does an lseek - rather expensive - but at the moment it is the
+	 * only way to accurately determine how many blocks are in a relation.  Is
+	 * it worth keeping an accurate file length in shared memory someplace,
+	 * rather than relying on the kernel to do it for us?
+	 */
+	buffer = ReadBufferBI(relation, P_NEW, bistate);
+
+	/*
+	 * We can be certain that locking the otherBuffer first is OK, since
+	 * it must have a lower page number.
+	 */
+	if (otherBuffer != InvalidBuffer)
+		LockBuffer(otherBuffer, BUFFER_LOCK_EXCLUSIVE);
+
+	/*
+	 * Now acquire lock on the new page.
+	 */
+	LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+
+	return buffer;
+}
+/*
+ * RelationAddExtraBlocks
+ *
+ * Add extra blocks to the relation, to avoid future contention on
+ * the relation extension lock.
+ */
+
+static void
+RelationAddExtraBlocks(Relation relation, BulkInsertState bistate)
+{
+	Page		page;
+	Size		freespace;
+	BlockNumber blockNum;
+	int			extraBlocks = 0;
+	int 		lockWaiters = 0;
+	Buffer		buffer;
+
+	/*
+	 * To calculate the number of extra blocks to add, find the level of
+	 * contention on this lock by counting its current waiters, and
+	 * extend by a multiple of the waiter count.
+	 */
+	lockWaiters = RelationExtensionLockWaiter(relation);
+
+	extraBlocks = lockWaiters * 20;
+
+	while (extraBlocks--)
+	{
+		/*
+		 * Here we are adding extra blocks to the relation after
+		 * adding each block update the information in FSM so that
+		 * other backend running parallel can find the block.
+		 */
+		buffer = ReadBufferBI(relation, P_NEW, bistate);
+
+		LockBuffer(buffer, BUFFER_LOCK_EXCLUSIVE);
+
+		page = BufferGetPage(buffer);
+		PageInit(page, BufferGetPageSize(buffer), 0);
+		freespace = PageGetHeapFreeSpace(page);
+		MarkBufferDirty(buffer);
+		blockNum = BufferGetBlockNumber(buffer);
+
+		UnlockReleaseBuffer(buffer);
+
+		RecordPageWithFreeSpace(relation, blockNum, freespace);
+	}
+}
+
+/*
  * RelationGetBufferForTuple
  *
  *	Returns pinned and exclusive-locked buffer of a page in given relation
@@ -233,10 +313,11 @@ RelationGetBufferForTuple(Relation relation, Size len,
 	bool		use_fsm = !(options & HEAP_INSERT_SKIP_FSM);
 	Buffer		buffer = InvalidBuffer;
 	Page		page;
-	Size		pageFreeSpace,
-saveFreeSpace;
+	Size		pageFreeSpace = 0,
+saveFreeSpace = 0;
 	BlockNumber targetBlock,
-otherBlock;
+otherBlock,
+lastValidBlock = InvalidBlockNumber;
 	bool		needLock;
 
 	len = MAXALIGN(len);		/* be conservative */
@@ -308,6 +389,7 @@ RelationGetBufferForTuple(Relation relation, Size len,
 		}
 	}
 
+loop:
 	while (targetBlock != 

Re: [HACKERS] Proposal: RETURNING primary_key()

2016-03-09 Thread Craig Ringer
On 10 March 2016 at 00:41, Igal @ Lucee.org  wrote:

> On 3/8/2016 5:12 PM, Craig Ringer wrote:
>
>> One of the worst problems (IMO) is in the driver architecture its self.
>> It attempts to prevent blocking by guestimating the server's send buffer
>> state and its recv buffer state, trying to stop them filling and causing
>> the server to block on writes. It should just avoid blocking on its own
>> send buffer, which it can control with confidence. Or use some of Java's
>> rather good concurrency/threading features to simultaneously consume data
>> from the receive buffer and write to the send buffer when needed, like
>> pgjdbc-ng does.
>>
>
> Are there good reasons to use pgjdbc over pgjdbc-ng then?
>
>
Maturity, support for older versions (-ng just punts on anything except
new releases) and older JDBC specs, and completeness of support for some
extensions. TBH I haven't done a ton with -ng yet.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-09 Thread Kouhei Kaigai
> On 29/02/16 13:07, Kouhei Kaigai wrote:
> >
> > I'd like to adjust a few parts of the custom-scan interface prior to
> > the v9.6 freeze.
> >
> > The major point is the serialization/deserialization mechanism.
> > Right now, an extension has to give LibraryName and SymbolName to
> > reproduce the same CustomScanMethods on the background worker process
> > side. Indeed, this is sufficient information to pull the table of
> > function pointers.
> >
> > On the other hand, we now have a different mechanism to wrap private
> > information - the extensible node. It requires extensions to register
> > their ExtensibleNodeMethods, identified by name, usually at _PG_init()
> > time. It is also a reasonable way to reproduce the same objects on the
> > background worker side.
> >
> > However, a mixture of two different ways is not good. My preference is
> > what extensible-node is doing rather than what custom-scan currently
> > does.
> > The attached patch allows an extension to register its CustomScanMethods
> > once; then readfuncs.c can pull this table by CustomName in string form.
> >
> 
> Agreed, but this will break compatibility right?
>
The mechanism of passing a pair of library name and symbol name is a new
feature in v9.6, not in v9.5, so now is the last chance to fix up the
interface requirement.

> > I'm not 100% certain whether "nodes/custom-apis.h" is the best location,
> > but we may need somewhere other than the primitive header files to put
> > these declarations.
> 
> custom-apis.c does not sound like the right name to me; maybe it can be
> just custom.c, but custom.h might be a bit too generic - maybe custom_node.h.
>
OK, custom_node.h may be better.

> I am not sure I like the fact that we have this EXTNODENAME_MAX_LEN and
> now the CUSTOM_NAME_MAX_LEN with the same length, and also they are both
> the same length as NAMEDATALEN; I wonder if this shouldn't be somehow
> squished into fewer defines.
>
Hmm. I just followed the manner in extensible.c, because this label was
initially NAMEDATALEN, then Robert changed it to EXTNODENAME_MAX_LEN.
I guess he avoided applying the same label to different entities -
NAMEDATALEN is a limitation of the NameData type, but the identifiers of
extensible-node and custom-scan nodes are not restricted by it.

> Also in RegisterCustomScanMethods
> + Assert(strlen(methods->CustomName) <= CUSTOM_NAME_MAX_LEN);
> 
> Shouldn't this be actually "if" with ereport() considering this is
> public API and extensions can pass anything there? (for that matter same
> is true for RegisterExtensibleNodeMethods but that's already committed).
>
Hmm. I don't have a clear answer as to which is better. The reason why I
put Assert() here is that only C-binary extensions use this interface,
thus the author will fix the problem of a too-long name prior to release.
Of course, if-with-ereport() also informs the extension author that the
name is too long.
One downside of Assert() is that the problem goes unnoticed if
--enable-cassert was not specified.

> Other than that this seems like straight conversion to same basic
> template as extensible nodes so I think it's ok.
> 

Thanks,
--
NEC Business Creation Division / PG-Strom Project
KaiGai Kohei 





Re: [HACKERS] Parallel Aggregate

2016-03-09 Thread Haribabu Kommi
On Mon, Mar 7, 2016 at 4:39 PM, Tom Lane  wrote:
> Haribabu Kommi  writes:
>> 2. Temporary fix for float aggregate types in _equalAggref because of
>> a change in aggtype to trans type, otherwise the parallel aggregation
>> plan failure in set_plan_references. whenever the aggtype is not matched,
>> it verifies with the trans type also.
>
> That is a completely unacceptable kluge.  Quite aside from being ugly as
> sin, it probably breaks more things than it fixes, first because it breaks
> the fundamental semantics of equal() across the board, and second because
> it puts catalog lookups into equal(), which *will* cause problems.  You
> can not expect that this will get committed, not even as a "temporary fix".

I am not able to find a better solution to this problem, so I will provide
the details of the problem and why I made the change; if you can suggest
where to look, that would be helpful.

In parallel aggregate, the aggregate operation is divided into two steps:
partial aggregate and finalize. The partial aggregate is executed in the
worker and returns the transition data, which is of type aggtranstype.
This works fine even if we don't change the targetlist Aggref return type
from aggtype to aggtranstype for aggregates whose aggtype is a
variable-length data type: the output slot is generated with a
variable-length type, so even if we send aggtranstype data, which is also
variable length, this works.

But for the float aggregates, the aggtype is a fixed-length type while the
aggtranstype is a variable-length one (for example, avg(float8) returns
float8 but uses float8[] as its transition type). So if we try to replace
an Aggref's aggtype with its aggtranstype in set_plan_references, only the
partial aggregate's targetlist gets changed, because set_plan_references
works from the top of the plan.

To avoid this problem, I changed the targetlist type during partial
aggregate path generation itself, and that leads to the failure in
_equalAggref during set_plan_references, which is why I put in the
temporary fix.

Do you have any pointers on how to handle this problem?

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] statistics for array types

2016-03-09 Thread David Steele

On 1/18/16 11:24 AM, Alvaro Herrera wrote:

Alexander Korotkov wrote:


The patch implementing my idea above is attached.

What's the status here?  Jeff, did you have a look at Alexander's
version of your patch?  Tomas, does this patch satisfy your concerns?


This was marked as "needs review" but I have changed it to "waiting for 
author".


Thanks,
-David


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Client Log Output Filtering

2016-03-09 Thread Petr Jelinek

On 03/02/16 05:02, Robert Haas wrote:

On Mon, Feb 1, 2016 at 7:24 PM, David Steele  wrote:


I have attached a patch that adds an ereport() macro to suppress client
output for a single report call (applies cleanly on 1d0c3b3).  I'll also
move it to the next CF.


I don't see any reason not to accept this.



Yes, the idea seems sane.

Looking at the code, this adds a bool hide_from_client to edata which is 
not initialized in errstart, so that needs to be fixed.
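
For illustration, the kind of one-line fix meant here, assuming the flag
is initialized together with the other per-report fields in errstart (the
field name is the patch's):

	/* default: do not suppress client output for this report */
	edata->hide_from_client = false;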


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Improving replay of XLOG_BTREE_VACUUM records

2016-03-09 Thread David Steele

On 1/8/16 9:34 AM, Alvaro Herrera wrote:

Simon Riggs wrote:

On 8 January 2016 at 13:36, Alvaro Herrera  wrote:

I would agree except for the observation on toast indexes.  I think
that's an important enough use case that perhaps we should have both.

The exclusion of toast indexes is something we can remove also, I have
recently discovered. When we access toast data we ignore MVCC, but we still
have the toast pointer and chunkid to use for rechecking our scan results.
So a later patch will add some rechecks.

Ah, interesting, glad to hear.  I take it you're pushing your patch
soon, then?


ISTM that this patch should be "returned with feedback" or "rejected" 
based on the thread.  I'm marking it "waiting for author" for the time 
being.


Thanks,

--
-David
da...@pgmasters.net



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reworks of CustomScan serialization/deserialization

2016-03-09 Thread Petr Jelinek
Hi,

On 29/02/16 13:07, Kouhei Kaigai wrote:
> 
> I'd like to adjust a few of custom-scan interface prior to v9.6 freeze.
> 
> The major point is serialization/deserialization mechanism.
> Now, extension has to give LibraryName and SymbolName to reproduce
> same CustomScanMethods on the background worker process side. Indeed,
> it is sufficient information to pull the table of function pointers.
> 
> On the other hands, we now have different mechanism to wrap private
> information - extensible node. It requires extensions to register its
> ExtensibleNodeMethods identified by name, usually, on _PG_init() time.
> It is also reasonable way to reproduce same objects on background
> worker side.
> 
> However, mixture of two different ways is not good. My preference is
> what extensible-node is doing rather than what custom-scan is currently
> doing.
> The attached patch allows extension to register CustomScanMethods once,
> then readFunc.c can pull this table by CustomName in string form.
> 

Agreed, but this will break compatibility right?

> I'm not 100% certain whether "nodes/custom-apis.h" is the best location,
> but we may need somewhere other than the primitive header files to put
> these declarations.

custom-apis.c does not sound like the right name to me; maybe it can be just
custom.c, but custom.h might be a bit too generic - maybe custom_node.h.

I am not sure I like the fact that we have this EXTNODENAME_MAX_LEN and
now CUSTOM_NAME_MAX_LEN with the same length, and they are both the same
length as NAMEDATALEN. I wonder if this shouldn't be somehow squished
into fewer defines.

Also in RegisterCustomScanMethods
+   Assert(strlen(methods->CustomName) <= CUSTOM_NAME_MAX_LEN);

Shouldn't this be actually "if" with ereport() considering this is
public API and extensions can pass anything there? (for that matter same
is true for RegisterExtensibleNodeMethods but that's already committed).

Other than that this seems like straight conversion to same basic
template as extensible nodes so I think it's ok.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: Generic WAL logical messages

2016-03-09 Thread Petr Jelinek

On 08/03/16 21:21, Artur Zakirov wrote:

I think here


+const char *
+logicalmsg_identify(uint8 info)
+{
+	if (info & ~XLR_INFO_MASK == XLOG_LOGICAL_MESSAGE)
+		return "MESSAGE";
+
+	return NULL;
+}


we should use brackets, because == binds more tightly than & in C:

const char *
logicalmsg_identify(uint8 info)
{
 if ((info & ~XLR_INFO_MASK) == XLOG_LOGICAL_MESSAGE)
 return "MESSAGE";

 return NULL;
}



Correct, fixed, thanks.

I also rebased this as there was conflict after the fixes to logical 
decoding by Andres.



--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From f65b4d858067af520e853cafb4fdfd11b6b3fdc0 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Wed, 24 Feb 2016 17:02:36 +0100
Subject: [PATCH] Logical Decoding Messages

---
 contrib/test_decoding/Makefile  |  2 +-
 contrib/test_decoding/expected/ddl.out  | 21 --
 contrib/test_decoding/expected/messages.out | 56 
 contrib/test_decoding/sql/ddl.sql   |  3 +-
 contrib/test_decoding/sql/messages.sql  | 17 +
 contrib/test_decoding/test_decoding.c   | 19 ++
 doc/src/sgml/func.sgml  | 45 +
 doc/src/sgml/logicaldecoding.sgml   | 37 +++
 src/backend/access/rmgrdesc/Makefile|  4 +-
 src/backend/access/rmgrdesc/logicalmsgdesc.c| 41 
 src/backend/access/transam/rmgr.c   |  1 +
 src/backend/replication/logical/Makefile|  2 +-
 src/backend/replication/logical/decode.c| 49 ++
 src/backend/replication/logical/logical.c   | 38 +++
 src/backend/replication/logical/logicalfuncs.c  | 27 
 src/backend/replication/logical/message.c   | 87 +
 src/backend/replication/logical/reorderbuffer.c | 67 +++
 src/backend/replication/logical/snapbuild.c | 19 ++
 src/bin/pg_xlogdump/rmgrdesc.c  |  1 +
 src/include/access/rmgrlist.h   |  1 +
 src/include/catalog/pg_proc.h   |  4 ++
 src/include/replication/logicalfuncs.h  |  2 +
 src/include/replication/message.h   | 41 
 src/include/replication/output_plugin.h | 13 
 src/include/replication/reorderbuffer.h | 21 ++
 src/include/replication/snapbuild.h |  2 +
 26 files changed, 608 insertions(+), 12 deletions(-)
 create mode 100644 contrib/test_decoding/expected/messages.out
 create mode 100644 contrib/test_decoding/sql/messages.sql
 create mode 100644 src/backend/access/rmgrdesc/logicalmsgdesc.c
 create mode 100644 src/backend/replication/logical/message.c
 create mode 100644 src/include/replication/message.h

diff --git a/contrib/test_decoding/Makefile b/contrib/test_decoding/Makefile
index 200c43e..7568531 100644
--- a/contrib/test_decoding/Makefile
+++ b/contrib/test_decoding/Makefile
@@ -38,7 +38,7 @@ submake-test_decoding:
 	$(MAKE) -C $(top_builddir)/contrib/test_decoding
 
 REGRESSCHECKS=ddl xact rewrite toast permissions decoding_in_xact \
-	decoding_into_rel binary prepared replorigin time
+	decoding_into_rel binary prepared replorigin time messages
 
 regresscheck: | submake-regress submake-test_decoding temp-install
 	$(MKDIR_P) regression_output
diff --git a/contrib/test_decoding/expected/ddl.out b/contrib/test_decoding/expected/ddl.out
index 77719e8..32cd24d 100644
--- a/contrib/test_decoding/expected/ddl.out
+++ b/contrib/test_decoding/expected/ddl.out
@@ -220,11 +220,17 @@ SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'inc
 (7 rows)
 
 /*
- * check that disk spooling works
+ * check that disk spooling works (also for logical messages)
  */
 BEGIN;
 CREATE TABLE tr_etoomuch (id serial primary key, data int);
 INSERT INTO tr_etoomuch(data) SELECT g.i FROM generate_series(1, 10234) g(i);
+SELECT 'tx logical msg' FROM pg_logical_emit_message(true, 'test', 'tx logical msg');
+    ?column?    
+----------------
+ tx logical msg
+(1 row)
+
 DELETE FROM tr_etoomuch WHERE id < 5000;
 UPDATE tr_etoomuch SET data = - data WHERE id > 5000;
 COMMIT;
@@ -233,12 +239,13 @@ SELECT count(*), min(data), max(data)
 FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1')
 GROUP BY substring(data, 1, 24)
 ORDER BY 1,2;
- count |   min   |  max   
+-+
- 1 | BEGIN   | BEGIN
- 1 | COMMIT  | COMMIT
- 20467 | table public.tr_etoomuch: DELETE: id[integer]:1 | table public.tr_etoomuch: UPDATE: id[integer]: data[integer]:-
-(3 rows)
+ count |  

[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-09 Thread Haribabu Kommi
On Thu, Mar 10, 2016 at 5:30 AM, Robert Haas  wrote:
> On Wed, Mar 9, 2016 at 7:16 AM, Haribabu Kommi  
> wrote:
>> On Wed, Mar 9, 2016 at 10:06 PM, Amit Kapila  wrote:
>>> On Wed, Mar 9, 2016 at 11:46 AM, Haribabu Kommi 
>>> wrote:


 I tried replacing the random() with PostmaterRandom() for a test and it
 worked.
 This is generating different random values, so the issue is not occurring.

 "Global/PostgreSQL.2115609797"

 I feel, we should add the the data directory path + the random number to
 generate the name for dynamic shared memory, this can fix problem.

>>>
>>> As mentioned above, I think if we can investigate why this error is
>>> generated, that will be helpful.  Currently the code ensures that if the
>>> segment already exists, it should retry to create a segment with other name
>>> (refer dsm_impl_windows()), so the point of investigation is, why it is not
>>> going via that path?  I am guessing due to some reason CreateFileMapping()
>>> is returning NULL in this case whereas ideally it should return the existing
>>> handle with an error ERROR_ALREADY_EXISTS.
>>
>> DEBUG:  mapped win32 error code 5 to 13
>>
>> Yes, the CreateFileMapping() is returning NULL with an error of
>> ERROR_ACCESS_DENIED.
>> I am not able to find the reason for this error. The error occurs only
>> when PostgreSQL is started as a service.
>
> Another question is: why are both postmasters returning the same
> random number?  That's not very, uh, random.

The random number is generated by our own implementation of the random
function, which internally calls pg_lrand48 to get the random value. That
function generates the random number from a specified seed using
pre-defined calculations, and because of that the same random number gets
generated every time.

On Linux, the random function comes from glibc. There, too, the same
random number is generated first, but if that number is already used by
some process, a different random number gets generated for the next
PostgreSQL process.
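
To illustrate the direction being discussed - making the segment name
depend on more than the deterministic random() stream - here is a
hypothetical sketch. The name pattern follows the "Global/PostgreSQL.%u"
form quoted above; mixing in the process ID is one possible variant of
the proposals, not the actual patch:

	char		name[64];

	/*
	 * Qualify the random value with a per-process component so that two
	 * postmasters seeded identically cannot produce colliding names.
	 */
	snprintf(name, sizeof(name), "Global/PostgreSQL.%lu.%u",
			 (unsigned long) GetCurrentProcessId(), (unsigned) random());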

Regards,
Hari Babu
Fujitsu Australia


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Timeline following for logical slots

2016-03-09 Thread Petr Jelinek

On 04/03/16 17:08, Craig Ringer wrote:

I'd really appreciate some review of the logic there by people who know
timelines well and preferably know the xlogreader. It's really just one
function and 2/3 comments; the code is simple but the reasoning leading
to it is not.



I think this will have to be work for a committer at this point. I can't 
find any flaws in the logic myself, so unless somebody protests I am 
going to mark this as ready for committer.



I've also attached an updated version of the tests posted a few days
ago.  The tests depend on the remaining patches from the TAP
enhancements tree so it's easiest to just get the whole tree from
https://github.com/2ndQuadrant/postgres/tree/dev/logical-decoding-timeline-following
(subject to regular rebases and force pushes, do not use as a base).

The tests now include a test module that exposes some slot guts to SQL
to allow the client to sync slot state from master to replica(s) without
needing failover slots and the use of extra WAL as transport. It's very
much for-testing-only.

The new test module is used by a second round of tests to demonstrate
the practicality of failover of a logical replication client to a
physical replica using a base backup taken by pg_basebackup and without
the presence of failover slots. I won't pretend it's pretty.



Well for testing purposes it's quite fine I think. The TAP framework 
enhancements needed for this are now in and it works correctly against 
current master.


--
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Implement failover on libpq connect level.

2016-03-09 Thread David Steele

Hi Victor,

On 2/1/16 5:05 PM, Alvaro Herrera wrote:

Victor Wagner wrote:

On Fri, 22 Jan 2016 16:36:15 -0300
Alvaro Herrera  wrote:


You're editing the expected file for the libpq-regress thingy, but you
haven't added any new lines to test the new capability.  I think it'd
be good to add some there.  (I already said this earlier in the
thread; is there any reason you ignored it the first time?)

I seriously doubt that this program can be used to test new
capabilities.

All it does is call PQconninfoParse and then examine some fields of
the PGconn structure.

Ah, you're right, I didn't remember that.


If I add some new URIs, the only thing I can test is that the comma is
properly copied from the URI to this field, and maybe that some syntax
errors are properly detected.

Yeah, we should do that.


So I think the new functionality needs a different approach to testing.
There should be a test making a real connection to a real temporary
cluster, probably based on the Perl TAP framework which has been actively
used in Postgres recently.

Yes, agreed.  So please have a look at that one and share your opinion
about it.  It'd be useful.

Meanwhile I'm moving the patch to the next commitfest.


There hasn't been any movement on this patch in a while.  Will you have 
new tests ready for review soon?


Thanks,

--
-David
da...@pgmasters.net



Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-09 Thread Alvaro Herrera
Craig Ringer wrote:

> 007 adds PostgresNode support for hot and cold filesystem-level backups
> using pg_start_backup and pg_stop_backup, which will be required for some
> coming tests and are useful by themselves.

Finally, pushed this one after rebasing on top of the changes in the
others.  I think we're done here, at last ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimizer questions

2016-03-09 Thread Tom Lane
Konstantin Knizhnik  writes:
> I think that the best approach is to generate two different paths: 
> original one, when projection is always done before sort and another one 
> with postponed projection of non-trivial columns. Then we compare costs 
> of two paths and choose the best one.
> Unfortunately, I do not understand now how to implement it with existed 
> grouping_planner.
> Do you think that it is possible?

After fooling with this awhile, I don't think it's actually necessary
to do that.  See attached proof-of-concept patch.

Although this patch gets through our regression tests, that's only because
it's conservative about deciding to postpone evaluation; if it tried to
postpone evaluation in a query with window functions, it would fail
miserably, because pull_var_clause doesn't know about window functions.
I think that that's a clear oversight and we should extend it to offer
the same sorts of behaviors as it does for Aggrefs.  But that would be
slightly invasive, there being a dozen or so callers; so I didn't bother
to do it yet pending comments on this patch.

I think it's probably also broken for SRFs in the tlist; we need to work
out what semantics we want for those.  If we postpone any SRF to after
the Sort, we can no longer assume that a query LIMIT enables use of
bounded sort (because the SRF might repeatedly return zero rows).
I don't have a huge problem with that, but I think now would be a good
time to nail down some semantics.

Some other work that would be useful would be to refactor so that the
cost_qual_eval operations aren't so redundant.  But that's just a small
time savings not a question of functionality.

And we'd have to document that this changes the behavior for volatile
functions.  For the better, I think: this will mean that you get
consistent results no matter whether the query is implemented by
indexscan or seqscan-and-sort, which has never been true before.
But it is a change.
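
For a concrete example of the intended win: in a query like
SELECT x, slow_func(x) FROM tab ORDER BY x LIMIT 10, with slow_func a
hypothetical expensive function, postponing evaluation until after the
Sort means slow_func runs for the ten returned rows rather than for
every input row.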

Do people approve of this sort of change in general, or this patch
approach in particular?  Want to bikeshed the specific
when-to-postpone-eval policies implemented here?  Other comments?

regards, tom lane

diff --git a/src/backend/optimizer/plan/planner.c b/src/backend/optimizer/plan/planner.c
index 8937e71..b15fca1 100644
*** a/src/backend/optimizer/plan/planner.c
--- b/src/backend/optimizer/plan/planner.c
*** static RelOptInfo *create_distinct_paths
*** 126,131 
--- 126,132 
  	  RelOptInfo *input_rel);
  static RelOptInfo *create_ordered_paths(PlannerInfo *root,
  	 RelOptInfo *input_rel,
+ 	 PathTarget *target,
  	 double limit_tuples);
  static PathTarget *make_group_input_target(PlannerInfo *root, List *tlist);
  static List *postprocess_setop_tlist(List *new_tlist, List *orig_tlist);
*** static PathTarget *make_window_input_tar
*** 134,139 
--- 135,142 
  		 List *tlist, List *activeWindows);
  static List *make_pathkeys_for_window(PlannerInfo *root, WindowClause *wc,
  		 List *tlist);
+ static PathTarget *make_sort_input_target(PlannerInfo *root,
+ 	   PathTarget *final_target);
  
  
  /*
*** grouping_planner(PlannerInfo *root, bool
*** 1378,1383 
--- 1381,1387 
  	int64		offset_est = 0;
  	int64		count_est = 0;
  	double		limit_tuples = -1.0;
+ 	PathTarget *final_target;
  	RelOptInfo *current_rel;
  	RelOptInfo *final_rel;
  	ListCell   *lc;
*** grouping_planner(PlannerInfo *root, bool
*** 1437,1442 
--- 1441,1449 
  		/* Save aside the final decorated tlist */
  		root->processed_tlist = tlist;
  
+ 		/* Also extract the PathTarget form of the setop result tlist */
+ 		final_target = current_rel->cheapest_total_path->pathtarget;
+ 
  		/*
  		 * Can't handle FOR [KEY] UPDATE/SHARE here (parser should have
  		 * checked already, but let's make sure).
*** grouping_planner(PlannerInfo *root, bool
*** 1461,1467 
  	else
  	{
  		/* No set operations, do regular planning */
! 		PathTarget *final_target;
  		PathTarget *grouping_target;
  		PathTarget *scanjoin_target;
  		bool		have_grouping;
--- 1468,1474 
  	else
  	{
  		/* No set operations, do regular planning */
! 		PathTarget *sort_input_target;
  		PathTarget *grouping_target;
  		PathTarget *scanjoin_target;
  		bool		have_grouping;
*** grouping_planner(PlannerInfo *root, bool
*** 1657,1678 
  		final_target = create_pathtarget(root, tlist);
  
  		/*
  		 * If we have window functions to deal with, the output from any
  		 * grouping step needs to be what the window functions want;
! 		 * otherwise, it should just be final_target.
  		 */
  		if (activeWindows)
  			grouping_target = make_window_input_target(root,
  	   tlist,
  	   activeWindows);
  		else
! 			grouping_target = final_target;
  
  		/*
  		 * If we have grouping or 

Re: [HACKERS] HINTing on UPDATE foo SET foo.bar = ..;

2016-03-09 Thread David Steele

On 12/23/15 9:15 PM, Michael Paquier wrote:

On Mon, Dec 7, 2015 at 9:14 PM, Michael Paquier
 wrote:

On Wed, Sep 2, 2015 at 6:19 AM, Marko Tiikkaja  wrote:

Hopefully nobody minds if I slip this to the commit fest that started
today?  The attached patch should address all the comments from the 9.5
cycle.

All the previous comments have been addressed. Perhaps some regression tests
would have some value?

This thread has been stalling for a couple of weeks now. I have marked
it as "returned with feedback". Marko, if you are still working on
this patch, could you add some regression tests and then move it to
the next CF?
This was submitted to the 2016-03 CF but no new patch was supplied and 
there's been no activity on the thread for months.  I'm marking it as 
"returned with feedback."


Marko, if you can address Michael's concerns or supply a new patch I'll be 
happy to open the CF entry again.


Thanks,

--
-David
da...@pgmasters.net



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] TAP / recovery-test fs-level backups, psql enhancements etc

2016-03-09 Thread Alvaro Herrera
Craig Ringer wrote:

> 004 allows filtering on RecursiveCopy by a predicate function. Needed for
> filesystem level backups (007). It could probably be squashed with 007 if
> desired.

I pushed this after some tinkering:

* filtering applies to all directory entries, not just files.  So you
can filter a subdirectory, for example.  If you have symlinks (which you
know this module will bomb out on) you can skip them too.

* The filter sub receives the path relative to the initial source dir,
rather than the absolute path.  That part was just making it difficult
to use; in your POD example you were testing /pg_log/ as a regex, which
skipped the *files* but not the directory itself.  Now you can do simply
"$var ne 'pg_log'".  (Pallavi Sontakke implemented most of this.)

* Your test for "ref $filterfn" allowed any reference to be passed, not
just a code reference.  I didn't test passing some other type of
reference; I assume you'd just get a very ugly error message.

Thanks,

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fun with "Ready for Committer" patches

2016-03-09 Thread Michael Paquier
On Wed, Mar 9, 2016 at 10:19 PM, Andres Freund  wrote:
> On 2016-03-10 06:14:25 +0900, Michael Paquier wrote:
>> IMO, during a review one needs to think of himself as a committer.
>> Once the reviewer switches the patch to "Ready for committer", it
>> means that the last version of the patch present would have been the
>> version that gained the right to be pushed.
>
> And one consideration there is whether you, as the committer, would be
> ok with maintaining this feature going forward.

Yes, that's a key point. Committers do the initial push and the
after-sales service as well.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fun with "Ready for Committer" patches

2016-03-09 Thread Andres Freund
On 2016-03-10 06:14:25 +0900, Michael Paquier wrote:
> IMO, during a review one needs to think of himself as a committer.
> Once the reviewer switches the patch to "Ready for committer", it
> means that the last version of the patch present would have been the
> version that gained the right to be pushed.

And one consideration there is whether you, as the committer, would be
ok with maintaining this feature going forward.

But I think for less experienced reviewers that's hard to judge; and I
think asking everyone to do that raises the barriers to do reviews
considerably.  So I think we should somehow document that it's ok to
mark the patch as such, but that you're not forced to do that if you
don't want to.

Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] fun with "Ready for Committer" patches

2016-03-09 Thread Michael Paquier
On Wed, Mar 9, 2016 at 9:47 PM, Robert Haas  wrote:
> On Tue, Mar 8, 2016 at 9:44 PM, Craig Ringer  wrote:
>> On 9 March 2016 at 07:18, Tatsuo Ishii  wrote:
>>> Many of "needs review" state patches already have reviewer(s). Do you
>>> mean we want more reviewers in addition to them for such patches?
>>
>> Yeah. Personally I'm not too confident about what precisely is required to
>> move a patch from needs-review to ready-for-committer. I've done a chunk of
>> review for a number of patches, but I'm not always confident saying "all
>> clear, proceed".
>
> I think that if you've done your best to review it, and you don't see
> any remaining problems, you should mark it Ready for Committer.

IMO, during a review one needs to think of himself as a committer.
Once the reviewer switches the patch to "Ready for committer", it
means that the last version of the patch present would have been the
version that gained the right to be pushed.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread Tomas Vondra
Hi,


On Wed, 2016-03-09 at 12:16 -0500, Robert Haas wrote:
> On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
>  wrote:
> > On 2016/03/09 10:11, Amit Langote wrote:
> >> The attached revision addresses above and one of Horiguchi-san's comments
> >> in his email yesterday.
> >
> > I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
> > so many versions.
> 
> I've committed 0001 with heavy revisions.  Just because we don't need
> an SQL-visible function to clear the command progress doesn't mean we
> don't need to clear it at all; rather, it has to happen automatically.
> I also did a bunch of identifier renaming, added datid to the view
> output, adjusted the comments, and so on.  Please rebase the remainder
> of the series.  Thanks.

I'm pretty sure this piece of code ends up accessing subscripts above
array bounds (and gcc 4.6.4 complains about that):

#define PG_STAT_GET_PROGRESS_COLS PGSTAT_NUM_PROGRESS_PARAM + 3

...

bool		nulls[PG_STAT_GET_PROGRESS_COLS];

...

nulls[2] = true;
for (i = 1; i < PGSTAT_NUM_PROGRESS_PARAM + 1; i++)
	nulls[i+3] = true;

Now let's say PARAM=10, which means COLS=13. The last index accessed by
the loop will be i=10, which means we'll do this:

nulls[13] = true;

which is above bounds.
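
A sketch of a bounds-correct version, assuming the intent is for the
PGSTAT_NUM_PROGRESS_PARAM parameter columns to occupy indexes 3 ..
PG_STAT_GET_PROGRESS_COLS - 1 after the three fixed columns:

	for (i = 0; i < PGSTAT_NUM_PROGRESS_PARAM; i++)
		nulls[i + 3] = true;	/* last index: PGSTAT_NUM_PROGRESS_PARAM + 2 */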


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-03-09 Thread Tom Lane
Robert Haas  writes:
> On Wed, Mar 9, 2016 at 1:12 PM, Tom Lane  wrote:
>> I hadn't been paying any attention to this thread, but I wonder whether
>> this entire approach isn't considerably inferior to what we can do now
>> with the planner pathification patch.  To quote from the new docs:

> Well, I guess I'm not quite seeing it.  What do you have in mind?
> Just taking a guess here, you might be thinking that instead of
> something like this...

>   Update on public.ft2
> ->  Foreign Update on public.ft2

> We could boil it all the way down to this:

> Foreign Update on public.ft2

Exactly.  I'm not claiming that that would be particularly faster, but
it would eliminate a whole bunch of seriously ugly stuff that's in
this patch.

> But can you, really?  What if the UPDATE is targeting an inheritance
> hierarchy containing some local tables and some remote tables?

I don't really see why that couldn't be made to work.  You'd end up
with ForeignUpdates on the remote tables and a ModifyTable handling
the rest.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-09 Thread Robbie Harwood
David Steele  writes:

> On 3/8/16 5:44 PM, Robbie Harwood wrote:
>> 
>> Here's yet another version of GSSAPI encryption support.  It's also
>> available for viewing on my github:
>
> I got this warning when applying the first patch in the set:
>
> ../other/v6-0001-Move-common-GSSAPI-code-into-its-own-files.patch:245:
> new blank line at EOF.
> +
> warning: 1 line adds whitespace errors.

Hah, so it does.  Thanks for catching it; will fix.

> The build went fine but when testing I was unable to logon at all.  I'm
> using the same methodology as in
> http://www.postgresql.org/message-id/56be0ff9.70...@pgmasters.net except
> that I'm running against 51c0f63 and using the v6 patch set.
>
> psql simply hangs and never returns.  I have attached a pcap of the
> psql/postgres session generated with:
>
> tcpdump -i lo -nnvvXSs 1514 port 5432 -w gssapi.pcap
>
> If you would like me to capture more information please let me know
> specifically how you would like me to capture it.

Thank you for the pcap!  (I'm using wireshark so formats it can open are
greatly appreciated.)  This suggests that the hang is my client code's
fault, but just in case: I assume nothing unusual was logged on the
server?

v6-0002-Connection-encryption-support-for-GSSAPI.patch in fe-connect.c
at around line 2518 adds a call to appendBinaryPQExpBuffer and sets
conn->inEnd.  Can you try without those lines?

Can you also (e.g., with gdb or by adding printf calls) tell me what the
values of conn->inStart, conn->inEnd, and conn->inCursor any time
(should only be once) that those lines are triggered?
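
If gdb is inconvenient, an equivalent printf next to that
appendBinaryPQExpBuffer call would do. A sketch, assuming the usual int
buffer-offset fields of PGconn:

	fprintf(stderr, "inStart=%d inCursor=%d inEnd=%d\n",
			conn->inStart, conn->inCursor, conn->inEnd);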

> I reverted to v5 and got the same behavior I was seeing with v4 and v5,
> namely that I can only logon occasionally and usually get this error:
>
> psql: expected authentication request from server, but received
>
> Using a fresh build from 51c0f63 I can logon reliably every time so I
> don't think there's an issue in my environment.

Agreed, I'm sure I've caused it somehow, though I don't know what's
wrong yet.  (And if it weren't my fault but you didn't get useful errors
out, that'd be my fault anyway for not checking enough stuff!)

I don't know if this would say anything relevant, but it might be
interesting to see what the results are of applying [1] to the v5 code.
It's the same approach to solving the problem, though it happens at a
different time due to the aforementioned protocol change between v5 and
v6.

Thanks,
--Robbie

[1] 
https://github.com/frozencemetery/postgres/commit/82c89227a6b499ac9273044f91cff747c154629f


signature.asc
Description: PGP signature


Re: [HACKERS] pg_ctl promote wait

2016-03-09 Thread Michael Paquier
On Mon, Feb 29, 2016 at 4:29 PM, Michael Paquier
 wrote:
> On Mon, Feb 29, 2016 at 4:28 PM, Michael Paquier
>  wrote:
>> I would suggest using
>> $node_standby->poll_query_until('SELECT pg_is_in_recovery()') to
>> validate the end of the test.
>
> Meh. SELECT NOT pg_is_in_recovery(). This will wait until the query
> returns true.

Here are some comments about 0002

+   if ((fd = open(control_file_path, O_RDONLY | PG_BINARY, 0)) == -1)
+   {
+   fprintf(stderr, _("%s: could not open file \"%s\" for
reading: %s\n"),
+   progname, control_file_path, strerror(errno));
+   exit(1);
+   }
[...]
Most of the logic of get_control_dbstate() mimics the recently-introduced
get_controlfile() in controldata_utils.c from dc7d70e. I think we had
better use that, and emit an error if an incorrect control file is found
while running those pg_ctl commands: the control file present should have
a correct CRC at all times, or something is wrong. So this would lead to
the following logic for get_control_dbstate():
control_file_data = get_controlfile(pg_data, progname);
res = control_file_data->state;
pfree(control_file_data);
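
Spelled out, the suggested function body would be something like this (a
sketch: get_controlfile() is the helper from dc7d70e, and the surrounding
pg_ctl variable names are assumptions):

static DBState
get_control_dbstate(void)
{
	DBState		ret;
	ControlFileData *control_file_data = get_controlfile(pg_data, progname);

	ret = control_file_data->state;
	pfree(control_file_data);
	return ret;
}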

That aside, 0002 is a good thing to have: switching from checking the
presence of recovery.conf to what is in the control file is definitely
more robust. A lot happens between when recovery.conf is renamed to
recovery.done and when WAL is enabled for backends, particularly the
end-of-recovery checkpoint and the cleanup of the WAL segments of the
previous timeline.

And now for 0003...

+$node_standby->command_like(['psql', '-X', '-A', '-t', '-c', 'SELECT
pg_is_in_recovery()'],
+   qr/^t$/,
+   'standby is in recovery');
[...]
+$node_standby->command_like(['psql', '-X', '-A', '-t', '-c', 'SELECT
pg_is_in_recovery()'],
+   qr/^f$/,
+   'promoted standby is not in recovery');
for those two you can use $node_standby->safe_psql to get back a query result.

> The subsequent discussion mentioned that there might still be a window
> between end of waiting and when read-write queries would be accepted.  I
> don't know how big that window would be in practice and would be
> interested in some testing and feedback.

And so... Based on the previous discussion, there is an interval of time
between the moment the control file is updated and the point where
backends are allowed to emit WAL. This interval really worries me: once
pg_ctl exits, client applications should be guaranteed to be able to
connect to the server, but the current patch is not failure-proof, and I
imagine that on CPU-constrained environments in particular this is going
to become unstable. In particular I expect that slow machines are likely
to fail the last test of 003_promote.pl as written (I am away from home
now, so unfortunately I have not been able to test that on my own
machines, but it seems possible): pg_is_in_recovery() is controlled by
SharedRecoveryInProgress, so there can be a window where the control file
status is already DB_IN_PRODUCTION while pg_is_in_recovery() still
returns true. The main factor that can enlarge the window is a large
number of 2PC transactions that need to be loaded back into shared
memory after scanning pg_twophase.

If we are going to have a reliable promote wait mode for pg_ctl, I think
we had better first reduce this window. One thing that could be done is
to update SharedRecoveryInProgress while holding an exclusive lock on
ControlFileLock; see for example the patch attached. That reduces the
window to zero for backends, as long as they refer to the ControlFile in
shared memory under ControlFileLock. For clients, there will still be a
small window during which backends could write WAL while the control file
status on disk is still ARCHIVE_RECOVERY. If we want a reliable promote
wait mode for pg_ctl, I think we had better do something like the
attached first. Thoughts?

Looking at where is used the shared memory structure of ControlFile,
one thing to worry about is CreateRestartPoint but its code paths are
already using ControlFileLock when looking at the status file, so they
are safe with this logic.
-- 
Michael
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 00f139a..590385c 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7371,12 +7371,6 @@ StartupXLOG(void)
 	 */
 	InRecovery = false;
 
-	LWLockAcquire(ControlFileLock, LW_EXCLUSIVE);
-	ControlFile->state = DB_IN_PRODUCTION;
-	ControlFile->time = (pg_time_t) time(NULL);
-	UpdateControlFile();
-	LWLockRelease(ControlFileLock);
-
 	/* start the archive_timeout timer running */
 	XLogCtl->lastSegSwitchTime = (pg_time_t) time(NULL);
 
@@ -7434,15 +7428,32 @@ StartupXLOG(void)
 	

Re: [HACKERS] Recovery test failure for recovery_min_apply_delay on hamster

2016-03-09 Thread Michael Paquier
On Wed, Mar 9, 2016 at 12:29 PM, Alvaro Herrera
 wrote:
> Michael Paquier wrote:
>> After sleeping (best debugger ever) on that, actually a way popped up
>> in my mind, and I propose the attached, which refactors a bit 005 and
>> checks that the LSN position of master has been applied on standby
>> after at least the delay wanted. A maximum delay of 90s is authorized,
>> like poll_query_until.
>
> Hmm, okay, that's great.  A question: what happens if the test itself is
> slow and the servers are fast, and the test doesn't manage to run two
> iterations before the two seconds have elapsed?  This may happen on
> overloaded or slow servers, if you're unlucky.

Yes, a failure would happen. The same thought occurred to me during a
long flight. And this is why the previous patch was full of meh.

> I don't have any ideas on ensuring that we don't apply earlier than the
> given period at the moment.

Attached is one, based on timestamp values queried from the standby
server. We could just as well use Perl's localtime call to calculate the
time delay.
-- 
Michael
diff --git a/src/test/recovery/t/005_replay_delay.pl b/src/test/recovery/t/005_replay_delay.pl
index 986851b..74d3c91 100644
--- a/src/test/recovery/t/005_replay_delay.pl
+++ b/src/test/recovery/t/005_replay_delay.pl
@@ -3,7 +3,7 @@ use strict;
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 2;
+use Test::More tests => 1;
 
 # Initialize master node
 my $node_master = get_new_node('master');
@@ -32,18 +32,40 @@ $node_standby->start;
 # depending on the delay of 2s applied above.
 $node_master->safe_psql('postgres',
 	"INSERT INTO tab_int VALUES (generate_series(11,20))");
-sleep 1;
 
-# Here we should have only 10 rows
-my $result = $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int");
-is($result, qq(10), 'check content with delay of 1s');
-
-# Now wait for replay to complete on standby
+# Now wait for replay to complete on standby. We check if the current
+# LSN of master has been applied after at least 2s. This uses timestamps
+# to be sure that at least the delay time has passed, as on slow machines
+# it may take more than the delay time to reach the first attempt loop.
 my $until_lsn =
   $node_master->safe_psql('postgres', "SELECT pg_current_xlog_location();");
-my $caughtup_query =
-  "SELECT '$until_lsn'::pg_lsn <= pg_last_xlog_replay_location()";
-$node_standby->poll_query_until('postgres', $caughtup_query)
-  or die "Timed out while waiting for standby to catch up";
-$result = $node_standby->safe_psql('postgres', "SELECT count(*) FROM tab_int");
-is($result, qq(20), 'check content with delay of 2s');
+my $standby_insert_time =
+  $node_standby->safe_psql('postgres', "SELECT now();");
+
+my $max_attempts = 90;
+my $attempts = 0;
+my $delay_status = 0;
+while ($attempts < $max_attempts)
+{
+	my $replay_status =
+	  $node_standby->safe_psql('postgres',
+		"SELECT (pg_last_xlog_replay_location() - '$until_lsn'::pg_lsn) >= 0;");
+	my $delay_applied =
+	  $node_standby->safe_psql('postgres',
+		"SELECT now() - '$standby_insert_time'::timestamptz >= interval '2s';");
+	$delay_status = $delay_applied eq 't' ? 1 : 0;
+
+	# Leave now if standby has replayed at least up to the point of
+	# master. It's time to see if data has been applied after the wanted
+	# delay.
+	last if $replay_status eq 't';
+
+	sleep 1;
+	$attempts++;
+}
+
+die "Maximum number of attempts reached" if $attempts >= $max_attempts;
+
+# This test is valid only if LSN has been applied with at least
+# the minimum apply delay expected.
+ok($delay_status, 'Check if WAL is applied on standby after delay of 2s');

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Alexander Korotkov
On Wed, Mar 9, 2016 at 5:47 PM, Tom Lane  wrote:

> Alexander Korotkov  writes:
> > I have a question about Sort path. AFAICS this question wasn't mentioned
> in
> > the upthread discussion.
> > We're producing Sort plans in two ways: from explicit Sort paths, and
> from
> > other paths which implicitly assumes sorting (like MergeJoin path).
> > Would it be better to produce Sort plan only from explicit Sort path?
> Thus,
> > MergeJoin path would add explicit children Sort paths. That would be more
> > unified way.
>
> Meh.  I think the only real effect that would have is to make it slower to
> build MergeJoin paths (and in a typical join problem, we build a lot of
> those).  The entire point of the Path/Plan distinction is to postpone
> until createplan.c any work that's not necessary to arrive at a cost
> estimate.  So the way MergeJoinPath works now seems fine to me.
>
> > I ask about this from point of view of my "Partial Sort" patch. The
> absence
> > of implicit sorts would help to make this patch more simple and clean.
>
> There are other implicit sorts besides those.  Admittedly, the efficiency
> argument for not making the sort explicit in UniquePath or MergeAppendPath
> is a lot weaker than it is for MergeJoin, just because those are less
> common path types.  But getting rid of the behavior entirely would be
> a lot of work, and I'm not convinced it'd be an improvement.
>

Thank you for the explanation. I'll try to rebase "Partial Sort" leaving
this situation as is.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Alter or rename enum value

2016-03-09 Thread Matthias Kurz
Besides not being able to rename enum values, there are two other
limitations regarding enums which it would be nice to finally get fixed:
1) There is also no possibility to drop a value.

2) Quoting the docs (
http://www.postgresql.org/docs/9.5/static/sql-altertype.html):
"ALTER TYPE ... ADD VALUE (the form that adds a new value to an enum type)
cannot be executed inside a transaction block." Example:
# CREATE TYPE bogus AS ENUM('good');
CREATE TYPE
# BEGIN;
BEGIN
# ALTER TYPE bogus ADD VALUE 'bad';
ERROR:  ALTER TYPE ... ADD cannot run inside a transaction block

To summarize: for enums to finally be really usable it would be nice if
we had the following (or similar):
ALTER TYPE name DROP VALUE [ IF EXISTS ] enum_value
and
ALTER TYPE name RENAME VALUE [ IF EXISTS ] old_enum_value_name TO
new_enum_value_name

And all of the operations (adding, renaming, dropping) should also work
when done within a new transaction on an enum that existed before that
transaction.

I did some digging and maybe the following commits are useful in this context:
7b90469b71761d240bf5efe3ad5bbd228429278e
c9e2e2db5c2090a880028fd8c1debff474640f50

Also there are these discussions where some of the messages contain some
useful information:
http://www.postgresql.org/message-id/29f36c7c98ab09499b1a209d48eaa615b7653db...@mail2a.alliedtesting.com
http://www.postgresql.org/message-id/50324f26.3090...@dunslane.net
http://www.postgresql.org/message-id/20130819122938.gb8...@alap2.anarazel.de

Also have a look at this workaround:
http://en.dklab.ru/lib/dklab_postgresql_enum/

How high is the chance that, given the above information, someone will
tackle these 3 issues/requests in the near future? It seems there have
been some internal changes since the introduction of enums in 8.x, so
maybe these changes wouldn't be that disruptive anymore?

Regards,
Matthias

On 9 March 2016 at 18:13, Tom Lane  wrote:

> Andrew Dunstan  writes:
> > On 03/09/2016 11:07 AM, Tom Lane wrote:
> >> I have a vague recollection that we discussed this at the time the enum
> >> stuff went in, and there are concurrency issues?  Don't recall details
> >> though.
>
> > Rings a vague bell, but should it be any worse than adding new labels?
>
> I think what I was recalling is the hazards discussed in the comments for
> RenumberEnumType.  However, the problem there is that a backend could make
> inconsistent ordering decisions due to seeing two different pg_enum rows
> under different snapshots.  Updating a single row to change its name
> doesn't seem to have a comparable hazard, and it wouldn't affect ordering
> anyway.  So it's probably no worse than any other object-rename situation.
>
> regards, tom lane
>


Re: [HACKERS] pgbench small bug fix

2016-03-09 Thread Fabien COELHO



OK, I've committed the fix for the -T part.  It didn't back-patch
cleanly, and it is a minor bug, so I'm not inclined to worry about it
further.


I agree that it is a very minor bug and not necessarily worth back-patching.


I didn't commit the fix for the -P part, because Alvaro objected to
the proposed method of fixing it as ugly, and I think he's right.
Unless you can come up with a nicer-looking fix, I think that part is
going to stay unfixed.


A bug kept on aesthetic grounds, that's a first!

--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] extend pgbench expressions with functions

2016-03-09 Thread Fabien COELHO


Hello Robert,

[...] With your patch, you get different behavior depending on exactly 
how the input is malformed.


I understand that you require only one possible error message on malformed 
input, instead of failing when converting to double if the input looked 
like a double (there was a '.' clue in it, but that is not proof), or 
something else if it was assumed to be an int.


So I'm going to assume that you do not like the type guessing.

I'm not going to commit it this way, and frankly, neither is anyone 
else.


Hmmm. Probably my unconscious self is trying to reach 42.

Here is a v36 which inspects very carefully the string to decide whether 
it is an int or a double. You may, or may not, find it to your taste, I 
can't say.
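
To sketch what "inspecting very carefully" can look like - an
illustration of the approach, not the patch's actual code - a
full-consumption check with strtoll before falling back to strtod:

#include <errno.h>
#include <stdlib.h>

typedef enum { IS_INT, IS_DOUBLE, IS_MALFORMED } number_kind;

static number_kind
classify_number(const char *s)
{
	char	   *end;

	errno = 0;
	(void) strtoll(s, &end, 10);
	if (errno == 0 && end != s && *end == '\0')
		return IS_INT;			/* fully consumed as an integer */

	errno = 0;
	(void) strtod(s, &end);
	if (errno == 0 && end != s && *end == '\0')
		return IS_DOUBLE;		/* fully consumed as a double */

	return IS_MALFORMED;		/* a single error path for any bad input */
}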


--
Fabien.
diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index cc80b3f..2133bf7 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -794,9 +794,10 @@ pgbench  options  dbname
 
 
  
-  Sets variable varname to an integer value calculated
+  Sets variable varname to a value calculated
   from expression.
   The expression may contain integer constants such as 5432,
+  double constants such as 3.14159,
   references to variables :variablename,
   and expressions composed of unary (-) or binary operators
   (+, -, *, /,
@@ -809,7 +810,7 @@ pgbench  options  dbname
   Examples:
 
 \set ntellers 10 * :scale
-\set aid (1021 * :aid) % (10 * :scale) + 1
+\set aid (1021 * random(1, 10 * :scale)) % (10 * :scale) + 1
 
 

@@ -829,66 +830,35 @@ pgbench  options  dbname
  
 
  
-  By default, or when uniform is specified, all values in the
-  range are drawn with equal probability.  Specifying gaussian
-  or  exponential options modifies this behavior; each
-  requires a mandatory parameter which determines the precise shape of the
-  distribution.
- 
+  
+   
+
+ \setrandom n 1 10 or \setrandom n 1 10 uniform
+ is equivalent to \set n random(1, 10) and uses a uniform
+ distribution.
+
+   
 
- 
-  For a Gaussian distribution, the interval is mapped onto a standard
-  normal distribution (the classical bell-shaped Gaussian curve) truncated
-  at -parameter on the left and +parameter
-  on the right.
-  Values in the middle of the interval are more likely to be drawn.
-  To be precise, if PHI(x) is the cumulative distribution
-  function of the standard normal distribution, with mean mu
-  defined as (max + min) / 2.0, with
-
- f(x) = PHI(2.0 * parameter * (x - mu) / (max - min + 1)) /
-(2.0 * PHI(parameter) - 1.0)
-
-  then value i between min and
-  max inclusive is drawn with probability:
-  f(i + 0.5) - f(i - 0.5).
-  Intuitively, the larger parameter, the more
-  frequently values close to the middle of the interval are drawn, and the
-  less frequently values close to the min and
-  max bounds. About 67% of values are drawn from the
-  middle 1.0 / parameter, that is a relative
-  0.5 / parameter around the mean, and 95% in the middle
-  2.0 / parameter, that is a relative
-  1.0 / parameter around the mean; for instance, if
-  parameter is 4.0, 67% of values are drawn from the
-  middle quarter (1.0 / 4.0) of the interval (i.e. from
-  3.0 / 8.0 to 5.0 / 8.0) and 95% from
-  the middle half (2.0 / 4.0) of the interval (second and
-  third quartiles). The minimum parameter is 2.0 for
-  performance of the Box-Muller transform.
- 
+  
+   
+\setrandom n 1 10 exponential 3.0 is equivalent to
+\set n random_exponential(1, 10, 3.0) and uses an
+exponential distribution.
+   
+  
 
- 
-  For an exponential distribution, parameter
-  controls the distribution by truncating a quickly-decreasing
-  exponential distribution at parameter, and then
-  projecting onto integers between the bounds.
-  To be precise, with
-
-f(x) = exp(-parameter * (x - min) / (max - min + 1)) / (1.0 - exp(-parameter))
-
-  Then value i between min and
-  max inclusive is drawn with probability:
-  f(x) - f(x + 1).
-  Intuitively, the larger parameter, the more
-  frequently values close to min are accessed, and the
-  less frequently values close to max are accessed.
-  The closer to 0 parameter, the flatter (more uniform)
-  the access distribution.
-  A crude approximation of the distribution is that the most frequent 1%
-  values in the range, close to min, are drawn
-  parameter% of the time.
-  parameter value must be strictly positive.
+  
+   
+\setrandom n 1 10 gaussian 2.0 is equivalent to
+\set n random_gaussian(1, 10, 2.0), and uses a gaussian
+distribution.
+   
+  
+ 
+
+   See the documentation of these functions below for 

Re: [HACKERS] Freeze avoidance of very large table.

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 9:09 AM, Masahiko Sawada
 wrote:
> Attached latest 2 patches.
> * 000 patch : Incorporated the review comments and made rewriting
> logic more clearly.

That's better, thanks.  But your comments don't survive pgindent.
After running pgindent, I get this:

+   /*
+* These old_* variables point to old visibility map page.
+*
+* cur_old: Points to current position on old
page. blkend_old :
+* Points to end of old block. break_old  : Points to
old page break
+* position for rewriting a new page. After wrote a
new page, old_end
+* proceeds rewriteVmBytesPerPgae bytes.
+*/

You need to either surround this sort of thing with dashes to make
pgindent ignore it, or, probably better, rewrite it using complete
sentences that together form a paragraph.
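
For instance, a dash-fenced version that pgindent leaves alone could look
like this (contents taken from the comment quoted above):

	/*----------
	 * cur_old:    current position on the old page
	 * blkend_old: end of the old block
	 * break_old:  old-page break position for rewriting a new page
	 *----------
	 */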

+   Oid pg_database_oid;	/* OID of pg_database relation */

Not used anywhere?

Instead of vm_need_rewrite, how about vm_must_add_frozenbit?

Can you explain the changes to test.sh?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Re: [HACKERS] Re: [HACKERS] Re: [HACKERS] Windows service is not starting so there’s message in log: FATAL: "could not create shared memory segment “Global/PostgreSQL.851401618”: Permission

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 7:16 AM, Haribabu Kommi  wrote:
> On Wed, Mar 9, 2016 at 10:06 PM, Amit Kapila  wrote:
>> On Wed, Mar 9, 2016 at 11:46 AM, Haribabu Kommi 
>> wrote:
>>>
>>>
>>> I tried replacing the random() with PostmaterRandom() for a test and it
>>> worked.
>>> This is generating different random values, so the issue is not occurring.
>>>
>>> "Global/PostgreSQL.2115609797"
>>>
>>> I feel we should add the data directory path + the random number to
>>> generate the name for dynamic shared memory; this can fix the problem.
>>>
>>
>> As mentioned above, I think if we can investigate why this error is
>> generated, that will be helpful.  Currently the code ensures that if the
>> segment already exists, it should retry to create a segment with other name
>> (refer dsm_impl_windows()), so the point of investigation is, why it is not
>> going via that path?  I am guessing due to some reason CreateFileMapping()
>> is returning NULL in this case whereas ideally it should return the existing
>> handle with an error ERROR_ALREADY_EXISTS.
>
> DEBUG:  mapped win32 error code 5 to 13
>
> Yes, the CreateFileMapping() is returning NULL with an error of
> ERROR_ACCESS_DENIED.
> I am not able to find the reason for this error. The error occurs only
> when PostgreSQL is started as a service.

Another question is: why are both postmasters returning the same
random number?  That's not very, uh, random.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 12:33 PM, Tom Lane  wrote:
> Alvaro Herrera  writes:
>> Tom Lane wrote:
>>> Yeah, fixed.  I had assumed that the existing coding in create_gather_plan
>>> was OK, because it looked like it was right for a non-projecting node.
>>> But actually Gather can project (why though?), so it's not right.
>
>> This looks related:
>> https://www.postgresql.org/message-id/CA%2BTgmoai9Ruhwk61110rY4cNAzoO6npsFEOaEKjM7Zz8i3evHg%40mail.gmail.com
>
> Ah, thanks for remembering that thread; I'd forgotten it.
>
> Gather is a bit weird, because although it can project (and needs to,
> per the example of needing to compute a non-parallel-safe function),
> you would rather push down as much work as possible to the child node;
> and doing so is semantically OK for parallel-safe functions.  (Pushing
> functions down past a Sort node, for a counterexample, is not so OK
> if you are concerned about function evaluation order, or even number
> of executions.)
>
> In the current code structure it would perhaps be reasonable to teach
> apply_projection_to_path about that --- although this would require
> logic to separate parallel-safe and non-parallel-safe subexpressions,
> which doesn't quite seem like something apply_projection_to_path
> should be doing.

I think for v1 it would be fine to make this all-or-nothing; that's
what I had in mind to do.  That is, if the entire tlist is
parallel-safe, push it all down.  If not, let the workers just return
the necessary Vars and have Gather compute the final tlist.  Now,
obviously one could do better.  If the tlist contains several
expensive functions that are parallel-safe and one inexpensive
function that isn't, you'd obviously prefer to compute the
parallel-safe stuff in the workers and just compute the one
inexpensive thing in the leader.  But that's significantly more
complicated and I'm not sure it's worth spending a lot of time getting
it right just now.  Not that I'm complaining if you feel the urge.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 1:12 PM, Tom Lane  wrote:
> Robert Haas  writes:
>> Overall, I think this is looking pretty good.
>
> I hadn't been paying any attention to this thread, but I wonder whether
> this entire approach isn't considerably inferior to what we can do now
> with the planner pathification patch.  To quote from the new docs:
>
>PlanForeignModify and the other callbacks described in Section 54.2.3
>are designed around the assumption that the foreign relation will be
>scanned in the usual way and then individual row updates will be driven
>by a local ModifyTable plan node. This approach is necessary for the
>general case where an update requires reading local tables as well as
>foreign tables. However, if the operation could be executed entirely by
>the foreign server, the FDW could generate a path representing that and
>insert it into the UPPERREL_FINAL upper relation, where it would
>compete against the ModifyTable approach. This approach could also be
>used to implement remote SELECT FOR UPDATE, rather than using the row
>locking callbacks described in Section 54.2.4. Keep in mind that a path
>inserted into UPPERREL_FINAL is responsible for implementing all
>behavior of the query.
>
> I don't really see anything that this patch does that wouldn't be better
> done that way.  And I'd kind of like to get a working example of that type
> of path insertion into 9.6, so that we can find out if we need any hooks
> or callbacks that aren't there today.

Well, I guess I'm not quite seeing it.  What do you have in mind?
Just taking a guess here, you might be thinking that instead of
something like this...

  Update on public.ft2
->  Foreign Update on public.ft2

We could boil it all the way down to this:

Foreign Update on public.ft2

But can you, really?  What if the UPDATE is targeting an inheritance
hierarchy containing some local tables and some remote tables?

Apologies if I've completely misunderstood what you have in mind here,
but you haven't really explained it.

IMHO, if you want to do something really cool with the new
pathification stuff, the thing to do would be pushing down aggregates
to foreign servers.  A lot of people would be really happy to see
SELECT count(*) FROM ft ship the count operation to the remote side!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v14

2016-03-09 Thread Tomas Vondra
On Wed, 2016-03-09 at 18:21 +0100, Tomas Vondra wrote:
> Hi,
> 
> On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
> > On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
> >  wrote:
> > > Hi,
> > >
> > > thanks for the feedback. Attached is v14 of the patch series, fixing
> > > most of the points you've raised.
> > 
> > 
> > Hi Tomas,
> > 
> > Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
> > check if I configure without --enable-cassert.
> 
> Ah, after disabling asserts I can reproduce it too. And the reason why
> it fails is quite simple - clauselist_selectivity modifies the original
> list of clauses, which then confuses cost_qual_eval.

More precisely, it gets confused because the first clause in the list
gets deleted but cost_qual_eval never learns about that, and follows a
stale pointer to the next cell - hence the segfault.
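
In other words, something like this hypothetical fragment - not the
actual code paths, just the failure mode:

static void
stale_cell_demo(Node *clause_a, Node *clause_b)
{
    List       *clauses = list_make2(clause_a, clause_b);
    ListCell   *lc = list_head(clauses);    /* caller remembers this cell */

    clauses = list_delete_first(clauses);   /* callee frees that very cell */
    lc = lnext(lc);                         /* stale pointer, undefined */
    (void) lc;
}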

> 
> Can you try if the attached patch fixes the issue? I'll need to rework a
> bit more of the code, but let's see if this fixes the issue on your
> machine too.
> 
> > With --enable-cassert, it passes the regression test.
> 
> I wonder how it can work with casserts and fail without them. That's
> kinda exactly the opposite of what I'd expect ...

FWIW it seems to be somehow related to this assert in clausesel.c:

   Assert(count_mv_attnums(list_union(stat_clauses, stat_conditions),   
  relid, MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);

With the assert in place, the code passes without a failure. After
removing the assert (commenting it out), or even just changing it to

Assert(count_mv_attnums(stat_clauses, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST)
 + count_mv_attnums(stat_conditions, relid,
MV_CLAUSE_TYPE_MCV | MV_CLAUSE_TYPE_HIST) >= 2);

i.e. removing the list_union, it fails as expected.

The only thing that I can think of is that list_union happens to place
the right stuff at the right position in memory - pure luck.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-03-09 Thread Tom Lane
Robert Haas  writes:
> Overall, I think this is looking pretty good.

I hadn't been paying any attention to this thread, but I wonder whether
this entire approach isn't considerably inferior to what we can do now
with the planner pathification patch.  To quote from the new docs:

   PlanForeignModify and the other callbacks described in Section 54.2.3
   are designed around the assumption that the foreign relation will be
   scanned in the usual way and then individual row updates will be driven
   by a local ModifyTable plan node. This approach is necessary for the
   general case where an update requires reading local tables as well as
   foreign tables. However, if the operation could be executed entirely by
   the foreign server, the FDW could generate a path representing that and
   insert it into the UPPERREL_FINAL upper relation, where it would
   compete against the ModifyTable approach. This approach could also be
   used to implement remote SELECT FOR UPDATE, rather than using the row
   locking callbacks described in Section 54.2.4. Keep in mind that a path
   inserted into UPPERREL_FINAL is responsible for implementing all
   behavior of the query.

I don't really see anything that this patch does that wouldn't be better
done that way.  And I'd kind of like to get a working example of that type
of path insertion into 9.6, so that we can find out if we need any hooks
or callbacks that aren't there today.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgbench small bug fix

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 4:12 AM, Fabien COELHO  wrote:
>> On Wed, Jan 27, 2016 at 2:31 PM, Fabien COELHO 
>> wrote:
>>>
>>>  - when a duration (-T) is specified, ensure that pgbench ends at that
>>>time (i.e. do not wait for a transaction beyond the end of the run).
>>
>>
>> Every other place where doCustom() returns false is implemented as
>> return clientDone(...).  I think this should probably do the same.
>
>
> Why not. clientDone()'s second argument is totally ignored; I put true
> because it looks better.
>
>> I also think that we should probably store the end time someplace
>> instead of recomputing it in multiple places (this patch adds two such
>> places).
>
>
> Why not.
>
> Attached is a v4.

OK, I've committed the fix for the -T part.  It didn't back-patch
cleanly, and it is a minor bug, so I'm not inclined to worry about it
further.

I didn't commit the fix for the -P part, because Alvaro objected to
the proposed method of fixing it as ugly, and I think he's right.
Unless you can come up with a nicer-looking fix, I think that part is
going to stay unfixed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v14

2016-03-09 Thread Jeff Janes
On Wed, Mar 9, 2016 at 9:21 AM, Tomas Vondra
 wrote:
> Hi,
>
> On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
>> On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
>>  wrote:
>> > Hi,
>> >
>> > thanks for the feedback. Attached is v14 of the patch series, fixing
>> > most of the points you've raised.
>>
>>
>> Hi Tomas,
>>
>> Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
>> check if I configure without --enable-cassert.
>
> Ah, after disabling asserts I can reproduce it too. And the reason why
> it fails is quite simple - clauselist_selectivity modifies the original
> list of clauses, which then confuses cost_qual_eval.
>
> Can you try if the attached patch fixes the issue? I'll need to rework a
> bit more of the code, but let's see if this fixes the issue on your
> machine too.

Yes, that fixes it.


>
>> With --enable-cassert, it passes the regression test.
>
> I wonder how it can work with casserts and fail without them. That's
> kinda exactly the opposite of what I'd expect ...

I too was surprised by that.  Maybe cassert makes a copy of some data
structure which is used in-place without cassert?

Thanks,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto: add s2k-count

2016-03-09 Thread Alvaro Herrera
Jeff Janes wrote:
> On Tue, Mar 8, 2016 at 4:09 PM, Alvaro Herrera  
> wrote:

> Yeah, I find that pretty impenetrable too.  I just treated it as a
> black box, I changed how the number passed into it gets set, but not
> the meaning of that number.  Initially I had the user set the one-byte
> format directly because that was much simpler, but before submitting
> the patch I changed it to take the human-readable value and do the
> conversion to the one byte format, because the gpg command-line tool
> takes the number of iterations, not the one byte format, as the input
> for its own s2k-count setting.

Funny -- I partially edited the patch to use the one-byte number instead
too, because that seemed more reasonable, but eventually (looking at
gnupg) decided not to.  And deleted the email on which I explained that,
without sending.

> > I would love to be able to read gnupg's code to figure out what it is
> > that they do, but the structure of their code is even more impenetrable
> > than pgcrypto's.  Perhaps it would be easier to measure the time it
> > takes to run some s2k operations ...
> 
> The timings are about the same between the patched pgcrypto and gpg
> when using the same settings for s2k-count.  Also, if I encrypt with
> gpg with a certain setting, pgcrypto properly detects that iteration
> count and uses it (if it didn't get it correct, it would be unable to
> decrypt).  And vice versa.

OK, it seems we're good then.

> Do we know why the default for pgcrypto is to use a stochastic number
> of iterations?  I don't see how that can enhance security, as the
> number of iterations actually done is not a secret.

Nope, unless Marko has some input there.  I find it baffling too.

> And I see that you committed it now, so thanks for that too.

You're welcome, thanks for the patch.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto: add s2k-count

2016-03-09 Thread Jeff Janes
On Tue, Mar 8, 2016 at 4:09 PM, Alvaro Herrera  wrote:
> Jeff Janes wrote:
>> pgcrypto supports s2k-mode for key-stretching during symmetric
>> encryption, and even defaults to s2k-mode=3, which means configurable
>> iterations.  But it doesn't support s2k-count to actually set those
>> iterations to be anything other than the default.  If you are
>> interested in key-stretching, the default is not going to cut it.
>
> Talking about deep rabbit holes ...
>
> I gave this a look.  I adjusted it here and there for project style and
> general cleanliness (something that could be applied to pgcrypto much
> more generally) and eventually arrived at trying to figure out how is
> the s2k count actually used by the underlying algorithm.  I arrived at
> the function calc_s2k_iter_salted which is where it is actually used to
> encrypt things.  But that function is completely devoid of comments and
> not completely trivial.  At this point I cannot for the life of me
> determine whether that function should use the one-byte format specified
> by the relevant RFC (4880) or the decoded, human-understandable number
> of iterations.

Thanks for taking this up.

Yeah, I find that pretty impenetrable too.  I just treated it as a
black box, I changed how the number passed into it gets set, but not
the meaning of that number.  Initially I had the user set the one-byte
format directly because that was much simpler, but before submitting
the patch I changed it to take the human-readable value and do the
conversion to the one byte format, because the gpg command-line tool
takes the number of iterations, not the one byte format, as the input
for its own s2k-count setting.
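
For reference, my understanding of the RFC 4880 (section 3.7.1.3)
decoding of the one-byte format is simply this - a sketch, not
pgcrypto's calc_s2k_iter_salted:

static unsigned long
s2k_decode_count(unsigned char c)
{
    /* EXPBIAS is 6 in the RFC; the result ranges from 1024 to 65011712 */
    return ((unsigned long) (16 + (c & 15))) << ((c >> 4) + 6);
}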

> I would love to be able to read gnupg's code to figure out what it is
> that they do, but the structure of their code is even more impenetrable
> than pgcrypto's.  Perhaps it would be easier to measure the time it
> takes to run some s2k operations ...

The timings are about the same between the patched pgcrypto and gpg
when using the same settings for s2k-count.  Also, if I encrypt with
gpg with a certain setting, pgcrypto properly detects that iteration
count and uses it (if it didn't get it correct, it would be unable to
decrypt).  And vice versa.

Do we know why the default for pgcrypto is to use a stochastic number
of iterations?  I don't see how that can enhance security, as the
number of iterations actually done is not a secret.

And I see that you committed it now, so thanks for that too.

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Optimization for updating foreign tables in Postgres FDW

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 3:30 AM, Etsuro Fujita
 wrote:
> Great!  I changed the naming.  I also updated docs as proposed by you in a
> previous email, and rebased the patch to the latest HEAD.  Please find
> attached an updated version of the patch.

Thanks.  The new naming looks much better (and better also than what I
suggested).

I see that you went and changed all of the places that tested for !=
CMD_SELECT and made them test for == CMD_INSERT || == CMD_UPDATE || ==
CMD_DELETE instead.  I think that's the wrong direction.  I think that
we should use the != CMD_SELECT version of the test everywhere.
That's a single test instead of three, so marginally faster, and maybe
marginally more future-proof.

I think deparsePushedDownUpdateSql should be renamed to use the new
"direct modify" naming, like deparseDirectUpdateSql, maybe.

I would suggest not numbering the tests in postgresPlanDirectModify.
That just becomes a nuisance to keep up to date as things change.

Overall, I think this is looking pretty good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] enums and indexing

2016-03-09 Thread Andrew Dunstan
Currently we don't have a way to create a GIN index on an array of 
enums, or to use an enum field in a GIST index, so it can't be used in 
an exclusion constraint, among other things.


I'd like to work on fixing that if possible. Are there any insuperable 
barriers? If not, what do we need to do?


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Declarative partitioning

2016-03-09 Thread Corey Huinker
>
>
> I think we're converging on a good syntax, but I don't think the
> choice of nothingness to represent an open range is a good idea, both
> because it will probably create grammar conflicts now or later and
> also because it actually is sort of confusing and unintuitive to read
> given the rest of our syntax.  I suggest using UNBOUNDED instead.
>
>
As much as it reminds me of the window syntax I loathe (ROWS BETWEEN
UNBOUNDED gah), I'm inclined to agree with Robert here.

It also probably helps for code forensics, in the sense that it's easier
to text-search for something than for nothing.


Re: [HACKERS] raw output from copy

2016-03-09 Thread Corey Huinker
>
>
>> The regression tests seem to adequately cover all new functionality,
>> though I wonder if we should add some cases that highlight situations where
>> BINARY mode is insufficient.
>>
>>
One thing I tried, to test RAW, was to load an existing .json file into
a 1x1 bytea table, which worked. From there I was able to

select encode(col_name,'escape')::text::jsonb from test_table

and the json was correctly converted.

A similar test copying with format BINARY failed.

A write up of the test looks like this:


\copy (select '{"foo": "bar"}') to '/tmp/raw_test.blob' (format raw);
COPY 1
create temporary table raw_byte (b bytea);
CREATE TABLE
create temporary table raw_text (t text);
CREATE TABLE
\copy raw_jsonb from '/tmp/raw_test.blob' (format raw);
psql:/home/ubuntu/raw_test.sql:9: ERROR:  relation "raw_jsonb" does not
exist
\copy raw_byte from '/tmp/raw_test.blob' (format raw);
COPY 1
select encode(b,'escape')::text::json from raw_byte;
     encode
----------------
 {"foo": "bar"}
(1 row)

\copy raw_text from '/tmp/raw_test.blob' (format raw);
COPY 1
select t::jsonb from raw_text;
       t
----------------
 {"foo": "bar"}
(1 row)

create temporary table binary_byte (b bytea);
CREATE TABLE
create temporary table binary_text (t text);
CREATE TABLE
\copy binary_byte from '/tmp/raw_test.blob' (format binary);
psql:/home/ubuntu/raw_test.sql:22: ERROR:  COPY file signature not
recognized
select encode(b,'escape')::jsonb from binary_byte;
 encode
--------
(0 rows)

\copy binary_text from '/tmp/raw_test.blob' (format binary);
psql:/home/ubuntu/raw_test.sql:26: ERROR:  COPY file signature not
recognized
select t::jsonb from binary_text;
 t
---
(0 rows)


So, *if* we want to add a regression test to demonstrate to posterity why
we need RAW for cases that BINARY can't handle, I offer the attached file.

Does anyone else see value in adding that to the regression tests?



> Before I give my approval, I want to read it again more closely to make
>> sure that no cases were skipped with regard to the  (binary || raw) and
>> (binary || !raw) tests. Also, I want to use it on some of my problematic
>> files. Maybe I'll find a good edge case. Probably not.
>>
>
I don't know why I thought this, but when I looked at the patch, I assumed
that the ( binary || raw ) tests were part of a large if/elseif/else
waterfall. They are not. They stand alone. There are no edge cases to find.

Review complete and passed. I can re-review if we want to add the
additional test.


raw_test.sql
Description: application/sql

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pgcrypto: add s2k-count

2016-03-09 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Anyway, assuming that the iteration count was already being used
> correctly, then as far as I'm concerned we're ready to go.  The attached
> patch is what I would commit.

I read some more (gnupg code as well as our own) and applied some more
tweaks, and pushed.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Tom Lane
Alvaro Herrera  writes:
> Tom Lane wrote:
>> Yeah, fixed.  I had assumed that the existing coding in create_gather_plan
>> was OK, because it looked like it was right for a non-projecting node.
>> But actually Gather can project (why though?), so it's not right.

> This looks related:
> https://www.postgresql.org/message-id/CA%2BTgmoai9Ruhwk61110rY4cNAzoO6npsFEOaEKjM7Zz8i3evHg%40mail.gmail.com

Ah, thanks for remembering that thread; I'd forgotten it.

Gather is a bit weird, because although it can project (and needs to,
per the example of needing to compute a non-parallel-safe function),
you would rather push down as much work as possible to the child node;
and doing so is semantically OK for parallel-safe functions.  (Pushing
functions down past a Sort node, for a counterexample, is not so OK
if you are concerned about function evaluation order, or even number
of executions.)

In the current code structure it would perhaps be reasonable to teach
apply_projection_to_path about that --- although this would require
logic to separate parallel-safe and non-parallel-safe subexpressions,
which doesn't quite seem like something apply_projection_to_path
should be doing.

This seems rather closely related to the discussion around Konstantin
Knizhnik's patch to delay function evaluations to after the ORDER BY
sort when possible/useful.  What I think that patch should be doing is
for grouping_planner (or subroutines thereof) to classify tlist items
as volatile or expensive or not and generate pre-sort and post-sort
targetlists appropriately.  That's okay for a top-level feature like
ORDER BY, but it doesn't work for Gather which can appear much further
down in the plan.  So maybe apply_projection_to_path is the only
feasible place.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Patch to implement pg_current_logfile() function

2016-03-09 Thread Gilles Darold
Hi,

Here is a patch intended to solve the remaining problem of finding
the current log file used by the log collector after a rotation.
There are plenty of external commands that try to find this
information, but it seems useful to have an internal function to
retrieve the name of the current log file from the log collector.

There is a corresponding item in the TODO list at "Administration"
section. The original thread can be reach at the following link
http://archives.postgresql.org/pgsql-general/2008-11/msg00418.php
The goal is to provide a way to query the log collector subprocess
to determine the name of the currently active log file.

This patch implements the pg_current_logfile() function, which can be
used as follows. The function returns NULL and emits a warning when
logging_collector is not active.

postgres=# \pset null *
postgres=# SELECT pg_current_logfile();
WARNING:  current log can not be reported because log collection is not
active
 pg_current_logfile
--------------------
 *
(1 row)

So a better query should be:

postgres=# SELECT CASE WHEN current_setting('logging_collector')='on'
THEN pg_current_logfile()
ELSE current_setting('log_destination')
END;
 current_setting
-----------------
 syslog
(1 row)

Same query with log collection active and, for example,
log_filename = 'postgresql-%Y-%m-%d_%H%M%S.log'

postgres=# SELECT CASE WHEN current_setting('logging_collector')='on'
THEN pg_current_logfile()
ELSE current_setting('log_destination')
END;
             current_setting
-----------------------------------------
 pg_log/postgresql-2016-03-09_152827.log
(1 row)

Then after a log rotation:

postgres=# SELECT pg_rotate_logfile();
 pg_rotate_logfile
-------------------
 t
(1 row)

postgres=# select pg_current_logfile();
           pg_current_logfile
-----------------------------------------
 pg_log/postgresql-2016-03-09_152908.log
(1 row)

I chose to let the log collector write its current log file name
into the lock file 'postmaster.pid'. This allows simple access to this
information through system commands, for example:

postgres@W230ST:~$ tail -n1 /usr/local/pgql-devel/data/postmaster.pid
pg_log/postgresql-2016-03-09_152908.log

The log filename is written at the 8th line position when log collection
is active and all the other information has been written to the lock file.

When called from SQL, the pg_current_logfile() function reads the lock
file to report the information.
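
For reviewers: the write side is essentially one call to the existing
lock file helper in miscinit.c, roughly as below. This is a sketch; the
actual code in the patch may differ slightly, and the constant name is
mine:

/* in the syslogger, after opening a new log file */
#define LOCK_FILE_LINE_LOG_FILE 8       /* assumed: the 8th line */

AddToDataDirLockFile(LOCK_FILE_LINE_LOG_FILE, last_file_name);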

I don't know if there's any limitation on using the postmaster.pid file
for this, but it seems a bit weird to me to log this information to
another file. My first attempt was to use a dedicated file and save it
to global/pg_current_logfile or pg_stat_tmp/pg_current_logfile, but I
think it is better to use the postmaster.pid file for that. I also
thought about a communication protocol or notification scheme to query
the log collector subprocess for the name of the currently active log
file, but obviously that would be too much work for just this simple
function, and I can't see any other feature that would need such
machinery.

Any thoughts? Should I add this patch to the commitfest? If the use
of the postmaster.pid file is a problem I can easily modify the patch
to use an alternate file.

Best regards,

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 4b5ee81..313403e 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -15027,6 +15027,12 @@ SELECT * FROM pg_ls_dir('.') WITH ORDINALITY AS t(ls,n);
   
 
   
+   pg_current_logfile()
+   text
+   current log file used by the logging collector
+  
+
+  
pg_notification_queue_usage()
double
fraction of the asynchronous notification queue currently occupied (0-1)
@@ -15264,6 +15270,16 @@ SET search_path TO schema , schema, ..

 

+pg_current_logfile
+   
+
+   
+pg_current_logfile returns the name of the current log
+file used by the logging collector, as a text. Log collection
+must be active.
+   
+
+   
 pg_postmaster_start_time

 
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 9b2e09e..41aaf5d 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -166,7 +166,8 @@ last started with
   Unix-domain socket directory path (empty on Windows),
   first valid listen_address (IP address or *, or empty if
   not listening on TCP),
-  and shared memory segment ID
+  the shared memory segment ID,
+  and the path to the current log file used by the syslogger
   (this file is not present after server shutdown)
 
 
diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c
index e7e488a..40fec7a 100644
--- a/src/backend/postmaster/syslogger.c
+++ b/src/backend/postmaster/syslogger.c
@@ -54,7 +54,6 @@
  */
 #define READ_BUF_SIZE (2 * PIPE_CHUNK_SIZE)
 
-
 /*

Re: [HACKERS] extend pgbench expressions with functions

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 3:09 AM, Fabien COELHO  wrote:
> I'm not sure what is "not acceptable" as it "totally breaks the error
> handling" in the above code.
>
> I assumed that you want to check that sscanf can read what sprintf generated
> when handling "\set". I'd guess that libc would be broken if it was the
> case. I've made pgbench check that it is not.

If the variable contains something that is not an integer, the
existing code will end up here:

fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);

With your patch, you get different behavior depending on exactly how
the input is malformed.  I have a strong suspicion you're going to
tell me that this is another one of those cases where it's OK to make
the error handling worse than it is today, but I'm tired of arguing
that point.  I'm not going to commit it this way, and frankly, neither
is anyone else.
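
For the record, what I mean by uniform error handling is something
along these lines - a sketch using plain strtoll(), assuming the usual
<stdlib.h>/<errno.h> includes, not a demand for this exact shape:

static bool
parse_int_strict(const char *str, int64 *result)
{
    char       *endptr;

    errno = 0;
    *result = strtoll(str, &endptr, 10);
    if (errno != 0 || endptr == str || *endptr != '\0')
    {
        fprintf(stderr, "invalid input syntax for integer: \"%s\"\n", str);
        return false;
    }
    return true;
}

Every malformed input then takes the same error path, instead of the
behavior depending on exactly how the input is malformed.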

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v14

2016-03-09 Thread Tomas Vondra
Hi,

On Wed, 2016-03-09 at 08:45 -0800, Jeff Janes wrote:
> On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
>  wrote:
> > Hi,
> >
> > thanks for the feedback. Attached is v14 of the patch series, fixing
> > most of the points you've raised.
> 
> 
> Hi Tomas,
> 
> Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
> check if I configure without --enable-cassert.

Ah, after disabling asserts I can reproduce it too. And the reason why
it fails is quite simple - clauselist_selectivity modifies the original
list of clauses, which then confuses cost_qual_eval.

Can you try if the attached patch fixes the issue? I'll need to rework a
bit more of the code, but let's see if this fixes the issue on your
machine too.

> With --enable-cassert, it passes the regression test.

I wonder how it can work with casserts and fail without them. That's
kinda exactly the opposite of what I'd expect ...

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/optimizer/path/clausesel.c b/src/backend/optimizer/path/clausesel.c
index 2540da9..ddfdc3b 100644
--- a/src/backend/optimizer/path/clausesel.c
+++ b/src/backend/optimizer/path/clausesel.c
@@ -279,6 +279,10 @@ clauselist_selectivity(PlannerInfo *root,
 		List *solution = choose_mv_statistics(root, relid, stats,
 			  clauses, conditions);
 
+		/* FIXME we must not scribble over the original list */
+		if (solution)
+			clauses = list_copy(clauses);
+
 		/*
 		 * We have a good solution, which is merely a list of statistics that
 		 * we need to apply. We'll apply the statistics one by one (in the order

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 12:07 PM, Alvaro Herrera
 wrote:
> Tom Lane wrote:
>> Amit Kapila  writes:
>
>> > Without setting max_parallel_degree, it works fine and generate the
>> > appropriate results.  Here the issue seems to be that the code in
>> > grouping_planner doesn't apply the required PathTarget to Path below Gather
>> > Path due to which when we generate target list for scan plan,
>>
>> Yeah, fixed.  I had assumed that the existing coding in create_gather_plan
>> was OK, because it looked like it was right for a non-projecting node.
>> But actually Gather can project (why though?), so it's not right.
>
> This looks related:
> https://www.postgresql.org/message-id/CA%2BTgmoai9Ruhwk61110rY4cNAzoO6npsFEOaEKjM7Zz8i3evHg%40mail.gmail.com

I actually wrote a patch for this:

http://www.postgresql.org/message-id/CA+TgmoaqWH8W35c7pssuufxOO=axneqen4da_bkeqaa3se3...@mail.gmail.com

I assume it no longer applies :-( but I think it would be good to get
this into 9.6.  It's a pretty simple optimization with a lot to
recommend it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] VACUUM Progress Checker.

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 2:37 AM, Amit Langote
 wrote:
> On 2016/03/09 10:11, Amit Langote wrote:
>> The attached revision addresses above and one of Horiguchi-san's comments
>> in his email yesterday.
>
> I fixed one more issue in 0002 per Horiguchi-san's comment.  Sorry about
> so many versions.

I've committed 0001 with heavy revisions.  Just because we don't need
an SQL-visible function to clear the command progress doesn't mean we
don't need to clear it at all; rather, it has to happen automatically.
I also did a bunch of identifier renaming, added datid to the view
output, adjusted the comments, and so on.  Please rebase the remainder
of the series.  Thanks.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alter or rename enum value

2016-03-09 Thread Tom Lane
Andrew Dunstan  writes:
> On 03/09/2016 11:07 AM, Tom Lane wrote:
>> I have a vague recollection that we discussed this at the time the enum
>> stuff went in, and there are concurrency issues?  Don't recall details
>> though.

> Rings a vague bell, but should it be any worse than adding new labels?

I think what I was recalling is the hazards discussed in the comments for
RenumberEnumType.  However, the problem there is that a backend could make
inconsistent ordering decisions due to seeing two different pg_enum rows
under different snapshots.  Updating a single row to change its name
doesn't seem to have a comparable hazard, and it wouldn't affect ordering
anyway.  So it's probably no worse than any other object-rename situation.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Alvaro Herrera
Tom Lane wrote:
> Amit Kapila  writes:

> > Without setting max_parallel_degree, it works fine and generate the
> > appropriate results.  Here the issue seems to be that the code in
> > grouping_planner doesn't apply the required PathTarget to Path below Gather
> > Path due to which when we generate target list for scan plan,
> 
> Yeah, fixed.  I had assumed that the existing coding in create_gather_plan
> was OK, because it looked like it was right for a non-projecting node.
> But actually Gather can project (why though?), so it's not right.

This looks related:
https://www.postgresql.org/message-id/CA%2BTgmoai9Ruhwk61110rY4cNAzoO6npsFEOaEKjM7Zz8i3evHg%40mail.gmail.com

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alter or rename enum value

2016-03-09 Thread Andrew Dunstan



On 03/09/2016 11:07 AM, Tom Lane wrote:

> Andrew Dunstan  writes:
>> On 03/09/2016 09:56 AM, Matthias Kurz wrote:
>>> Right now it is not possible to rename an enum value.
>>> Are there plans to implement this anytime soon?
>>
>> I don't know of any plans, but it would be a useful thing. I agree it
>> wouldn't be too hard. The workaround is to do an update on pg_enum
>> directly, but proper SQL support would be much nicer.
>
> I have a vague recollection that we discussed this at the time the enum
> stuff went in, and there are concurrency issues?  Don't recall details
> though.

Rings a vague bell, but should it be any worse than adding new labels?

cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH v6] GSSAPI encryption support

2016-03-09 Thread David Steele
Hi Robbie,

On 3/8/16 5:44 PM, Robbie Harwood wrote:
> Hello friends,
> 
> Here's yet another version of GSSAPI encryption support.  It's also
> available for viewing on my github:

I got this warning when applying the first patch in the set:

../other/v6-0001-Move-common-GSSAPI-code-into-its-own-files.patch:245:
new blank line at EOF.
+
warning: 1 line adds whitespace errors.

I know it's minor but I'm always happier when patches apply cleanly.

The build went fine but when testing I was unable to logon at all.  I'm
using the same methodology as in
http://www.postgresql.org/message-id/56be0ff9.70...@pgmasters.net except
that I'm running against 51c0f63 and using the v6 patch set.

psql simply hangs and never returns.  I have attached a pcap of the
psql/postgres session generated with:

tcpdump -i lo -nnvvXSs 1514 port 5432 -w gssapi.pcap

If you would like me to capture more information please let me know
specifically how you would like me to capture it.

I reverted to v5 and got the same behavior I was seeing with v4 and v5,
namely that I can only logon occasionally and usually get this error:

psql: expected authentication request from server, but received

Using a fresh build from 51c0f63 I can logon reliably every time so I
don't think there's an issue in my environment.

-- 
-David
da...@pgmasters.net


gssapi.pcap
Description: Binary data


signature.asc
Description: OpenPGP digital signature


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Christian Ullrich

* Magnus Hagander wrote:

> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
> wrote:
>
>> * Magnus Hagander wrote:
>>
>>> How does this work wrt mingw, though? Do we have the same problem there?
>>> AIUI this code can never run on mingw, correct?
>>
>> Not unless mingw defines _MSC_VER.
>
> The question is then - should it, under some conditions?


I have no clue about MinGW, so all I can say is that msvcrt.dll (even in 
Windows 10) does not use the problematic instructions, nor does it 
export anything with "FMA" in its name. If there is nothing to turn off, 
and nothing to turn it off with, then I suspect we are safe there. (This 
is based on a minimal test program, just to see where MinGW takes its 
log() from; it comes from msvcrt.)


Also, we have confirmation, in the link I sent when I started this 
thread, that the bug is only and specifically in the VS2013 CRT, not 
anywhere else before or since.



>>> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
>>> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
>>> be a separate check for server-windows?
>>
>> No, that is fine. I think it's just to keep the function name from getting
>> too ridiculously long. The functions in  are all named
>> for the client versions only, and only check the version number, not the
>> client/server capability flags. Or, rather, there is a separate function to
>> determine that.
>
> Presumably the link is documented somewhere (the docs don't seem to say
> anything about it).


 
("IsWindows7SP1OrGreater function"):


> This function does not differentiate between client and server
> releases. It will return true if the current OS version number is
> equal to or higher than the version of the client named in the call.
> For example, a call to IsWindowsXPSP3OrGreater will return true on
> Windows Server 2008. Applications that need to distinguish between
> server and client versions of Windows should call IsWindowsServer.

The internal version numbers (since NT4) are:

 Version | Client | Server
---------+--------+--------
     4.0 | NT4    | NT4
     5.0 | 2000   | 2000
     5.1 | XP     |
     5.2 |        | 2003
     6.0 | Vista  | 2008
     6.1 | 7      | 2008R2
     6.2 | 8      | 2012
     6.3 | 8.1    | 2012R2
    10.0 | 10     | [2016]

The relevant SDK header (), where constants for the version 
numbers are defined, also only has entries named for the client 
versions, with the sole exception of 2008, with the same value as Vista, 
and the special case of 2003 with no equivalent client version (unless 
you count the second attempt at an XP x64 ... but I digress again).


If Microsoft wanted the added complexity of separate symbols for what is 
basically the same code, I rather think they would have them.
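
To make the relationship concrete, here is a minimal sketch using those
helpers; both are real Win32 API calls, though whether the patch should
be phrased exactly this way is a separate question:

#include <windows.h>
#include <versionhelpers.h>

static BOOL
needs_fma_workaround(void)
{
    /*
     * Version 6.1 SP1 covers both Windows 7 SP1 and Server 2008R2 SP1,
     * so one check handles both; IsWindowsServer() exists only to tell
     * the SKUs apart, which we do not need here.
     */
    return !IsWindows7SP1OrGreater();
}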


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Alvaro Herrera
Magnus Hagander wrote:
> On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
> wrote:

> > And apparently not a single one with VS 2013. OK, I'll see what I can do
> > about setting some up soonish, at least with (server) 2008 and (client) 7.
> > FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
> > now, with no complaints.
> 
> Having that added to the buildfarm would definitely be very useful!

Yeah, ideally we would have one buildfarm member for each MSVC compiler
supported by each Windows version.  AFAICS we only have

Windows 8 8.1 Pro Visual Studio 2012 x86_64
Windows Server 2008 R2 (64bit) Visual C++ 2005 AMD64
Windows Server 2003 R2 5.2.3790 MSVC 2005 Express 8.0.50727.42 x86
Vista Ultimate 6.0.6000 MSVC 2005 Pro 8.0.50727.867 x86
Windows XP-PRO SP3 MSVC++ 2008 Express i686

which is pretty unfulfilling.

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] multivariate statistics v14

2016-03-09 Thread Jeff Janes
On Wed, Mar 9, 2016 at 7:02 AM, Tomas Vondra
 wrote:
> Hi,
>
> thanks for the feedback. Attached is v14 of the patch series, fixing
> most of the points you've raised.


Hi Tomas,

Applied to aa09cd242fa7e3a694a31f, I still get the seg faults in make
check if I configure without --enable-cassert.

With --enable-cassert, it passes the regression test.

I got the core file, configured and compiled with:
CFLAGS="-fno-omit-frame-pointer"  --enable-debug

The first core dump is on this statement:

  -- check explain (expect bitmap index scan, not plain index scan)
  INSERT INTO functional_dependencies
   SELECT i/1, i/2, i/4 FROM generate_series(1,100) s(i);

bt

#0  0x006e1160 in cost_qual_eval (cost=0x2494418,
quals=0x2495550, root=0x2541b88) at costsize.c:3181
#1  0x006e1ee5 in set_baserel_size_estimates (root=0x2541b88,
rel=0x2494300) at costsize.c:3754
#2  0x006d37e8 in set_plain_rel_size (root=0x2541b88,
rel=0x2494300, rte=0x247e660) at allpaths.c:480
#3  0x006d353d in set_rel_size (root=0x2541b88, rel=0x2494300,
rti=1, rte=0x247e660) at allpaths.c:350
#4  0x006d338f in set_base_rel_sizes (root=0x2541b88) at allpaths.c:270
#5  0x006d3233 in make_one_rel (root=0x2541b88,
joinlist=0x2494628) at allpaths.c:169
#6  0x0070012e in query_planner (root=0x2541b88,
tlist=0x2541e58, qp_callback=0x7048d4 ,
qp_extra=0x7ffefa6474e0)
at planmain.c:246
#7  0x00702a33 in grouping_planner (root=0x2541b88,
inheritance_update=0 '\000', tuple_fraction=0) at planner.c:1647
#8  0x00701310 in subquery_planner (glob=0x2541af8,
parse=0x246a838, parent_root=0x0, hasRecursion=0 '\000',
tuple_fraction=0) at planner.c:740
#9  0x0070055b in standard_planner (parse=0x246a838,
cursorOptions=256, boundParams=0x0) at planner.c:290
#10 0x0070023f in planner (parse=0x246a838, cursorOptions=256,
boundParams=0x0) at planner.c:160
#11 0x007b8bf9 in pg_plan_query (querytree=0x246a838,
cursorOptions=256, boundParams=0x0) at postgres.c:798
#12 0x005d1967 in ExplainOneQuery (query=0x246a838, into=0x0,
es=0x246a778,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM
mcv_list WHERE a = 10 AND b = 5;", params=0x0) at explain.c:350
#13 0x005d16a3 in ExplainQuery (stmt=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM mcv_list
WHERE a = 10 AND b = 5;",
params=0x0, dest=0x246a6e8) at explain.c:244
#14 0x007c0afb in standard_ProcessUtility (parsetree=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM
mcv_list WHERE a = 10 AND b = 5;", context=PROCESS_UTILITY_TOPLEVEL,
params=0x0,
dest=0x246a6e8, completionTag=0x7ffefa647b60 "") at utility.c:659
#15 0x007c0299 in ProcessUtility (parsetree=0x2444f90,
queryString=0x2443d80 "EXPLAIN (COSTS off)\n SELECT * FROM mcv_list
WHERE a = 10 AND b = 5;",
context=PROCESS_UTILITY_TOPLEVEL, params=0x0, dest=0x246a6e8,
completionTag=0x7ffefa647b60 "") at utility.c:335
#16 0x007bf47b in PortalRunUtility (portal=0x23ed510,
utilityStmt=0x2444f90, isTopLevel=1 '\001', dest=0x246a6e8,
completionTag=0x7ffefa647b60 "")
at pquery.c:1183
#17 0x007bf1ce in FillPortalStore (portal=0x23ed510,
isTopLevel=1 '\001') at pquery.c:1057
#18 0x007beb19 in PortalRun (portal=0x23ed510,
count=9223372036854775807, isTopLevel=1 '\001', dest=0x253f6c0,
altdest=0x253f6c0,
completionTag=0x7ffefa647d40 "") at pquery.c:781
#19 0x007b90ae in exec_simple_query (query_string=0x2443d80
"EXPLAIN (COSTS off)\n SELECT * FROM mcv_list WHERE a = 10 AND b =
5;")
at postgres.c:1094
#20 0x007bcfac in PostgresMain (argc=1, argv=0x23d5070,
dbname=0x23d4e48 "regression", username=0x23d4e30 "jjanes") at
postgres.c:4021
#21 0x00745a62 in BackendRun (port=0x23f4110) at postmaster.c:4258
#22 0x007451d6 in BackendStartup (port=0x23f4110) at postmaster.c:3932
#23 0x00741ab7 in ServerLoop () at postmaster.c:1690
#24 0x007411c0 in PostmasterMain (argc=8, argv=0x23d3f20) at
postmaster.c:1298
#25 0x00690026 in main (argc=8, argv=0x23d3f20) at main.c:223

Cheers,

Jeff


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: RETURNING primary_key()

2016-03-09 Thread Igal @ Lucee.org

On 3/8/2016 5:12 PM, Craig Ringer wrote:
> One of the worst problems (IMO) is in the driver architecture itself.
> It attempts to prevent blocking by guestimating the server's send
> buffer state and its recv buffer state, trying to stop them filling
> and causing the server to block on writes. It should just avoid
> blocking on its own send buffer, which it can control with confidence.
> Or use some of Java's rather good concurrency/threading features to
> simultaneously consume data from the receive buffer and write to the
> send buffer when needed, like pgjdbc-ng does.


Are there good reasons to use pgjdbc over pgjdbc-ng then?



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tom Lane
"Shulgin, Oleksandr"  writes:
> Yes, I now recall that my actual concern was that sample_cnt may calculate
> to 0 due to the latest condition above, but that also implies track_cnt ==
> 0, and then we have a for loop there which will not run at all due to this,
> so I figured we can avoid calculating avgcount and running the loop
> altogether with that check.  I'm not opposed to changing the condition if
> that makes the code easier to understand (or dropping it altogether if
> calculating 0/0 is believed to be harmless anyway).

Avoiding intentional zero divides is good.  It might happen to work
conveniently on your machine, but I wouldn't think it's portable.
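
Concretely, the belt-and-suspenders form is just the following; a
sketch borrowing the thread's variable names, which are my assumptions:

    double      avgcount = 0.0;

    if (ndistinct > 0)
        avgcount = (double) sample_cnt / (double) ndistinct;
    /* else leave it at 0; the loop that consumes it won't run anyway */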

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Shulgin, Oleksandr
On Wed, Mar 9, 2016 at 1:33 PM, Tomas Vondra 
wrote:

> Hi,
>
> On Wed, 2016-03-09 at 11:23 +0100, Shulgin, Oleksandr wrote:
> > On Tue, Mar 8, 2016 at 8:16 PM, Alvaro Herrera
> >  wrote:
> >
> > Also, I can't quite figure out why the "else" now in line 2131
> > is now "else if track_cnt != 0".  What happens if track_cnt is
> > zero?
> > The comment above the "if" block doesn't provide any guidance.
> >
> >
> > It is there only to avoid potentially dividing zero by zero when
> > calculating avgcount (which will not be used after that anyway).  I
> > agree it deserves a comment.
>
> That definitely deserves a comment. It's not immediately clear why
> (track_cnt != 0) would prevent division by zero in the code. The only
> way such error could happen is if ndistinct==0, because that's the
> actual denominator. Which means this
>
> ndistinct = ndistinct * sample_cnt
>
> would have to evaluate to 0. But ndistinct==0 can't happen as we're in
> the (nonnull_cnt > 0) branch, and that guarantees (standistinct != 0).
>
> Thus the only possibility seems to be (nonnull_cnt==toowide_cnt). Why
> not to use this condition instead?
>

Yes, I now recall that my actual concern was that sample_cnt may calculate
to 0 due to the latest condition above, but that also implies track_cnt ==
0, and then we have a for loop there which will not run at all due to this,
so I figured we can avoid calculating avgcount and running the loop
altogether with that check.  I'm not opposed to changing the condition if
that makes the code easier to understand (or dropping it altogether if
calculating 0/0 is believed to be harmless anyway).

--
Alex


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tom Lane
Tomas Vondra  writes:
> On Wed, 2016-03-09 at 12:02 -0300, Alvaro Herrera wrote:
>> Tomas Vondra wrote:
>>> FWIW while looking at the code I noticed that we skip wide varlena
>>> values but not cstrings. Seems a bit suspicious.

>> Uh, can you actually have columns of cstring type?  I don't think you
>> can ...

> Yeah, but then why do we handle that in compute_scalar_stats?

If you're looking at what I think you're looking at, we aren't bothering
because we assume that cstrings won't be very wide.  Since they're not
toastable or compressable, they certainly won't exceed BLCKSZ.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the include-timestamp data returned by pg_logical_slot_peek_changes() is always 2000-01-01 in 9.5.1

2016-03-09 Thread hailong . li



On 2016-03-09 23:31, Andres Freund wrote:
>> Did you enable track_commit_timestamps in the server?
>
> That's unrelated, commit ts is about an xid timestamp mapping.


Yes. Whether track_commit_timestamp is enabled or not, I have just done
the test again and the result is the same.


Thanks, all!

Best regards!

--
PostgreSQL DBA hailong.li of Qunar



Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Tomas Vondra
On Wed, 2016-03-09 at 12:02 -0300, Alvaro Herrera wrote:
> Tomas Vondra wrote:
> 
> > FWIW while looking at the code I noticed that we skip wide varlena
> > values but not cstrings. Seems a bit suspicious.
> 
> Uh, can you actually have columns of cstring type?  I don't think you
> can ...

Yeah, but then why do we handle that in compute_scalar_stats?

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alter or rename enum value

2016-03-09 Thread Tom Lane
Andrew Dunstan  writes:
> On 03/09/2016 09:56 AM, Matthias Kurz wrote:
>> Right now it is not possible to rename an enum value.
>> Are there plans to implement this anytime soon?

> I don't know of any plans, but it would be a useful thing. I agree it 
> wouldn't be too hard. The workaround is to do an update on pg_enum 
> directly, but proper SQL support would be much nicer.

I have a vague recollection that we discussed this at the time the enum
stuff went in, and there are concurrency issues?  Don't recall details
though.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Alter or rename enum value

2016-03-09 Thread Andrew Dunstan



On 03/09/2016 09:56 AM, Matthias Kurz wrote:
> Hi!
>
> Right now it is not possible to rename an enum value.
> Are there plans to implement this anytime soon?
> I had a bit of a discussion on the IRC channel and it seems it
> shouldn't be that hard to implement this.
>
> Again, I am talking about renaming the values, not the enum itself.

I don't know of any plans, but it would be a useful thing. I agree it 
wouldn't be too hard. The workaround is to do an update on pg_enum 
directly, but proper SQL support would be much nicer.


cheers

andrew


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Suspicious behaviour on applying XLOG_HEAP2_VISIBLE.

2016-03-09 Thread Masahiko Sawada
Hi all,

I ran into some suspicious behaviour on a hot standby server related to
the visibility map. The scenario is:

1. Create a table and check the visibility map contents on the master server.
postgres(1)=# create table hoge (col int);
CREATE TABLE
postgres(1)=# insert into hoge select generate_series(1,10);
INSERT 0 10
postgres(1)=# select * from pg_visibility('hoge');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | f           | f          | f
(1 row)

2. Check the visibility map contents on the standby server.
postgres(1)=# select * from pg_visibility('hoge');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | f           | f          | f
(1 row)

3. Run VACUUM on the master server.
postgres(1)=# vacuum hoge;
VACUUM
postgres(1)=# select * from pg_visibility('hoge');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | t           | f          | t
(1 row)

4. Check the visibility map contents on the standby server again.
** Note that we use the connection we established at #2 again. **
postgres(1)=# select * from pg_visibility('hoge');
 blkno | all_visible | all_frozen | pd_all_visible
-------+-------------+------------+----------------
     0 | f           | f          | t
(1 row)

Even on the standby server the result should be (t,f,t), but (f,f,t)
was returned. (The XLOG_HEAP2_VISIBLE record has actually reached the
standby, and has certainly been applied.)

Having looked into the code around recovery, ISTM that the cause is
related to relation cache invalidation.
In heap_xlog_visible, if the standby server replays an ordinary WAL
record then the relation cache is eventually cleared in vm_extend, but
if the standby server replays an FPI then the relation cache is not
cleared.
For example, after I applied the attached patch to HEAD (it might not
be the right way, but) this problem seems to be resolved.

Is this a bug, or not?

Regards,

--
Masahiko Sawada


heap_xlog_visible_invalidate_cache.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Upper planner pathification

2016-03-09 Thread Tom Lane
Amit Kapila  writes:
> On latest commit-51c0f63e, I am seeing some issues w.r.t parallel query.
> Consider a below case:

> create table t1(c1 int, c2 char(1000));
> insert into t1 values(generate_series(1,30),'');
> analyze t1;
> set max_parallel_degree=2;
> postgres=# explain select c1, count(c1) from t1 where c1 < 1000 group by c1;
> ERROR:  ORDER/GROUP BY expression not found in targetlist

Hm.  That does not speak well for the coverage of the "run the regression
tests with force_parallel_mode enabled" testing approach.

> Without setting max_parallel_degree, it works fine and generate the
> appropriate results.  Here the issue seems to be that the code in
> grouping_planner doesn't apply the required PathTarget to Path below Gather
> Path due to which when we generate target list for scan plan,

Yeah, fixed.  I had assumed that the existing coding in create_gather_plan
was OK, because it looked like it was right for a non-projecting node.
But actually Gather can project (why though?), so it's not right.

> Approach-2
> --
> Always generate a tlist in Gather plan as we do for many other cases.  I
> think this approach will also resolve the issue but haven't tried yet.

I think this is the correct fix.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Pushing down sorted joins

2016-03-09 Thread Robert Haas
On Wed, Mar 9, 2016 at 2:23 AM, Ashutosh Bapat
 wrote:
> [ new patch ]

This looks OK to me.  Committed!

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Magnus Hagander
On Wed, Mar 9, 2016 at 4:36 PM, Christian Ullrich 
wrote:

> * Magnus Hagander wrote:
>
>> On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
>> wrote:
>>
>>> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
>>>
>>>> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
>>>> the code compiles on *exactly one* MSVC version.
>>>
>>> The bug exists in only that compiler version's CRT, also, that is not the
>>> complete version number. There may be different builds somewhere, but they
>>> all start with 18.0.
>>
>> IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
>> they don't actually bump that number in different build numbers.
>>
>> How does this work wrt mingw, though? Do we have the same problem there?
>> AIUI this code can never run on mingw, correct?
>
> Not unless mingw defines _MSC_VER.

The question is then - should it, under some conditions?



> (If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support
> instead. IMHO everyone should boycott them anyway until they come up with a
> working libc of their own instead of doing unspeakable things to a helpless
> msvcrt.dll that is not intended for use by non-system components at all.
> But I digress.)



Haha, well, at least they're not cygwin...



>> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
>> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
>> be a separate check for server-windows?
>
> No, that is fine. I think it's just to keep the function name from getting
> too ridiculously long. The functions in <versionhelpers.h> are all named
> for the client versions only, and only check the version number, not the
> client/server capability flags. Or, rather, there is a separate function to
> determine that.


Presumably the link is documented somewhere (the docs don't seem to say
anything about it).


>>>> That would
>>>> give us some context to estimate the risks of this code executing
>>>> when it's not really needed.
>>>
>>> Hence all the conditions. The problem is *certain* to occur under these
>>> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
>>> when built with VS2013), and under no others, and these conditions flip
>>> the switch exactly then.
>>
>> Well, it doesn't flip it based on the specifics "running on a CPU with
>> AVX2". But presumably turning off AVX2 if the CPU doesn't support it is a
>> no-op.
>
> Precisely.
>
>>> Isn't that what the buildfarm is (among other things) for?
>>
>> The buildfarm doesn't really have a big spread of Windows animals,
>> unfortunately.
>
> And apparently not a single one with VS 2013. OK, I'll see what I can do
> about setting some up soonish, at least with (server) 2008 and (client) 7.
> FWIW, I have a local build of 9.5.1 with this patch in daily use on 2008
> now, with no complaints.


Having that added to the buildfarm would definitely be very useful!


-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Christian Ullrich

* Magnus Hagander wrote:

> On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
> wrote:
>
>> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
>>
>>> I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
>>> the code compiles on *exactly one* MSVC version.
>>
>> The bug exists in only that compiler version's CRT, also, that is not the
>> complete version number. There may be different builds somewhere, but they
>> all start with 18.0.
>
> IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
> they don't actually bump that number in different build numbers.
>
> How does this work wrt mingw, though? Do we have the same problem there?
> AIUI this code can never run on mingw, correct?

Not unless mingw defines _MSC_VER.

(If they do, I suggest we make Yury Zhuravlev cry and drop MinGW support 
instead. IMHO everyone should boycott them anyway until they come up 
with a working libc of their own instead of doing unspeakable things to 
a helpless msvcrt.dll that is not intended for use by non-system 
components at all. But I digress.)

> I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
> W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
> be a separate check for server-windows?

No, that is fine. I think it's just to keep the function name from 
getting too ridiculously long. The functions in <versionhelpers.h> are 
all named for the client versions only, and only check the version 
number, not the client/server capability flags. Or, rather, there is a 
separate function to determine that.
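
For illustration, the guarded workaround could look roughly like this (a
sketch only, assuming it is called once early at startup; the function
name here is made up, and the actual patch may differ):

```c
#if defined(_M_AMD64) && _MSC_VER == 1800
#include <math.h>				/* _set_FMA3_enable() */
#include <versionhelpers.h>		/* IsWindows7SP1OrGreater() */

static void
disable_crt_fma3_if_needed(void)
{
	/*
	 * The VS2013 x64 CRT selects FMA3/AVX2 code paths for log() etc.
	 * based on CPUID alone, but Windows before 7 SP1 / 2008 R2 SP1 does
	 * not preserve the AVX state, so those code paths crash there.
	 */
	if (!IsWindows7SP1OrGreater())
		_set_FMA3_enable(0);
}
#endif
```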



>>> That would
>>> give us some context to estimate the risks of this code executing
>>> when it's not really needed.
>>
>> Hence all the conditions. The problem is *certain* to occur under these
>> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
>> when built with VS2013), and under no others, and these conditions flip the
>> switch exactly then.
>
> Well, it doesn't flip it based on the specifics "running on a CPU with
> AVX2". But presumably turning off AVX2 if the CPU doesn't support it is a
> no-op.

Precisely.

>> Isn't that what the buildfarm is (among other things) for?
>
> The buildfarm doesn't really have a big spread of Windows animals,
> unfortunately.

And apparently not a single one with VS 2013. OK, I'll see what I can do 
about setting some up soonish, at least with (server) 2008 and (client) 
7. FWIW, I have a local build of 9.5.1 with this patch in daily use on 
2008 now, with no complaints.


--
Christian



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] the include-timestamp data returned by pg_logical_slot_peek_changes() is always 2000-01-01 in 9.5.1

2016-03-09 Thread Andres Freund


On March 9, 2016 4:26:01 AM PST, Craig Ringer  wrote:
> On 9 March 2016 at 18:13, 李海龙  wrote:
>
>> HI, pgsql-hackers
>>
>> The include-timestamp data returned by pg_logical_slot_peek_changes()
>> is always 2000-01-01 in 9.5.1, is it not normal?

It's a bug in 9.5 that has been fixed. You can either update to the version
from git, wait for the next minor release, or use 9.4, where the bug is not
present.

> Did you enable track_commit_timestamps in the server?

That's unrelated; commit timestamps are about an xid-to-timestamp mapping.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Alvaro Herrera
Aleksander Alekseev wrote:

> pg_receivexlog: could not send replication command "START_REPLICATION": out of memory
> pg_receivexlog: disconnected; waiting 5 seconds to try again
> pg_receivexlog: starting log streaming at 0/100 (timeline 1)
>
> Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
>     msgLength=3) at fe-protocol3.c:1398
> 1398		const char *errmsg = NULL;
> ```
> 
> Granted, this behaviour is a bit better than the current one. But
> basically it's the same infinite loop, only with pauses and warnings. I
> wonder whether this is the behaviour we really want. For instance,
> wouldn't it be better just to terminate the application in the
> out-of-memory case? "Let it crash", as Erlang programmers say.

Hmm.  It would be useful to retry in the case that there is a chance
that the program releases memory and can continue later.  But if it will
only stay there doing nothing other than retrying, then that obviously
will not happen.  One situation where this might help is if the overall
*system* is short on memory and we expect that situation to resolve
itself after a while -- after all, if the system is so loaded that it
can't allocate a few more bytes for the COPY message, then odds are that
other things are also crashing and eventually enough memory will be
released that pg_receivexlog can continue.

On the other hand, if the system is so loaded, perhaps it's better to
"let it crash" and have it restart later -- presumably once the admin
notices the problem and restarts it manually after cleaning up the mess.

If all programs are well behaved and nothing crashes when OOM but they
all retry instead, then everything will continue to retry infinitely and
make no progress.  That cannot be good.
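
A middle ground would be to cap the retries and then bail out, e.g. (a
minimal sketch; try_start_streaming() is a hypothetical stand-in for the
reconnect-and-START_REPLICATION logic, not a real libpq call):

```c
/*
 * Retry on OOM a bounded number of times, then "let it crash" so an
 * external supervisor can restart the tool with a clean slate.
 */
#define MAX_OOM_RETRIES 10

int			retries = 0;

while (!try_start_streaming(conn))
{
	if (++retries > MAX_OOM_RETRIES)
	{
		fprintf(stderr, "giving up after %d attempts\n", MAX_OOM_RETRIES);
		exit(1);
	}
	pg_usleep(5 * 1000000L);	/* wait 5 seconds, as pg_receivexlog does */
}
```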

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] More stable query plans via more predictable column statistics

2016-03-09 Thread Alvaro Herrera
Tomas Vondra wrote:

> FWIW while looking at the code I noticed that we skip wide varlena
> values but not cstrings. Seems a bit suspicious.

Uh, can you actually have columns of cstring type?  I don't think you
can ...

-- 
Álvaro Herrerahttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Crash with old Windows on new CPU

2016-03-09 Thread Magnus Hagander
On Sat, Feb 13, 2016 at 4:45 PM, Christian Ullrich 
wrote:

> On February 13, 2016 4:10:34 PM Tom Lane  wrote:
>
> > Christian Ullrich  writes:
> >> * Robert Haas wrote:
> >>> Thanks for the report and patch.  Regrettably I haven't the Windows
> >>> knowledge to have any idea whether it's right or wrong, but hopefully
> >>> someone who knows Windows will jump in here.
> >
> >> In commitfest now.
> >
> > FWIW, I'm a tad suspicious of the notion that it's our job to make this
> > case work.  How practical is it really to run a Windows release on
> > unsupported-by-Microsoft hardware --- aren't dozens of other programs
> > going to have the same issue?
>
> Why would the hardware be unsupported? The problem occurs on new CPUs, not
> old ones, and even the OS (2008) is still in extended support until next
> year, IIRC.
>
> > I'm also suspicious of the "#if _MSC_VER == 1800" tests, that is,
> > the code compiles on *exactly one* MSVC version.
>
> The bug exists in only that compiler version's CRT, also, that is not the
> complete version number. There may be different builds somewhere, but they
> all start with 18.0.
>


IIRC, the CRT itself does explicit checks against _MSC_VER == 1800. As in,
they don't actually bump that number in different build numbers.


How does this work wrt mingw, though? Do we have the same problem there?
AIUI this code can never run on mingw, correct?


> > Lastly, I'd like to see some discussion of what side effects
> > "_set_FMA3_enable(0);" has ... I rather doubt that it's really
> > a magic-elixir-against-crashes-with-no-downsides.
>
> It tells the math library (in the CRT, no separate libm on Windows) not to
> use the AVX2-based implementations of log() and possibly other functions.
> AIUI, FMA means "fused multiply-add" and is apparently something that
> increases performance and accuracy in transcendental functions.
>
> I can check the CRT source later today and figure out exactly what it does.
>
> Also, if you look at the link I sent, you will find that a member of the
> Visual C++ Libraries team at MS is the source for the workaround. They
> probably know what they are doing, present circumstances excepted.
>

I notice the code checks IsWindows7SP1OrGreater() but the comment refers to
W7SP1 *or* 2008R2 SP1. I assume this is correct, or should there actually
be a separate check for server-windows?




> > That would
> > give us some context to estimate the risks of this code executing
> > when it's not really needed.
>
> Hence all the conditions. The problem is *certain* to occur under these
> specific conditions (x64 code on Windows before 7SP1 on a CPU with AVX2
> when built with VS2013), and under no others, and these conditions flip the
> switch exactly then.
>


Well, it doesn't flip it based on the specifics "running on a CPU with
AVX2". But presumably turning off AVX2 if the CPU doesn't support it is a
no-op.



> > Without that, I'd be worried that
> > this cure is worse than the disease because it breaks cases that
> > weren't broken before.
>
> Isn't that what the buildfarm is (among other things) for?
>

The buildfarm doesn't really have a big spread of Windows animals,
unfortunately.

-- 
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/


Re: [HACKERS] OOM in libpq and infinite loop with getCopyStart()

2016-03-09 Thread Aleksander Alekseev
Hello, Michael

Thanks a lot for steps to reproduce you provided.

I tested your patch on Ubuntu Linux 14.04 LTS (GCC 4.8.4) and FreeBSD
10.2 RELEASE (Clang 3.4.1). In both cases the patch applies cleanly, there
are no warnings during compilation, and all regression tests pass. A few
files are not properly pgindent'ed though:

```
diff --git a/src/interfaces/libpq/fe-exec.c b/src/interfaces/libpq/fe-exec.c
index c99f193..2769719 100644
--- a/src/interfaces/libpq/fe-exec.c
+++ b/src/interfaces/libpq/fe-exec.c
@@ -2031,7 +2031,7 @@ PQexecFinish(PGconn *conn)
 			conn->status == CONNECTION_BAD)
 			break;
 		else if ((conn->asyncStatus == PGASYNC_COPY_IN ||
-				  conn->asyncStatus == PGASYNC_COPY_OUT  ||
+				  conn->asyncStatus == PGASYNC_COPY_OUT ||
 				  conn->asyncStatus == PGASYNC_COPY_BOTH) &&
 				 result->resultStatus == PGRES_FATAL_ERROR)
 			break;
diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 21a1d9b..280ca16 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -49,9 +49,9 @@ static int	getParamDescriptions(PGconn *conn, int msgLength);
 static int	getAnotherTuple(PGconn *conn, int msgLength);
 static int	getParameterStatus(PGconn *conn);
 static int	getNotify(PGconn *conn);
-static int	getCopyStart(PGconn *conn,
-			 ExecStatusType copytype,
-			 int msgLength);
+static int	getCopyStart(PGconn *conn,
+						 ExecStatusType copytype,
+						 int msgLength);
 static int	getReadyForQuery(PGconn *conn);
 static void reportErrorPosition(PQExpBuffer msg, const char *query,
 					int loc, int encoding);
```

Indeed, your patch solves the issues you described. Still, here is
something that concerns me:

```
$ gdb --args pg_receivexlog --verbose -D ./temp_data/

(gdb) b getCopyStart
Function "getCopyStart" not defined.
Make breakpoint pending on future shared library load? (y or [n]) y
Breakpoint 1 (getCopyStart) pending.
(gdb) r
Starting program: /usr/local/pgsql/bin/pg_receivexlog --verbose -D ./temp_data/
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
pg_receivexlog: starting log streaming at 0/100 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610220, copytype=PGRES_COPY_BOTH,
    msgLength=3) at fe-protocol3.c:1398
1398		const char *errmsg = NULL;
(gdb) n
1400		result = PQmakeEmptyPGresult(conn, copytype);
(gdb)
1401		if (!result)
(gdb) p result = 0
$1 = (PGresult *) 0x0
(gdb) c
Continuing.
pg_receivexlog: could not send replication command "START_REPLICATION": out of memory
pg_receivexlog: disconnected; waiting 5 seconds to try again
pg_receivexlog: starting log streaming at 0/100 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
    msgLength=3) at fe-protocol3.c:1398
1398		const char *errmsg = NULL;
(gdb) n
1400		result = PQmakeEmptyPGresult(conn, copytype);
(gdb) n
1401		if (!result)
(gdb) p result = 0
$2 = (PGresult *) 0x0
(gdb) c
Continuing.
pg_receivexlog: could not send replication command "START_REPLICATION": out of memory
pg_receivexlog: disconnected; waiting 5 seconds to try again
pg_receivexlog: starting log streaming at 0/100 (timeline 1)

Breakpoint 1, getCopyStart (conn=0x610180, copytype=PGRES_COPY_BOTH,
    msgLength=3) at fe-protocol3.c:1398
1398		const char *errmsg = NULL;
```

Granted, this behaviour is a bit better than the current one. But
basically it's the same infinite loop, only with pauses and warnings. I
wonder whether this is the behaviour we really want. For instance,
wouldn't it be better just to terminate the application in the
out-of-memory case? "Let it crash", as Erlang programmers say.

Best regards,
Aleksander


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Alter or rename enum value

2016-03-09 Thread Matthias Kurz
Hi!

Right now it is not possible to rename an enum value.
Are there plans to implement this anytime soon?
I had a bit of a discussion on the IRC channel and it seems it shouldn't be
that hard to implement this.
Again, I am talking about renaming the values, not the enum itself.

Thanks!

Greetings,
Matthias


[HACKERS] Explain [Analyze] produces parallel scan for select Into table statements.

2016-03-09 Thread Mithun Cy
Hi All,

Explain [Analyze] of a Select Into statement produces a plan which uses
parallel scans.

*Test:*

create table table1 (n int);
insert into table1 values (generate_series(1,500));
analyze table1;

set parallel_tuple_cost=0;

set max_parallel_degree=3;

postgres=# explain select into table2 from table1;
                                   QUERY PLAN
---------------------------------------------------------------------------
 Gather  (cost=1000.00..39253.03 rows=500 width=0)
   Number of Workers: 3
   ->  Parallel Seq Scan on table1  (cost=0.00..38253.03 rows=1612903 width=0)
(3 rows)

*So Explain Analyze Fails.*

postgres=# explain analyze select into table2 from table1;
ERROR:  cannot insert tuples during a parallel operation
STATEMENT:  explain analyze select into table2 from table1;

*But actual execution is successful.*

postgres=# select into table2 from table1;
SELECT 500

The reason is that in ExplainOneQuery we unconditionally pass
CURSOR_OPT_PARALLEL_OK to pg_plan_query even if the query comes from
CreateTableAs / SelectInto, whereas in *ExecCreateTableAs* it is always 0.

*Possible Fix:*

I tried to make a patch to fix this. Now in ExplainOneQuery, if an INTO
clause is defined, parallel plans are disabled, matching what happens at
execution time; see the sketch below.
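
For illustration, the idea is roughly this (a minimal sketch of a fragment
in ExplainOneQuery(); the attached patch may differ):

```c
/*
 * Allow a parallel plan only when the query is not feeding CREATE TABLE
 * AS / SELECT INTO, mirroring what ExecCreateTableAs() does at
 * execution time ("into" is ExplainOneQuery's IntoClause parameter).
 */
int			cursorOptions = 0;

if (into == NULL)
	cursorOptions |= CURSOR_OPT_PARALLEL_OK;

plan = pg_plan_query(query, cursorOptions, params);
```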
-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


Analyze_select_into_disable_parallel_scan.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

