Re: [HACKERS] Re: ICU collation variant keywords and pg_collation entries (Was: [BUGS] Crash report for some ICU-52 (debian8) COLLATE and work_mem values)

2017-08-18 Thread Peter Eisentraut
On 8/17/17 23:13, Noah Misch wrote:
>> I haven't read anything since that has provided any more clarity about
>> what needs changing here.  I will entertain concrete proposals about the
>> specific points above (considering any other issues under discussion to
>> be PG11 material), but in the absence of that, I don't plan any work on
>> this right now.
> I think you're contending that, as formulated, this is not a valid v10 open
> item.  Are you?

Well, some people are not content with the current state of things, so
it is probably an open item.  I will propose patches on Monday to
hopefully close this.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] [PATCH] Pattern based listeners for asynchronous messaging (LISTEN/NOTIFY)

2017-08-18 Thread Thomas Munro
On Tue, Aug 1, 2017 at 8:13 AM, Markus Sintonen
 wrote:
> This patch adds an ability to use patterns in LISTEN commands. Patch uses
> 'SIMILAR TO' patterns for matching NOTIFY channel names
> (https://www.postgresql.org/docs/9.0/static/functions-matching.html#FUNCTIONS-SIMILARTO-REGEXP).
>
> This patch is related to old discussion in
> https://www.postgresql.org/message-id/52693fc5.7070...@gmail.com. This
> discussion contains the reasoning behind the pattern based matching of the
> channel names.

Nice idea.

The "async" regression test consistently crashes on my FreeBSD box
when built with -O2.  It doesn't crash on another system I tried, and
I think that's just luck, because this:

+   /* convert to regex pattern */
+   datum = DirectFunctionCall1(similar_escape,
+                               CStringGetTextDatum(pattern));

... is calling a function that takes two arguments, but passing only
one.  The second argument is random junk, so similar_escape bombs when
it does this:

esc_text = PG_GETARG_TEXT_PP(1);

(lldb) bt
* thread #1, name = 'postgres', stop reason = signal SIGSEGV
  * frame #0: postgres`pg_detoast_datum_packed(datum=0x)
at fmgr.c:1865
frame #1: postgres`similar_escape(fcinfo=0x7fffd080) at regexp.c:686
frame #2: postgres`DirectFunctionCall1Coll(func=(postgres`similar_escape
at regexp.c:659), collation=, arg1=) at
fmgr.c:717
frame #3: postgres`queue_listen(action=LISTEN_LISTEN,
pattern="notify_%", isSimilarToPattern=) at async.c:671
frame #4: postgres`standard_ProcessUtility(pstmt=0x000801824df0,
queryString="LISTEN SIMILAR TO 'notify_%';",
context=PROCESS_UTILITY_TOPLEVEL, params=0x,
queryEnv=0x, dest=0x000801824ee8,
completionTag=) at utility.c:633
frame #5: postgres`PortalRunUtility(portal=0x000801960040,
pstmt=0x000801824df0, isTopLevel='\x01',
setHoldSnapshot=, dest=, completionTag="")
at pquery.c:1178
frame #6: postgres`PortalRunMulti(portal=0x000801960040,
isTopLevel='\x01', setHoldSnapshot='\0', dest=0x000801824ee8,
altdest=0x000801824ee8, completionTag="") at pquery.c:0
frame #7: postgres`PortalRun(portal=0x000801960040,
count=, isTopLevel='\x01', run_once=,
dest=, altdest=,
completionTag=) at pquery.c:799
frame #8: postgres`exec_simple_query(query_string="LISTEN SIMILAR
TO 'notify_%';") at postgres.c:1099
frame #9: postgres`PostgresMain(argc=,
argv=, dbname=, username=) at
postgres.c:0
frame #10: postgres`PostmasterMain [inlined] BackendRun at postmaster.c:4357
frame #11: postgres`PostmasterMain [inlined] BackendStartup at
postmaster.c:4029
frame #12: postgres`PostmasterMain at postmaster.c:1753
frame #13: postgres`PostmasterMain(argc=,
argv=0x0008018d11c0) at postmaster.c:1361
frame #14: postgres`main(argc=, argv=)
at main.c:228
frame #15: 0x004823af postgres`_start + 383
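
A minimal sketch of a fix (untested, and assuming the intended escape is the
SIMILAR TO default backslash) would be to use the two-argument call:

    /*
     * Hypothetical fix: similar_escape() takes (pattern, escape), so supply
     * the escape argument explicitly instead of leaving it as uninitialized
     * junk.  "\\" mirrors the default SIMILAR TO escape character.
     */
    datum = DirectFunctionCall2(similar_escape,
                                CStringGetTextDatum(pattern),
                                CStringGetTextDatum("\\"));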

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-18 Thread David Steele
On 8/18/17 3:00 PM, Robert Haas wrote:
> 
> If you update the patch I'll apply it to 11 and 10.

Attached is the updated patch.

I didn't like the vague "there can be some issues on the server if it
crashes during the backup" so I added a new paragraph at the appropriate
step to give a more detailed explanation of the problem.

Thanks,
-- 
-David
da...@pgmasters.net
diff --git a/doc/src/sgml/backup.sgml b/doc/src/sgml/backup.sgml
index 0e7c6e2051..76f81975f1 100644
--- a/doc/src/sgml/backup.sgml
+++ b/doc/src/sgml/backup.sgml
@@ -889,8 +889,11 @@ SELECT pg_start_backup('label', false, false);
 
 SELECT * FROM pg_stop_backup(false, true);
 
- This terminates the backup mode and performs an automatic switch to
- the next WAL segment.  The reason for the switch is to arrange for
+ This terminates backup mode. On a primary, it also performs an automatic
+ switch to the next WAL segment.  On a standby, it is not possible to
+ automatically switch WAL segments, so you may wish to run
+ pg_switch_wal on the primary to perform a manual
+ switch. The reason for the switch is to arrange for
  the last WAL segment file written during the backup interval to be
  ready to archive.
 
@@ -908,9 +911,12 @@ SELECT * FROM pg_stop_backup(false, true);
  Once the WAL segment files active during the backup are archived, you are
  done.  The file identified by pg_stop_backup's first return
  value is the last segment that is required to form a complete set of
- backup files.  If archive_mode is enabled,
+ backup files.  On a primary, if archive_mode is enabled and 
the
+ wait_for_archive parameter is true,
  pg_stop_backup does not return until the last segment has
  been archived.
+ On a standby, archive_mode must be always in order
+ for pg_stop_backup to wait.
  Archiving of these files happens automatically since you have
  already configured archive_command. In most cases this
  happens quickly, but you are advised to monitor your archive
@@ -943,9 +949,9 @@ SELECT * FROM pg_stop_backup(false, true);
 Making an exclusive low level backup
 
  The process for an exclusive backup is mostly the same as for a
- non-exclusive one, but it differs in a few key steps. It does not allow
- more than one concurrent backup to run, and there can be some issues on
- the server if it crashes during the backup. Prior to PostgreSQL 9.6, this
+ non-exclusive one, but it differs in a few key steps. This type of backup
+ can only be taken on a primary and does not allow concurrent backups.
+ Prior to PostgreSQL 9.6, this
  was the only low-level method available, but it is now recommended that
  all users upgrade their scripts to use non-exclusive backups if possible.
 
@@ -1003,6 +1009,11 @@ SELECT pg_start_backup('label', true);
   for things to
  consider during this backup.
 
+
+  Note well that if the server crashes during the backup it may not be
+  possible to restart until the backup_label file has been
+  manually deleted from the PGDATA directory.
+


 
@@ -1012,15 +1023,10 @@ SELECT pg_start_backup('label', true);
 
 SELECT pg_stop_backup();
 
- This function, when called on a primary, terminates the backup mode and
+ This function terminates backup mode and
  performs an automatic switch to the next WAL segment. The reason for the
  switch is to arrange for the last WAL segment written during the backup
- interval to be ready to archive.  When called on a standby, this function
- only terminates backup mode.  A subsequent WAL segment switch will be
- needed in order to ensure that all WAL files needed to restore the backup
- can be archived; if the primary does not have sufficient write activity
- to trigger one, pg_switch_wal should be executed on
- the primary.
+ interval to be ready to archive.
 


diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index b43ec30a4e..28eda97273 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18606,7 +18606,8 @@ postgres=# select pg_start_backup('label_goes_here');

 

-The function also creates a backup history file in the write-ahead log
+When executed on a primary, the function also creates a backup history file
+in the write-ahead log
 archive area. The history file includes the label given to
 pg_start_backup, the starting and ending write-ahead log 
locations for
 the backup, and the starting and ending times of the backup.  The return



Re: [HACKERS] possible encoding issues with libxml2 functions

2017-08-18 Thread Pavel Stehule
Hi

2017-08-08 2:10 GMT+02:00 Noah Misch :

> On Wed, Apr 05, 2017 at 10:53:39PM +0200, Pavel Stehule wrote:
> > 2017-03-17 4:23 GMT+01:00 Noah Misch :
> > > On Sun, Mar 12, 2017 at 10:26:33PM +0100, Pavel Stehule wrote:
> > > > 2017-03-12 21:57 GMT+01:00 Noah Misch :
> > > > > On Sun, Mar 12, 2017 at 08:36:58PM +0100, Pavel Stehule wrote:
> > > > > > 2017-03-12 0:56 GMT+01:00 Noah Misch :
> > > > > Please add a test case.
> >
> > I am sending test case.
> >
> > I am afraid so this cannot be part of regress tests due strong dependency
> > on locale win1250.
>
> Right.  Based on that, I've distilled this portable test case:
>
> -- Round-trip non-ASCII data through xpath().
> DO $$
> DECLARE
> xml_declaration text := '<?xml version="1.0" encoding="ISO-8859-1"?>';
> degree_symbol text;
> res xml[];
> BEGIN
> -- Per the documentation, xpath() doesn't work on non-ASCII data when
> -- the server encoding is not UTF8.  The EXCEPTION block below,
> -- currently dead code, will be relevant if we remove this limitation.
> IF current_setting('server_encoding') <> 'UTF8' THEN
> RAISE LOG 'skip: encoding % unsupported for xml',
> current_setting('server_encoding');
> RETURN;
> END IF;
>
> degree_symbol := convert_from('\xc2b0', 'UTF8');
> res := xpath('text()', (xml_declaration ||
> '' || degree_symbol || '')::xml);
> IF degree_symbol <> res[1]::text THEN
> RAISE 'expected % (%), got % (%)',
> degree_symbol, convert_to(degree_symbol, 'UTF8'),
> res[1], convert_to(res[1]::text, 'UTF8');
> END IF;
> EXCEPTION
> -- character with byte sequence 0xc2 0xb0 in encoding "UTF8" has no equivalent in encoding "LATIN8"
> WHEN untranslatable_character THEN RAISE LOG 'skip: %', SQLERRM;
> -- default conversion function for encoding "UTF8" to "MULE_INTERNAL" does not exist
> WHEN undefined_function THEN RAISE LOG 'skip: %', SQLERRM;
> END
> $$
>

Yes, probably libxml2 tries to do a conversion from UTF8 to the encoding
specified in the header.


> > > > > Why not use xml_parse() instead of calling xmlCtxtReadMemory()
> > > > > directly?  The answer is probably in the archives, because someone
> > > > > understood the problem enough to document "Some XML-related
> functions
> > > > > may not work at all on non-ASCII data when the server encoding is
> not
> > > > > UTF-8. This is known to be an issue for xpath() in particular."
> > > >
> > > > Probably there are two possible issues
> > >
> > > Would you research in the archives to confirm?
> > >
> > > > 1. what I touched - recv function does encoding to database encoding
> -
> > > > but document encoding is not updated.
> > >
> > > Using xml_parse() would fix that, right?
> > >
> >
> > It should to help, because it cuts a header - but it does little bit more
> > work, than we would - it check if xml is doc or not too.
>
> That's true.  xpath() currently works on both DOCUMENT and CONTENT xml
> values.
> If xpath() used xml_parse(), this would stop working:
>
>   SELECT xpath('text()', XMLPARSE (DOCUMENT ' []>bar'));
>
> > One possible fix - and similar technique is used more times in xml.c code
> > .. xmlroot
>
> > +   /* remove header */
> > +   parse_xml_decl(string, &header_len, NULL, NULL, NULL);
> 
> > -   doc = xmlCtxtReadMemory(ctxt, (char *) string, len, NULL, NULL, 0);
> > +   doc = xmlCtxtReadMemory(ctxt, (char *) string + header_len,
> > +   len - header_len, NULL, NULL, 0);
> 
> > another possibility is using xml_out_internal - that is used in XMLTABLE
> > function - what was initial fix.
> >
> > xml_out_internal uses parse_xml_decl too, but it is little bit more
> > expensive due call print_xml_decl that has not any effect in this case
> > (where only encoding is interesting) - but it can has sense, when server
> > encoding is not UTF8, because in this case, xmlstr is not encoded to
> UTF8 -
> > so now, I am think the correct solution should be using xml_out_internal
> -
> > because it appends a header with server encoding to XML doc
>
> As you may have been implying here, your patch never adds an xml
> declaration
> that bears an encoding.  (That is because it passes targetencoding=0 to
> xml_out_internal().)  If we were going to fix xpath() to support non-ASCII
> characters in non-UTF8 databases, we would not do that by adding xml
> declarations.  We would do our own conversion to UTF8, like xml_parse()
> does.
> In that light, I like this parse_xml_decl() approach better.  Would you
> update
> your patch to use it and to add the test case I give above?
>
>
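
To make the quoted fragment concrete, here is an untested sketch of how
xpath_internal() could apply it (header_len is a local variable introduced
for illustration, and error handling is omitted):

    size_t      header_len = 0;

    /*
     * Sketch only: strip any XML declaration so libxml2 does not reinterpret
     * data that is already in the database encoding according to a stale
     * encoding attribute, then parse the remainder as before.
     */
    parse_xml_decl(string, &header_len, NULL, NULL, NULL);
    doc = xmlCtxtReadMemory(ctxt, (char *) string + header_len,
                            len - (int) header_len, NULL, NULL, 0);
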
I need to do some recapitulation (my analysis was wrong):

a) all values created by the xml_in interface are in the database encoding -
the input string is stored without any change. xml_parse is called only for
validation.

b) inside xml_parse, the 

Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Douglas Doole
>
> 1. The header comment for pass_down_bound() could mention "one or more
> levels of subqueries" rather than "a subquery".
>

Fixed

2. The first of the comments in the function body appears to have a
> whitespace issue that needs to be fixed manually or, better yet,
> addressed by pgindent.
>

Fixed


> 3. The formatting of the comment in the regression tests appears to be
> unlike any other comment in that same file.
>

A side effect of inheriting it from our branches ;-) Reworked.


> 4. I am pretty doubtful that "Memory: 25kB" is going to be stable
> enough for us to want that output memorialized in the regression ...
>

Fair enough. I wanted to be a bit more sophisticated in my check than
looking for a single value so I worked out something that distills the
explain down to the key elements.
*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***
*** 308,313  recompute_limits(LimitState *node)
--- 308,316 
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input.  We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept one or more levels of subqueries that have no quals or
+  * SRFs (that is, each subquery is just projecting columns) between the LIMIT
+  * and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***
*** 320,325  recompute_limits(LimitState *node)
--- 323,349 
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+ 	 * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState  *sortState = (SortState *) child_node;
*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***
*** 1041,1043  NOTICE:  x = 9, y = 13
--- 1041,1095 
  (3 rows)
  
  drop function tattle(x int, y int);
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ 	(1, 1, 1),
+ 	(2, 2, 2),
+ 	(3, 3, 3),
+ 	(4, 4, 4),
+ 	(5, 1, 1),
+ 	(6, 2, 2),
+ 	(7, 3, 3),
+ 	(8, 4, 4);
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits.  The goal here is that
+ -- we should see three notices, in order:
+ --   NOTICE: Limit
+ --   NOTICE: Subquery
+ --   NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+ 	declare x text;
+ 	begin
+ 		for x in
+ 			explain (analyze, summary off, timing off, costs off)
+ 			select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3
+ 		loop
+ 			if (left(ltrim(x), 5) = 'Limit') then
+ raise notice 'Limit';
+ 			end if;
+ 			if (left(ltrim(x), 12) = '->  Subquery') then
+ raise notice 'Subquery';
+ 			end if;
+ 			if (left(ltrim(x), 18) = 'Sort Method: top-N') then
+ raise notice 'Top-N Sort';
+ 			end if;
+ 		end loop;
+ 	end;
+ $$;
+ NOTICE:  Limit
+ NOTICE:  Subquery
+ NOTICE:  Top-N Sort
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ ----+----
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;
*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***
*** 540,542  select * from
--- 540,588 
where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ --
+ -- Test that LIMIT can be pushed to SORT through a subquery that just
+ -- projects columns
+ --
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ 	(1, 1, 1),
+ 	(2, 2, 2),
+ 	(3, 3, 3),
+ 	(4, 4, 4),
+ 	(5, 1, 1),
+ 	(6, 2, 2),
+ 	(7, 3, 3),
+ 	(8, 4, 4);
+ 
+ -- The explain contains data that may not be invariant, so
+ -- filter for just the interesting bits.  The goal here is that
+ -- we should see three notices, in order:
+ --   NOTICE: Limit
+ --   NOTICE: Subquery
+ --   NOTICE: Top-N Sort
+ -- A missing step, or steps out of order means we have a problem.
+ do $$
+ 	declare x text;
+ 	begin
+ 		

Re: [HACKERS] expanding inheritance in partition bound order

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:17 AM, Ashutosh Bapat
 wrote:
> 0004 patch in partition-wise join patchset has code to expand
> partition hierarchy. That patch is expanding inheritance hierarchy in
> depth first manner. Robert commented that instead of depth first
> manner, it will be better if we expand it in partitioned tables first
> manner. With the latest changes in your patch-set I don't see the
> reason for expanding in partitioned tables first order. Can you please
> elaborate if we still need to expand in partitioned table first
> manner? May be we should just address the expansion issue in 0004
> instead of dividing it in two patches.

Let me see if I can clarify.  I think there are three requirements here:

A. Amit wants to be able to prune leaf partitions before opening and
locking those relations, so that pruning can be done earlier and,
therefore, more cheaply.

B. Partition-wise join wants to expand the inheritance hierarchy a
level at a time instead of all at once, ending up with rte->inh = true
entries for intermediate partitioned tables.

C. Partition-wise join (and lots of other things; see numerous
mentions of EIBO in
http://rhaas.blogspot.com/2017/08/plans-for-partitioning-in-v11.html)
want to expand in bound order.

Obviously, bound-order and partitioned-tables-first are incompatible
orderings, but there's no actual conflict because the first one has to
do with the order of *expansion* and the second one has to do with the
order of *locking*.  So in the end game I think
expand_inherited_rtentry looks approximately like this:

1. Call find_all_inheritors with a new only-lock-the-partitions
flag.  This should result in locking all partitioned tables in the
inheritance hierarchy in breadth-first, low-OID-first order.  (When
the only-lock-the-partitions flag isn't specified, all partitioned tables
should still be locked before any unpartitioned tables, so that the
locking order in that case is consistent with what we do here.)

2. Iterate over the partitioned tables identified in step 1 in the
order in which they were returned.  For each one:
- Decide which children can be pruned.
- Lock the unpruned, non-partitioned children in low-OID-first order.

3. Make another pass over the inheritance hierarchy, starting at the
root.  Traverse the whole hierarchy in breadth-first in *bound* order.
Add RTEs and AppendRelInfos as we go -- these will have rte->inh =
true for partitioned tables and rte->inh = false for leaf partitions.
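
Put roughly in code, and with the caveat that the flag and helper names
below are invented purely for illustration, the shape might be:

    /* Step 1: lock partitioned tables only, breadth-first, low-OID-first
     * (find_all_inheritors would gain a partitioned-only flag for this). */
    parted_oids = find_all_inheritors(parent_oid, lockmode, NULL);

    /* Step 2: for each partitioned table, prune, then lock surviving leaves. */
    foreach(lc, parted_oids)
        prune_and_lock_leaf_children(lfirst_oid(lc), lockmode);    /* hypothetical */

    /* Step 3: walk the hierarchy again in bound order, adding RTEs and
     * AppendRelInfos (rte->inh = true for partitioned tables, false for leaves). */
    expand_hierarchy_in_bound_order(root, rte, rti);               /* hypothetical */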

Whether we should try to go straight to the end state here or do this
via a series of incremental changes, I'm not entirely sure right now.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] HISTIGNORE for psql

2017-08-18 Thread Vesa-Matti J Kari

Hello,

On Fri, 18 Aug 2017, Vesa-Matti J Kari wrote:

> A quick patch is attached. Not sure about the quality, hacked this
> together in about four hours, trying to figure out how to do it correctly
> the PostgreSQL way.

Sorry, I guess I have written too many Python scripts. I no longer
remember to free() memory in C, at least tokbuf should be free():d
before returning in parse_histignore().
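
For the record, a sketch of the cleanup pattern being described (the valid
flag and the exact allocation are illustrative; only tokbuf comes from the
patch):

    tokbuf = pg_strdup(val);        /* working copy to tokenize */
    /* ... split tokbuf on ':' and record each command prefix ... */
    if (!valid)
    {
        free(tokbuf);               /* don't leak on the error path */
        return false;
    }
    free(tokbuf);
    return true;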

Regards,
vmk
-- 

   Tietotekniikkakeskus / Helsingin yliopisto
 IT department / University of Helsinki





Re: [HACKERS] HISTIGNORE for psql

2017-08-18 Thread Vesa-Matti J Kari

Hello,

On Thu, 17 Aug 2017, Pavel Stehule wrote:

> 2017-08-17 9:23 GMT+02:00 Vesa-Matti J Kari :
>
>   Bash has HISTIGNORE feature that allows you to exclude certain commands
>   from the command history (e.g. shutdown, reboot, rm *).
>
>   Would it make any sense to add such a feature to psql (e.g. to ignore
>   DROP, DELETE commands)?
>
>
> It is not bad idea.

A quick patch is attached. Not sure about the quality, hacked this
together in about four hours, trying to figure out how to do it correctly
the PostgreSQL way.

Based on a few tests, the patch seems to work.

I do not know how the Bash implementation works, but I chose to disallow
forms such as:

:
:a
a:
a::b

So specifying empty strings seems like a syntax error to me. But I do
not know how to report failures for those, the current patch disallows
them and HISTIGNORE simply does not work with invalid syntax.

Regards,
vmk
-- 

   Tietotekniikkakeskus / Helsingin yliopisto
 IT department / University of Helsinki
diff --git a/src/bin/psql/help.c b/src/bin/psql/help.c
index b3dbb5946e..1a28a21f26 100644
*** a/src/bin/psql/help.c
--- b/src/bin/psql/help.c
***
*** 357,362  helpVariables(unsigned short int pager)
--- 357,363 
  " (default: 
0=unlimited)\n"));
fprintf(output, _("  HISTCONTROLcontrols command history 
[ignorespace, ignoredups, ignoreboth]\n"));
fprintf(output, _("  HISTFILE   file name used to store the 
command history\n"));
+   fprintf(output, _("  HISTIGNORE controls command history, 
ignores colon separated commands\n"));
fprintf(output, _("  HISTSIZE   max number of commands to store 
in the command history\n"));
fprintf(output, _("  HOST   the currently connected 
database server host\n"));
fprintf(output, _("  IGNOREEOF  number of EOFs needed to 
terminate an interactive session\n"));
diff --git a/src/bin/psql/input.c b/src/bin/psql/input.c
index 62f5f77383..ffaee397cf 100644
*** a/src/bin/psql/input.c
--- b/src/bin/psql/input.c
***
*** 147,152  pg_send_history(PQExpBuffer history_buf)
--- 147,162 
  
if (useHistory && s[0])
{
+   if (pset.histignore)
+   {
+   for (i = 0; i < histignore.nstr; i++) {
+   if (pg_strncasecmp(s, histignore.str[i], strlen(histignore.str[i])) == 0) {
+   resetPQExpBuffer(history_buf);
+   return;
+   }
+   }
+   }
+ 
if (((pset.histcontrol & hctl_ignorespace) &&
 s[0] == ' ') ||
((pset.histcontrol & hctl_ignoredups) &&
diff --git a/src/bin/psql/settings.h b/src/bin/psql/settings.h
index b78f151acd..9c436eeb88 100644
*** a/src/bin/psql/settings.h
--- b/src/bin/psql/settings.h
***
*** 77,82  enum trivalue
--- 77,89 
TRI_YES
  };
  
+ typedef struct HistIgnore
+ {
+   char **str;
+   int nstr;
+ 
+ } HistIgnore;
+ 
  typedef struct _psqlSettings
  {
PGconn *db; /* connection to backend */
***
*** 133,138  typedef struct _psqlSettings
--- 140,146 
PSQL_ERROR_ROLLBACK on_error_rollback;
PSQL_COMP_CASE comp_case;
HistControl histcontrol;
+   char*histignore;
const char *prompt1;
const char *prompt2;
const char *prompt3;
***
*** 141,146  typedef struct _psqlSettings
--- 149,155 
  } PsqlSettings;
  
  extern PsqlSettings pset;
+ extern HistIgnore histignore;
  
  
  #ifndef EXIT_SUCCESS
diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index 7f767976a5..b14ce2699e 100644
*** a/src/bin/psql/startup.c
--- b/src/bin/psql/startup.c
***
*** 31,36 
--- 31,37 
   * Global psql options
   */
  PsqlSettings pset;
+ HistIgnore histignore;
  
  #ifndef WIN32
  #define SYSPSQLRC "psqlrc"
***
*** 667,672  parse_psql_options(int argc, char *argv[], struct adhoc_opts *options)
--- 668,726 
  }
  
  
+ static bool
+ parse_histignore(const char *val)
+ {
+   const char *p;
+   char *tokbuf;
+   char *tok;
+   int i;
+   int len;
+   int ncolons;
+ 
+   len = strlen(val);
+ 
+   if (len == 0) {
+   histignore.str = NULL;
+   histignore.nstr = 0;
+   return true;
+   }
+ 
+   /* Validate syntax first. */
+   if (val[0] == ':' || val[len-1] == ':')
+   return false;
+ 
+   i = 0;
+   while (val[i] && val[i+1]) {
+   if (val[i] == ':' && val[i+1] == ':')
+   return false;
+ 

Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 2:58 PM, David Steele  wrote:
> OK, but I was trying to make it very clear that this backup method only
> works on a primary.  If you think the text in the first paragraph is
> enough then I'm willing to go with that, though.

Yeah, I think the text is enough.

> Since the exclusive method only works on a primary...

Oh, right.  Duh.

If you update the patch I'll apply it to 11 and 10.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-18 Thread David Steele
Robert,

Thanks for reviewing!

On 8/18/17 2:45 PM, Robert Haas wrote:
> - the next WAL segment.  The reason for the switch is to arrange for
> + the next WAL segment when run on a primary.  On a standby you can call
> + pg_switch_wal on the primary to perform a manual
> + switch.
> + The reason for the switch is to arrange for
> 
> Tacking on "when run on a primary" onto the end of the existing
> sentence is a little ambiguous: does that clause apply only to the
> last part, or to the whole sentence?  I suggest something like: This
> terminates the backup mode.  On a primary, it also performs an
> automatic switch to the next WAL segment.  On a standby, it is not
> possible to automatically switch WAL segments, so you may wish to
> consider running pg_switch_wal on the primary to
> perform a manual switch.

Looks good.

> 
> -Making an exclusive low level backup
> +Making an exclusive low level backup on a primary
> 
> I'd omit this hunk.

OK, but I was trying to make it very clear that this backup method only
works on a primary.  If you think the text in the first paragraph is
enough then I'm willing to go with that, though.

> - more than one concurrent backup to run, and there can be some issues on
> + more than one concurrent backup to run, must be run on a
> primary, and there
> + can be some issues on
> 
> Maybe this would be clearer: This type of backup can only be taken on
> a primary, does not allow more than one ...

Looks good.

> - This function, when called on a primary, terminates the backup mode and
> + This function terminates the backup mode and
>   performs an automatic switch to the next WAL segment. The reason for the
>   switch is to arrange for the last WAL segment written during the backup
> - interval to be ready to archive.  When called on a standby, this 
> function
> - only terminates backup mode.  A subsequent WAL segment switch will be
> - needed in order to ensure that all WAL files needed to restore the 
> backup
> - can be archived; if the primary does not have sufficient write activity
> - to trigger one, pg_switch_wal should be executed on
> - the primary.
> + interval to be ready to archive.
> 
> Why do you want to delete all that text?  It seems like good text to me.

Since the exclusive method only works on a primary...

-- 
-David
da...@pgmasters.net




Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 4:39 AM, Piotr Stefaniak
 wrote:
> On 2017-08-17 11:24, Simon Riggs wrote:
>> Your suggestion of "furthest" is already the default behaviour.
>>
>> Why are you using 'now'? Why would you want to pick a randomly
>> selected end time?
>
> The idea in both cases was to stop the recovery in order to prevent the
> standby from selecting new timeline. I want to be able to continue the
> recovery from future WAL files.  "Furthest" really meant "as far as
> possible without completing recovery".

Can you use recovery_target_action='shutdown' instead?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [Proposal] Allow users to specify multiple tables in VACUUM commands

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:56 AM, Michael Paquier
 wrote:
> +   /*
> +* We have already checked the column list in vacuum(...),
> +* but the columns may have disappeared since then.  If
> +* this happens, emit a nice warning message and skip the
> +* undefined column.
> +*/
> I think that this should be reworded. "nice" is cute in this context.
> Why not just saying something like:
> "Do not issue an ERROR if a column is missing but use a WARNING
> instead. At the beginning of the VACUUM run, the code already checked
> for undefined columns and informed about an ERROR, but we want the
> processing to move on for existing columns."

Hmm, I find your (Michael's) suggestion substantially less clear than
the wording to which you are objecting.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Update low-level backup documentation to match actual behavior

2017-08-18 Thread Robert Haas
- the next WAL segment.  The reason for the switch is to arrange for
+ the next WAL segment when run on a primary.  On a standby you can call
+ pg_switch_wal on the primary to perform a manual
+ switch.
+ The reason for the switch is to arrange for

Tacking on "when run on a primary" onto the end of the existing
sentence is a little ambiguous: does that clause apply only to the
last part, or to the whole sentence?  I suggest something like: This
terminates the backup mode.  On a primary, it also performs an
automatic switch to the next WAL segment.  On a standby, it is not
possible to automatically switch WAL segments, so you may wish to
consider running pg_switch_wal on the primary to
perform a manual switch.

- backup files.  If archive_mode is enabled,
+ backup files.  On a primary, if archive_mode is
enabled and the
+ wait_for_archive parameter is true,
  pg_stop_backup does not return until the last segment has
  been archived.
+ On a standby, archive_mode must be always in order
+ for pg_stop_backup to wait.

Looks good.

-Making an exclusive low level backup
+Making an exclusive low level backup on a primary

I'd omit this hunk.

- more than one concurrent backup to run, and there can be some issues on
+ more than one concurrent backup to run, must be run on a
primary, and there
+ can be some issues on

Maybe this would be clearer: This type of backup can only be taken on
a primary, does not allow more than one ...

- This function, when called on a primary, terminates the backup mode and
+ This function terminates the backup mode and
  performs an automatic switch to the next WAL segment. The reason for the
  switch is to arrange for the last WAL segment written during the backup
- interval to be ready to archive.  When called on a standby, this function
- only terminates backup mode.  A subsequent WAL segment switch will be
- needed in order to ensure that all WAL files needed to restore the backup
- can be archived; if the primary does not have sufficient write activity
- to trigger one, pg_switch_wal should be executed on
- the primary.
+ interval to be ready to archive.

Why do you want to delete all that text?  It seems like good text to me.

-The function also creates a backup history file in the write-ahead log
+When executed on a primary, the function also creates a backup history file
+in the write-ahead log

Looks good.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




[HACKERS] Segmentation Fault during pg_restore using '--use-list' and '--jobs'

2017-08-18 Thread Fabrízio de Royes Mello
Hi all,

I'm facing a 'segmentation fault' error using the '--use-list' and '--jobs'
options after updating to 9.5.8.

We generate a list ignoring some 'TABLE DATA' TOC entries for a selective restore.

See the test case below:

cat < /tmp/test_restore.dump.list

-- rebuild database
cat depCount == 0 &&
 			_tocEntryRestorePass(otherte) == AH->restorePass &&
-			otherte->par_prev != NULL)
+			otherte->par_prev != NULL &&
+			ready_list != NULL)
 		{
 			/* It must be in the pending list, so remove it ... */
 			par_list_remove(otherte);



Re: [HACKERS] Hash Functions

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:12 PM, amul sul  wrote:
> I have a small query: what if I want a cache entry with the extended hash
> function instead of the standard one? I might require that while adding the
> hash_array_extended function. Do you think we need to extend
> lookup_type_cache() as well?

Hmm, I thought you had changed the hash partitioning stuff so that it
didn't rely on lookup_type_cache().  You have to look up the function
using the opclass provided in the partition key definition;
lookup_type_cache() will give you the default one for the datatype.
Maybe just use get_opfamily_proc?
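
For illustration, something along these lines (variable names are
placeholders, and HASHEXTENDED_PROC stands for whatever support-procedure
number the patch introduces for the extended function):

    Oid     opfamily = get_opclass_family(partopclass);
    Oid     hashfn = get_opfamily_proc(opfamily, part_typid, part_typid,
                                       HASHEXTENDED_PROC);

    if (!OidIsValid(hashfn))
        elog(ERROR, "missing extended hash function for type %u", part_typid);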

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Hash Functions

2017-08-18 Thread amul sul
On Fri, Aug 18, 2017 at 8:49 AM, Robert Haas  wrote:

> On Wed, Aug 16, 2017 at 5:34 PM, Robert Haas 
> wrote:
> > Attached is a quick sketch of how this could perhaps be done (ignoring
> > for the moment the relatively-boring opclass pushups).
>
> Here it is with some relatively-boring opclass pushups added.  I just
> did the int4 bit; the same thing will need to be done, uh, 35 more
> times.  But you get the gist.  No, not that kind of gist.
>
> ​
I will work on this.

I have a small query: what if I want a cache entry with the extended hash
function instead of the standard one? I might require that while adding the
hash_array_extended function. Do you think we need to extend
lookup_type_cache() as well?

Regards,
Amul


Re: [HACKERS] Add support for tuple routing to foreign partitions

2017-08-18 Thread Robert Haas
On Thu, Aug 17, 2017 at 4:27 AM, Etsuro Fujita
 wrote:
> I think that would be much more efficient than INSERTing tuples into the
> remote server one by one.  What do you think about that?

I agree, but I wonder if we ought to make it work first using the
existing APIs and then add these new APIs as an optimization.
postgres_fdw isn't the only FDW in the world, and it's better if
getting a working implementation doesn't require too many interfaces.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Add support for tuple routing to foreign partitions

2017-08-18 Thread Robert Haas
On Thu, Aug 17, 2017 at 7:58 AM, Etsuro Fujita
 wrote:
>> The description seems to support only COPY to a foreign table from a
>> file, but probably we need the support other way round as well. This
>> looks like a feature (support copy to and from a foreign table) to be
>> handled by itself.
>
> Agreed.  I'll consider how to handle copy-from-a-foreign-table as well.

That's a completely different feature which has nothing to do with
tuple routing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Stats for triggers on partitioned tables not shown in EXPLAIN ANALYZE

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 1:15 AM, Noah Misch  wrote:
> The above-described topic is currently a PostgreSQL 10 open item.  Robert,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.
>
> [1] 
> https://www.postgresql.org/message-id/20170404140717.GA2675809%40tornado.leadboat.com

Committed and back-patched the proposed patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 11:42 AM, Douglas Doole  wrote:
> Thanks for the feedback on my original patch Robert. Here's an updated patch
> that will tunnel through multiple SubqueryScanStates.

Seems reasonable.  I have some assorted nitpicks.

1. The header comment for pass_down_bound() could mention "one or more
levels of subqueries" rather than "a subquery".

2. The first of the comments in the function body appears to have a
whitespace issue that needs to be fixed manually or, better yet,
addressed by pgindent.

3. The formatting of the comment in the regression tests appears to be
unlike any other comment in that same file.

4. I am pretty doubtful that "Memory: 25kB" is going to be stable
enough for us to want that output memorialized in the regression
tests. It seems like it might vary on different platforms - e.g.
32-bit vs. 64-bit - and it also seems like minor changes to how we do
sorting could perturb it and, perhaps, make it unstable even if today
it isn't.  So I think it would be good to give this a bit more thought
and see if you can come up with a way to test this without running
afoul of that problem.  Maybe adapt from this:

do $$declare x text; begin execute 'explain select 1' into x; if x !~
'^Result' then raise notice '%', x; else raise notice 'looks ok'; end
if; end;$$;

BTW, regarding the other patch on this thread, it struck me that maybe
it would be better to just reduce/limit the fetch count for the cursor
instead of trying to inject LIMIT n into the query itself.  That's not
as good for query optimization purposes but it's a lot more
future-proof.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] why not parallel seq scan for slow functions

2017-08-18 Thread Dilip Kumar
On Fri, 18 Aug 2017 at 4:48 PM, Amit Kapila  wrote:

> On Thu, Aug 17, 2017 at 2:45 PM, Dilip Kumar 
> wrote:
> > On Thu, Aug 17, 2017 at 2:09 PM, Dilip Kumar 
> wrote:
> >>
> >> Either we can pass "num_gene" to merge_clump or we can store num_gene
> >> in the root. And inside merge_clump we can check. Do you see some more
> >> complexity?
> >>
>
> I think something like that should work.

Ok

>
>
> > After putting some more thought I see one more problem but not sure
> > whether we can solve it easily. Now, if we skip generating the gather
> > path at top level node then our cost comparison while adding the
> > element to pool will not be correct as we are skipping some of the
> > paths (gather path).  And, it's very much possible that the path1 is
> > cheaper than path2 without adding gather on top of it but with gather,
> > path2 can be cheaper.
> >
>
> I think that should not matter because the costing of gather is mainly
> based on a number of rows and that should be same for both path1 and
> path2 in this case.


Yeah, I think you are right.

>
>
>
> --
> With Regards,
> Amit Kapila.
> EnterpriseDB: http://www.enterprisedb.com
>
-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-08-18 Thread Robert Haas
On Fri, Aug 18, 2017 at 2:23 AM, Mithun Cy  wrote:
> 1. Hang in apw_detach_shmem.
> +/*
> + * Clear our PID from autoprewarm shared state.
> + */
> +static void
> +apw_detach_shmem(int code, Datum arg)
> +{
> +   LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
> +   if (apw_state->pid_using_dumpfile == MyProcPid)
> +   apw_state->pid_using_dumpfile = InvalidPid;
> +   if (apw_state->bgworker_pid == MyProcPid)
> +   apw_state->bgworker_pid = InvalidPid;
> +   LWLockRelease(&apw_state->lock);
> +}
>
> The reason is that we might already be under the apw_state->lock when
> we error out and jump to apw_detach_shmem. So we should not be trying
> to take the lock again. For example, in autoprewarm_dump_now(),
> apw_dump_now() will error out under the lock if bgworker is already
> using dump file.

Ah, good catch.  While I agree that there is probably no great harm
from skipping the lock here, I think it would be better to just avoid
throwing an error while we hold the lock.  I think apw_dump_now() is
the only place where that could happen, and in the attached version,
I've fixed it so it doesn't do that any more.  Independent of the
correctness issue, I think the code is easier to read this way.

I also realize that it's not formally sufficient to use
PG_TRY()/PG_CATCH() here, because a FATAL would leave us in a bad
state.  Changed to PG_ENSURE_ERROR_CLEANUP().
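
For reference, the general shape of that pattern (a sketch, not necessarily
the exact code in the attached patch):

    PG_ENSURE_ERROR_CLEANUP(apw_detach_shmem, (Datum) 0);
    {
        /* ... claim pid_using_dumpfile, write and rename the dump file ... */
    }
    PG_END_ENSURE_ERROR_CLEANUP(apw_detach_shmem, (Datum) 0);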

> 2) I also found one issue which was my own mistake in my previous patch 19.
> In "apw_dump_now" I missed calling FreeFile() on first write error,
> whereas on othercases I am already calling the same.
> ret = fprintf(file, "<<" INT64_FORMAT ">>\n", num_blocks);
> + if (ret < 0)
> + {
> + int save_errno = errno;
> +
> + unlink(transient_dump_file_path);

Changed in the attached version.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


autoprewarm-rmh-v2.patch
Description: Binary data



Re: [HACKERS] [PATCH] Push limit to sort through a subquery

2017-08-18 Thread Douglas Doole
Thanks for the feedback on my original patch Robert. Here's an updated
patch that will tunnel through multiple SubqueryScanStates.

- Doug
Salesforce

On Thu, Aug 17, 2017 at 6:33 PM Robert Haas  wrote:

> On Thu, Aug 17, 2017 at 11:36 AM, Douglas Doole 
> wrote:
>
>> I completely agree. The further a limit can be pushed down, the better.
>>
>> The patch looks good to me.
>>
>
> It seems like a somewhat ad-hoc approach; it supposes that we can take any
> query produced by deparseSelectStmtForRel() and stick a LIMIT clause onto
> the very end and all will be well.  Maybe that's not a problematic
> assumption, not sure.  The grammar happens to allow both FOR UPDATE LIMIT n
> and LIMIT n FOR UPDATE even though only the latter syntax is documented.
>
> Regarding the other patch on this thread, you mentioned upthread that "If
> it is possible to get more than one SubqueryScanState and/or ResultState
> between the limit and sort, then the first block of code could be placed in
> a while loop."  I think that's not possible for a ResultState, but I think
> it *is* possible for a SubqueryScanState.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>
*** a/src/backend/executor/nodeLimit.c
--- b/src/backend/executor/nodeLimit.c
***
*** 308,313  recompute_limits(LimitState *node)
--- 308,315 
   * since the MergeAppend surely need read no more than that many tuples from
   * any one input.  We also have to be prepared to look through a Result,
   * since the planner might stick one atop MergeAppend for projection purposes.
+  * We can also accept a subquery that has no quals or SRFs (that is, the
+  * subquery is just projecting columns) between the LIMIT and any of the above.
   *
   * This is a bit of a kluge, but we don't have any more-abstract way of
   * communicating between the two nodes; and it doesn't seem worth trying
***
*** 320,325  recompute_limits(LimitState *node)
--- 322,348 
  static void
  pass_down_bound(LimitState *node, PlanState *child_node)
  {
+ 	/*
+ 	 * If the child is a subquery that does no filtering (no predicates)
+ 	 * and does not have any SRFs in the target list then we can potentially
+ 	 * push the limit through the subquery. It is possible that we could have
+  * multiple subqueries, so tunnel through them all.
+ 	 */
+ 	while (IsA(child_node, SubqueryScanState))
+ 	{
+ 		SubqueryScanState *subqueryScanState = (SubqueryScanState *) child_node;
+ 
+ 		/*
+ 		 * Non-empty predicates or an SRF means we cannot push down the limit.
+ 		 */
+ 		if (subqueryScanState->ss.ps.qual != NULL ||
+ 			expression_returns_set((Node *) child_node->plan->targetlist))
+ 			return;
+ 
+ 		/* Use the child in the following checks */
+ 		child_node = subqueryScanState->subplan;
+ 	}
+ 
  	if (IsA(child_node, SortState))
  	{
  		SortState  *sortState = (SortState *) child_node;
*** a/src/test/regress/expected/subselect.out
--- b/src/test/regress/expected/subselect.out
***
*** 1041,1043  NOTICE:  x = 9, y = 13
--- 1041,1077 
  (3 rows)
  
  drop function tattle(x int, y int);
+ -------------------------------------------------
+ --TEST LIMIT pushdown through subquery scan node
+ -------------------------------------------------
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ (1, 1, 1),
+ (2, 2, 2),
+ (3, 3, 3),
+ (4, 4, 4),
+ (5, 1, 1),
+ (6, 2, 2),
+ (7, 3, 3),
+ (8, 4, 4);
+ explain (analyze, summary off, timing off, costs off)
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+                            QUERY PLAN                            
+------------------------------------------------------------------
+  Limit (actual rows=3 loops=1)
+    ->  Subquery Scan on x (actual rows=3 loops=1)
+          ->  Sort (actual rows=3 loops=1)
+                Sort Key: sq_limit.c1, sq_limit.pk
+                Sort Method: top-N heapsort  Memory: 25kB
+                ->  Seq Scan on sq_limit (actual rows=8 loops=1)
+ (6 rows)
+ 
+ select * from (select pk,c2 from sq_limit order by c1,pk) as x limit 3;
+  pk | c2 
+ ----+----
+   1 |  1
+   5 |  1
+   2 |  2
+ (3 rows)
+ 
+ drop table sq_limit;
*** a/src/test/regress/sql/subselect.sql
--- b/src/test/regress/sql/subselect.sql
***
*** 540,542  select * from
--- 540,563 
where tattle(x, u);
  
  drop function tattle(x int, y int);
+ 
+ -------------------------------------------------
+ --TEST LIMIT pushdown through subquery scan node
+ -------------------------------------------------
+ create table sq_limit (pk int primary key, c1 int, c2 int);
+ insert into sq_limit values
+ (1, 1, 1),
+ (2, 2, 2),
+ (3, 3, 3),
+ (4, 4, 4),
+ (5, 1, 1),
+ (6, 2, 2),
+ (7, 3, 3),
+ (8, 4, 4);
+ 
+ explain (analyze, summary off, timing off, costs off)
+ select * from 

Re: [HACKERS] Quorum commit for multiple synchronous replication.

2017-08-18 Thread Robert Haas
On Thu, Aug 17, 2017 at 1:13 AM, Michael Paquier
 wrote:
> I had in mind a ereport(WARNING) in create_syncrep_config. Extra
> thoughts/opinions welcome.

I think for v10 we should just document the behavior we've got; I
think it's too late to be whacking things around now.

For v11, we could emit a warning if we plan to deprecate and
eventually remove the syntax without ANY/FIRST, but let's not do:

WARNING:  what you did is ok, but you might have wanted to do something else

First of all, whether or not that can properly be called a warning is
highly debatable.  Also, if you do that sort of thing to your spouse
and/or children, they call it "nagging".  I don't think users will
like it any more than family members do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Proposal: global index

2017-08-18 Thread Alvaro Herrera
Erikjan Rijkers wrote:
> On 2017-08-18 11:12, Ildar Musin wrote:
> > Hi hackers,
> > 
> > While we've been developing pg_pathman extension one of the most
> > frequent questions we got from our users was about global index
> > support. We cannot provide it within an extension. And I couldn't find
> > any recent discussion about someone implementing it. So I'm thinking
> > about giving it a shot and start working on a patch for postgres.
> 
> Sorry to be dense but what exactly is a "Global Index"?

A global index covers all partitions of a partitioned table.  It allows
you to have unique indexes across the partitioned table.

The main disadvantage of global indexes is that you need some kind of
cleanup after you drop a partition.  Either make partition drop wait
until all the index pointers are removed, or you need some kind of
after-commit cleanup process that removes them afterwards (which
requires some assurance that they are really all gone).  You can't let
them linger forever, or you risk a new partition that reuses the same
OID causing the whole index to become automatically corrupt.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Proposal: global index

2017-08-18 Thread Alvaro Herrera
Ildar Musin wrote:

> While we've been developing pg_pathman extension one of the most frequent
> questions we got from our users was about global index support. We cannot
> provide it within an extension. And I couldn't find any recent discussion
> about someone implementing it. So I'm thinking about giving it a shot and
> start working on a patch for postgres.
> 
> One possible solution is to create an extended version of item pointer which
> would store relation oid along with block number and position:

I've been playing with the index code in order to allow indirect tuples,
which are stored in a format different from IndexTupleData.

I've been adding an "InMemoryIndexTuple" (maybe there's a better name)
which internally has pointers to both IndexTupleData and
IndirectIndexTupleData, which makes it easier to pass around the index
tuple in either format.  It's very easy to add an OID to that struct,
which then allows to include the OID in either an indirect index tuple
or a regular one.

Then, wherever we're using IndexTupleData in the index AM code, we would
replace it with InMemoryIndexTuple.  This should satisfy both your use
case and mine.
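
A rough illustration of what such a struct could look like (field names are
invented here, and IndirectIndexTupleData comes from the indirect-index work
rather than core):

typedef struct InMemoryIndexTuple
{
    Oid         ip_relid;       /* owning heap relation, for a global index */
    IndexTuple  itup;           /* tuple in the regular on-disk format, or NULL */
    IndirectIndexTuple iitup;   /* tuple in the indirect format, or NULL */
} InMemoryIndexTuple;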

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Proposal: global index

2017-08-18 Thread Erikjan Rijkers

On 2017-08-18 11:12, Ildar Musin wrote:

Hi hackers,

While we've been developing pg_pathman extension one of the most
frequent questions we got from our users was about global index
support. We cannot provide it within an extension. And I couldn't find
any recent discussion about someone implementing it. So I'm thinking
about giving it a shot and start working on a patch for postgres.


Sorry to be dense but what exactly is a "Global Index"?

You mention pg_pathman; is a global index related to just partitions? Or 
is it a more generally applicable concept?


Could you (or someone) perhaps expand a little?

thanks,

Erik Rijkers





Re: [HACKERS] Proposal: global index

2017-08-18 Thread Ildar Musin

Hi Chris,

On 18.08.2017 16:15, Chris Travers wrote:

I would really like to see global indexes.  It would make things a lot
easier for things like unique constraints across table inheritance trees.

On Fri, Aug 18, 2017 at 11:12 AM, Ildar Musin > wrote:

Hi hackers,

While we've been developing pg_pathman extension one of the most
frequent questions we got from our users was about global index
support. We cannot provide it within an extension. And I couldn't
find any recent discussion about someone implementing it. So I'm
thinking about giving it a shot and start working on a patch for
postgres.

One possible solution is to create an extended version of item
pointer which would store relation oid along with block number and
position:

struct ItemPointerExt
{
    Oid             ip_relid;
    BlockIdData     ip_blkid;
    OffsetNumber    ip_posid;
};

and use it in global index (regular index will still use old
version). This will require in-depth refactoring of existing index
nodes to make them support both versions. Accordingly, we could
replace ItemPointer with ItemPointerExt in index AM to make unified
API to access both regular and global indexes. TIDBitmap will
require refactoring as well to be able to deal with relation oids.


So, to be clear on-disk representations would be unchanged for old
indexes (ensuring that pg_upgrade would not be broken), right?


Yes, the idea is to keep old indexes untouched so that there would be no
need for any further conversion. Global indexes, in turn, will have the
extended TID format.





It seems to be quite an invasive patch since it requires changes in
general index routines, existing index nodes, catalog, vacuum
routines and syntax. So I'm planning to implement it step by step.
As a first prototype it could be:

* refactoring of btree index to be able to store both regular and
extended item pointers;


Do you foresee any performance implementation of handling both?


It's hard to say until there is some working prototype. I think there 
can be (and most likely will be) some overhead due to the unified API (and 
hence additional conversion operations). It will require benchmarking to 
say how bad it is.



--
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com

Saarbrücker Straße 37a, 10405 Berlin



--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add support for tuple routing to foreign partitions

2017-08-18 Thread David Fetter
On Fri, Aug 18, 2017 at 05:10:29PM +0900, Etsuro Fujita wrote:
> On 2017/08/17 23:48, David Fetter wrote:
> >On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote:
> >>On 2017/07/11 6:56, Robert Haas wrote:
> >>>On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita
> >>> wrote:
> So, I dropped the COPY part.
> >>>
> >>>Ouch.  I think we should try to figure out how the COPY part will be
> >>>handled before we commit to a design.
> >>
> >>I spent some time on this.  To handle that, I'd like to propose doing
> >>something similar to \copy (frontend copy): submit a COPY query "COPY ...
> >>FROM STDIN" to the remote server and route data from a file to the remote
> >>server.  For that, I'd like to add new FDW APIs called during CopyFrom that
> >>allow us to copy to foreign tables:
> >>
> >>* BeginForeignCopyIn: this would be called after creating a ResultRelInfo
> >>for the target table (or each leaf partition of the target partitioned
> >>table) if it's a foreign table, and perform any initialization needed before
> >>the remote copy can start.  In the postgres_fdw case, I think this function
> >>would be a good place to send "COPY ... FROM STDIN" to the remote server.
> >>
> >>* ExecForeignCopyInOneRow: this would be called instead of heap_insert if
> >>the target is a foreign table, and route the tuple read from the file by
> >>NextCopyFrom to the remote server.  In the postgres_fdw case, I think this
> >>function would convert the tuple to text format for portability, and then
> >>send the data to the remote server using PQputCopyData.
> >>
> >>* EndForeignCopyIn: this would be called at the bottom of CopyFrom, and
> >>release resources such as connections to the remote server.  In the
> >>postgres_fdw case, this function would do PQputCopyEnd to terminate data
> >>transfer.
> >
> >These primitives look good.  I know it seems unlikely at first
> >blush, but do we know of bulk load APIs for non-PostgreSQL data
> >stores that this would be unable to serve?
> 
> Maybe I'm missing something, but I think these would allow the FDW
> to do the remote copy the way it would like; in
> ExecForeignCopyInOneRow, for example, the FDW could buffer tuples in
> memory and transmit the buffered data to the remote server
> if the data size exceeds a threshold.  Perhaps the naming is not so good?
> Suggestions are welcome.

The naming seems reasonable.

I was trying to figure out whether this fits well enough with the bulk
load APIs for databases other than PostgreSQL.  I'm guessing it's
"well enough" based on checking MySQL, Oracle, and MS SQL Server.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal: global index

2017-08-18 Thread Chris Travers
I would really like to see global indexes.  It would make things a lot
easier for things like unique constraints across table inheritance trees.

On Fri, Aug 18, 2017 at 11:12 AM, Ildar Musin 
wrote:

> Hi hackers,
>
> While we've been developing pg_pathman extension one of the most frequent
> questions we got from our users was about global index support. We cannot
> provide it within an extension. And I couldn't find any recent discussion
> about someone implementing it. So I'm thinking about giving it a shot and
> start working on a patch for postgres.
>
> One possible solution is to create an extended version of item pointer
> which would store relation oid along with block number and position:
>
> struct ItemPointerExt
> {
> Oid ip_relid;
> BlockIdData ip_blkid;
> OffsetNumber ip_posid;
> };
>
> and use it in global index (regular index will still use old version).
> This will require in-depth refactoring of existing index nodes to make them
> support both versions. Accordingly, we could replace ItemPointer with
> ItemPointerExt in index AM to make unified API to access both regular and
> global indexes. TIDBitmap will require refactoring as well to be able to
> deal with relation oids.
>

So, to be clear, on-disk representations would be unchanged for old indexes
(ensuring that pg_upgrade would not be broken), right?

>
> It seems to be quite an invasive patch since it requires changes in
> general index routines, existing index nodes, catalog, vacuum routines and
> syntax. So I'm planning to implement it step by step. As a first prototype
> it could be:
>
> * refactoring of btree index to be able to store both regular and extended
> item pointers;


Do you foresee any performance implications of handling both?


>

* refactoring of TIDBitmap;
> * refactoring of general index routines (index_insert, index_getnext, etc)
> and indexAM api;
> * catalog (add pg_index.indisglobal attribute and/or a specific relkind as
> discussed in [1] thread);
> * syntax for global index definition. E.g., it could be oracle-like syntax:
>
> CREATE INDEX my_idx ON my_tbl (key) GLOBAL;
>
> If it goes well, then I’ll do the rest of indexes and vacuuming. If you
> have any ideas or concerns I’ll be glad to hear it.
>
> [1] https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e
> 3-e936a35f0cfd%40postgrespro.ru
>
> Thanks!
>
> --
> Ildar Musin
> i.mu...@postgrespro.ru
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


[HACKERS] [PROPOSAL] Text search configuration extension

2017-08-18 Thread Aleksandr Parfenov
Hello hackers!

I'm working on a new approach in text search configuration and want to
share my thought with community in order to get some feedback and maybe
some new ideas.

Nowadays we can't configure the text search engine in Postgres for some
useful scenarios such as multi-language search, or exact and
morphological search in one configuration. Additionally, we can't use a
dictionary as a filter dictionary if that wasn't taken into
consideration during the dictionary's development. I also propose to
split the result-set-building configuration from the command-selection
configuration. Last but not least, the goal is to keep backward
compatibility in terms of syntax and behavior for currently available
scenarios.

In order to meet mentioned goals I propose following syntax for text
search configurations (current syntax could be used as well):

ALTER TEXT SEARCH CONFIGURATION <configuration> ADD/ALTER MAPPING FOR
<token types> WITH
CASE
   WHEN <condition> THEN <command>
   <...>
   [ELSE <command>]
END;

A <condition> is an expression with dictionary names used as operands
and the boolean operators AND, OR and NOT. Additionally, after a dictionary
name there can be options for a result check: IS [NOT] NULL or IS [NOT]
STOP. If there are no check-options for a dictionary, it will be
evaluated as:
   dict IS NOT NULL and dict IS NOT STOP

A <command> is an expression on sets of lexemes supporting the
operators UNION, EXCEPT, INTERSECT and MAP BY. The MAP BY operator is a
way to configure filter dictionaries: the output of the righthand
subexpression is used as the input of the lefthand subexpression. In other
words, the MAP BY operator is used instead of the TSL_FILTER flagged output.

An example of configuration for both English and German search:

ALTER TEXT SEARCH CONFIGURATION en_de_search ADD MAPPING FOR asciiword,
word WITH
CASE
   WHEN english_hunspell IS NOT NULL THEN english_hunspell
   WHEN german_hunspell IS NOT NULL THEN german_hunspell
   ELSE
 -- stem dictionaries can't be used for language detection
 english_stem UNION german_stem
END;

And example with unaccent:

ALTER TEXT SEARCH CONFIGURATION german_unaccent ADD MAPPING FOR
asciiword, word WITH
CASE
   WHEN german_hunspell IS NOT NULL THEN german_hunspell MAP BY unaccent
   ELSE
 german_stem MAP BY unaccent
END;

In the last example the input for german_hunspell is replaced by the output
of unaccent if it is not NULL. If a dictionary returns more than one
lexeme, each lexeme is processed independently.

I'm not sure whether we should provide the ability to use the MAP BY
operator in a condition, since MAP BY operates on sets and a condition is
a boolean expression. I think we could allow this with the restriction
that it must be placed inside parentheses with check-options. Something like:

(german_hunspell MAP BY unaccent) IS NOT NULL

This type of check can be useful in some situations, but we
should isolate the set-related subexpression.

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Vacuum: allow usage of more than 1GB of work mem

2017-08-18 Thread Claudio Freire
On Fri, Apr 7, 2017 at 10:51 PM, Claudio Freire  wrote:
> Indeed they do, and that's what motivated this patch. But I'd need
> TB-sized tables to set up something like that. I don't have the
> hardware or time available to do that (vacuum on bloated TB-sized
> tables can take days in my experience). Scale 4000 is as big as I can
> get without running out of space for the tests in my test hardware.
>
> If anybody else has the ability, I'd be thankful if they did test it
> under those conditions, but I cannot. I think Anastasia's test is
> closer to such a test, that's probably why it shows a bigger
> improvement in total elapsed time.
>
> Our production database could possibly be used, but it can take about
> a week to clone it, upgrade it (it's 9.5 currently), and run the
> relevant vacuum.

It looks like I won't be able to do that test with a production
snapshot anytime soon.

Getting approval for the budget required to do that looks like it's
going to take far longer than I thought.

Regardless of that, I think the patch can move forward. I'm still
planning to do the test at some point, but this patch shouldn't block
on it.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Parallel Hash take II

2017-08-18 Thread Thomas Munro
On Wed, Aug 2, 2017 at 10:06 PM, Thomas Munro
 wrote:
> On Tue, Aug 1, 2017 at 1:11 PM, Andres Freund  wrote:
>> WRT the main patch:
>
> Thanks for the review.  I will respond soon, but for now I just wanted
> to post a rebased version (no changes) because v16 no longer applies.

Rebased with better commit messages.  Sorry for the changed patch
names, I switched to using git-format properly...  (I'll be posting a
new version with some bigger changes to the 0010 patch and some
answers to good questions you've asked soon.)

-- 
Thomas Munro
http://www.enterprisedb.com


parallel-hash-v18.patchset.tgz
Description: GNU Zip compressed data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] why not parallel seq scan for slow functions

2017-08-18 Thread Amit Kapila
On Thu, Aug 17, 2017 at 2:45 PM, Dilip Kumar  wrote:
> On Thu, Aug 17, 2017 at 2:09 PM, Dilip Kumar  wrote:
>>
>> Either we can pass "num_gene" to merge_clump or we can store num_gene
>> in the root. And inside merge_clump we can check. Do you see some more
>> complexity?
>>

I think something like that should work.

> After putting some more thought I see one more problem but not sure
> whether we can solve it easily. Now, if we skip generating the gather
> path at top level node then our cost comparison while adding the
> element to pool will not be correct as we are skipping some of the
> paths (gather path).  And, it's very much possible that the path1 is
> cheaper than path2 without adding gather on top of it but with gather,
> path2 can be cheaper.
>

I think that should not matter because the costing of gather is mainly
based on the number of rows, and that should be the same for both path1 and
path2 in this case.


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [BUGS] [postgresql 10 beta3] unrecognized node type: 90

2017-08-18 Thread Amit Kapila
On Fri, Aug 18, 2017 at 3:50 AM, Tom Lane  wrote:
> I wrote:
>> Ah-hah, I see my dromedary box is one of the ones failing, so I'll
>> have a look there.  I can't reproduce it on my other machines.
>
> OK, so this is a whole lot more broken than I thought :-(.
>
> Bear in mind that the plan for this (omitting uninteresting detail) is
>
>  Nested Loop Left Join
>->  Values Scan on "*VALUES*"
>->  Finalize GroupAggregate
>  ->  Gather Merge
>->  Partial GroupAggregate
>  ->  Sort
>->  Parallel Seq Scan on tenk1
>
> What seems to be happening is that:
>
> 1. On the first pass, the parallel seqscan work gets doled out to several
> workers, plus the leader, as expected.
>
> 2. When the nestloop rescans its right input, ExecReScanGatherMerge
> supposes that this is good enough to handle rescanning its subnodes:
>
> ExecReScan(node->ps.lefttree);
>
> Leaving aside the question of why that doesn't look like nearly every
> other child rescan call, what happens is that that invokes ExecReScanAgg,
> which does the more usual thing:
>
> if (outerPlan->chgParam == NULL)
> ExecReScan(outerPlan);
>
> and so that invokes ExecReScanSort, and then behold what ExecReScanSort
> thinks it can optimize away:
>
>  * If subnode is to be rescanned then we forget previous sort results; we
>  * have to re-read the subplan and re-sort.  Also must re-sort if the
>  * bounded-sort parameters changed or we didn't select randomAccess.
>  *
>  * Otherwise we can just rewind and rescan the sorted output.
>
> So we never get to ExecReScanSeqScan, and thus not to heap_rescan,
> with the effect that parallel_scan->phs_nallocated never gets reset.
>
> 3. On the next pass, we fire up all the workers as expected, but they all
> perceive phs_nallocated >= rs_nblocks and conclude they have nothing to
> do.  Meanwhile, in the leader, nodeSort just re-emits the sorted data it
> had last time.  Net effect is that the GatherMerge node returns only the
> fraction of the data that was scanned by the leader in the first pass.
>
> 4. The fact that the test succeeds on many machines implies that the
> leader process is usually doing *all* of the work.  This is in itself not
> very good.  Even on the machines where it fails, the fact that the tuple
> counts are usually a pretty large fraction of the expected values
> indicates that the leader usually did most of the work.  We need to take
> a closer look at why that is.
>
> However, the bottom line here is that parallel scan is completely broken
> for rescans, and it's not (solely) the fault of nodeGatherMerge; rather,
> the issue is that nobody bothered to wire up parallelism to the rescan
> parameterization mechanism.
>

I think we don't generate parallel plans for parameterized paths, so I
am not sure whether any work is required in that area.

>  I imagine that related bugs can be
> demonstrated in 9.6 with little effort.
>
> I think that the correct fix probably involves marking each parallel scan
> plan node as dependent on a pseudo executor parameter, which the parent
> Gather or GatherMerge node would flag as being changed on each rescan.
> This would cue the plan layers in between that they cannot optimize on the
> assumption that the leader's instance of the parallel scan will produce
> exactly the same rows as it did last time, even when "nothing else
> changed".  The "wtParam" pseudo parameter that's used for communication
> between RecursiveUnion and its descendant WorkTableScan node is a good
> model for what needs to happen.
>

Yeah, that seems like a good idea.  I think another way could be to
*not* optimize rescanning when we are in parallel mode
(IsInParallelMode()); that might be more restrictive than what
you are suggesting, but it would be somewhat simpler.
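
For illustration only, the guard might end up looking roughly like the
sketch below (a condensed ExecReScanSort() with the extra
IsInParallelMode() test added; this is just a sketch of the idea, not a
proposed patch):

void
ExecReScanSort(SortState *node)
{
	PlanState  *outerPlan = outerPlanState(node);

	/* if we haven't sorted yet, there is nothing to forget */
	if (!node->sort_Done)
		return;

	/* must drop pointer to sort result tuple */
	ExecClearTuple(node->ss.ps.ps_ResultTupleSlot);

	/*
	 * Re-read the subplan and re-sort if its parameters changed, the
	 * bounded-sort parameters changed, we didn't select randomAccess --
	 * or, per the suggestion above, whenever we are in parallel mode,
	 * since the previous output of a parallel scan below us cannot be
	 * assumed to repeat.
	 */
	if (outerPlan->chgParam != NULL ||
		IsInParallelMode() ||
		node->bounded != node->bounded_Done ||
		!node->randomAccess)
	{
		node->sort_Done = false;
		tuplesort_end((Tuplesortstate *) node->tuplesortstate);
		node->tuplesortstate = NULL;

		/* if chgParam is set, the subplan is rescanned by ExecProcNode */
		if (outerPlan->chgParam == NULL)
			ExecReScan(outerPlan);
	}
	else
		tuplesort_rescan((Tuplesortstate *) node->tuplesortstate);
}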

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Create replication slot in pg_basebackup if requested and not yet present

2017-08-18 Thread Michael Banck
Hi,

On Tue, Aug 15, 2017 at 02:14:58PM -0700, Jeff Janes wrote:
> On Sun, Apr 9, 2017 at 4:22 AM, Michael Banck 
> wrote:
> > Rebased, squashed and slighly edited version attached. I've added this
> > to the 2017-07 commitfest now as well:
> >
> > https://commitfest.postgresql.org/14/1112/
> 
> Can you rebase this past some conflicting changes?

Thanks for letting me know, PFA a rebased version.

I also addressed the following message which appeared when starting the
TAP tests:

't/010_pg_basebackup.pl ... "my" variable $lsn masks earlier declaration
in same scope at t/010_pg_basebackup.pl line 287.'


Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
From 62973bba3338abfbf4d9e2f0f05a338eb283b4b8 Mon Sep 17 00:00:00 2001
From: Michael Banck 
Date: Fri, 18 Aug 2017 11:28:26 +0200
Subject: [PATCH] Add option to create a replication slot in pg_basebackup if
 not yet present.

When requesting a particular replication slot, the new pg_basebackup option
--create-slot tries to create it before starting to replicate from it in case
it does not exist already.

In order to allow reporting of temporary replication slot creation in --verbose
mode, further refactor the slot creation logic. Add new argument `is_temporary'
to CreateReplicationSlot() in streamutil.c which specifies whether a physical
replication slot is temporary or not. Update all callers. Move the creation of
the temporary replication slot for pg_basebackup from receivelog.c to
pg_basebackup.c and mention it in --verbose mode. At the same time, also create
the temporary slot via CreateReplicationSlot() instead of creating the
temporary slot with an explicit SQL command.
---
 doc/src/sgml/ref/pg_basebackup.sgml  | 12 ++
 src/bin/pg_basebackup/pg_basebackup.c| 58 ++--
 src/bin/pg_basebackup/pg_receivewal.c|  2 +-
 src/bin/pg_basebackup/pg_recvlogical.c   |  2 +-
 src/bin/pg_basebackup/receivelog.c   | 18 -
 src/bin/pg_basebackup/streamutil.c   | 18 ++---
 src/bin/pg_basebackup/streamutil.h   |  2 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 25 ++--
 8 files changed, 103 insertions(+), 34 deletions(-)

diff --git a/doc/src/sgml/ref/pg_basebackup.sgml b/doc/src/sgml/ref/pg_basebackup.sgml
index 2454d35af3..5397a185d2 100644
--- a/doc/src/sgml/ref/pg_basebackup.sgml
+++ b/doc/src/sgml/ref/pg_basebackup.sgml
@@ -269,6 +269,18 @@ PostgreSQL documentation
  
 
  
+  -C
+  --create-slot
+  
+   
+This option can only be used together with the --slot 
+option.  It causes the specified slot name to be created if it does not
+exist already.  
+   
+  
+ 
+
+ 
   -T olddir=newdir
   --tablespace-mapping=olddir=newdir
   
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index dfb9b5ddcb..2bc7e868e0 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -92,6 +92,8 @@ static pg_time_t last_progress_report = 0;
 static int32 maxrate = 0;		/* no limit by default */
 static char *replication_slot = NULL;
 static bool temp_replication_slot = true;
+static bool create_slot = false;
+static bool no_slot = false;
 
 static bool success = false;
 static bool made_new_pgdata = false;
@@ -337,6 +339,7 @@ usage(void)
 			 " write recovery.conf for replication\n"));
 	printf(_("  -S, --slot=SLOTNAMEreplication slot to use\n"));
 	printf(_("  --no-slot  prevent creation of temporary replication slot\n"));
+	printf(_("  -C, --create-slot  create replication slot if not present already\n"));
 	printf(_("  -T, --tablespace-mapping=OLDDIR=NEWDIR\n"
 			 " relocate tablespace in OLDDIR to NEWDIR\n"));
 	printf(_("  -X, --wal-method=none|fetch|stream\n"
@@ -492,8 +495,6 @@ LogStreamerMain(logstreamer_param *param)
 	stream.partial_suffix = NULL;
 	stream.replication_slot = replication_slot;
 	stream.temp_slot = param->temp_slot;
-	if (stream.temp_slot && !stream.replication_slot)
-		stream.replication_slot = psprintf("pg_basebackup_%d", (int) PQbackendPID(param->bgconn));
 
 	if (format == 'p')
 		stream.walmethod = CreateWalDirectoryMethod(param->xlog, 0, do_sync);
@@ -586,6 +587,29 @@ StartLogStreamer(char *startpos, uint32 timeline, char *sysidentifier)
 	else
 		param->temp_slot = temp_replication_slot;
 
+	/*
+	 * Create replication slot if one is needed.
+	 */
+	if (!no_slot && (temp_replication_slot || create_slot))
+	{
+		if (!replication_slot)
+			replication_slot = psprintf("pg_basebackup_%d", (int) getpid());
+
+		if (verbose)
+		{
+			

[HACKERS] Proposal: global index

2017-08-18 Thread Ildar Musin

Hi hackers,

While we've been developing the pg_pathman extension, one of the most 
frequent questions we got from our users was about global index support. 
We cannot provide it within an extension. And I couldn't find any recent 
discussion about someone implementing it. So I'm thinking about giving 
it a shot and starting to work on a patch for postgres.


One possible solution is to create an extended version of item pointer 
which would store relation oid along with block number and position:


struct ItemPointerExt
{
Oid ip_relid;
BlockIdData ip_blkid;
OffsetNumber ip_posid;
};

and use it in global index (regular index will still use old version). 
This will require in-depth refactoring of existing index nodes to make 
them support both versions. Accordingly, we could replace ItemPointer 
with ItemPointerExt in index AM to make unified API to access both 
regular and global indexes. TIDBitmap will require refactoring as well 
to be able to deal with relation oids.
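
Purely as an illustration of how the extended TID relates to the existing
one, a tiny helper along these lines could translate between the two (only
the struct itself comes from the proposal; the helper and its name are
made-up stand-ins):

#include "postgres.h"
#include "storage/itemptr.h"

typedef struct ItemPointerExt
{
	Oid			ip_relid;		/* heap relation containing the tuple */
	BlockIdData ip_blkid;		/* block within that relation */
	OffsetNumber ip_posid;		/* line pointer offset within the block */
} ItemPointerExt;

/* Split an extended TID into the relation OID plus a regular TID. */
static inline Oid
ItemPointerExtGetHeapTid(const ItemPointerExt *etid, ItemPointer tid)
{
	ItemPointerSet(tid,
				   BlockIdGetBlockNumber(&etid->ip_blkid),
				   etid->ip_posid);
	return etid->ip_relid;
}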


It seems to be quite an invasive patch since it requires changes in 
general index routines, existing index nodes, catalog, vacuum routines 
and syntax. So I'm planning to implement it step by step. As a first 
prototype it could be:


* refactoring of btree index to be able to store both regular and 
extended item pointers;

* refactoring of TIDBitmap;
* refactoring of general index routines (index_insert, index_getnext, 
etc) and indexAM api;
* catalog (add pg_index.indisglobal attribute and/or a specific relkind 
as discussed in [1] thread);

* syntax for global index definition. E.g., it could be oracle-like syntax:

CREATE INDEX my_idx ON my_tbl (key) GLOBAL;

If it goes well, then I’ll do the rest of the indexes and vacuuming. If you 
have any ideas or concerns, I’ll be glad to hear them.


[1] 
https://www.postgresql.org/message-id/c8fe4f6b-ff46-aae0-89e3-e936a35f0cfd%40postgrespro.ru


Thanks!

--
Ildar Musin
i.mu...@postgrespro.ru


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-18 Thread Piotr Stefaniak
On 2017-08-17 11:24, Simon Riggs wrote:
> Your suggestion of "furthest" is already the default behaviour.
> 
> Why are you using 'now'? Why would you want to pick a randomly
> selected end time?

The idea in both cases was to stop the recovery in order to prevent the
standby from selecting a new timeline. I want to be able to continue the
recovery from future WAL files.  "Furthest" really meant "as far as
possible without completing recovery".

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] ECPG: WHENEVER statement with DO CONTINUE action

2017-08-18 Thread vinayak


On 2017/06/20 17:35, vinayak wrote:

Hi Sawada-san,

On 2017/06/20 17:22, Masahiko Sawada wrote:

On Tue, Jun 20, 2017 at 1:51 PM, vinayak
 wrote:


On 2017/06/12 13:09, vinayak wrote:

Hi,

On 2017/06/10 12:23, Vinayak Pokale wrote:

Thank you for your reply

On Jun 9, 2017 5:39 PM, "Michael Meskes"  wrote:

Could you please add a "DO CONTINUE" case to one of the test cases? Or
add a new one? We would need a test case IMO.


Yes I will add test case and send updated patch.

I have added new test case for DO CONTINUE.
Please check the attached patch.

I have added this in Sept. CF
https://commitfest.postgresql.org/14/1173/


--
In whenever_do_continue.pgc file, the following line seems not to be
processed successfully by ecpg but should we fix that?

+
+   exec sql whenever sqlerror continue;
+

Also, you wrote the test case using "WHENEVER sqlerror DO CONTINUE"
action but that seems not to emit sqlerror, so "DO CONTINUE" is not
executed. I think the test case for DO CONTINUE should be a C code
that executes the "continue" clause.

Thank you for testing the patch.
I agreed with your comments. I will update the patch.

Please check the attached updated patch.
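
For illustration, a minimal .pgc fragment in which the generated
"continue;" is actually reached might look like the sketch below (just a
sketch, not the attached test case; the connection and table names are
made up):

#include <stdio.h>

int
main(void)
{
	EXEC SQL BEGIN DECLARE SECTION;
	int			i;
	EXEC SQL END DECLARE SECTION;

	EXEC SQL CONNECT TO testdb;
	EXEC SQL SET AUTOCOMMIT TO ON;
	EXEC SQL CREATE TABLE t (c int CHECK (c < 3));

	/* On a SQL error, skip to the next loop iteration. */
	EXEC SQL WHENEVER SQLERROR DO CONTINUE;

	for (i = 0; i < 5; i++)
	{
		EXEC SQL INSERT INTO t VALUES (:i);	/* violates the CHECK for i >= 3 */
		printf("inserted %d\n", i);		/* skipped when the insert fails */
	}

	/* Restore the default action before leaving the loop's scope. */
	EXEC SQL WHENEVER SQLERROR CONTINUE;
	EXEC SQL DISCONNECT ALL;
	return 0;
}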

Regards,
Vinayak Pokale
NTT Open Source Software Center
From cd71bf7229a8566cadfde3d0e89b1b445baf1fee Mon Sep 17 00:00:00 2001
From: Vinayak Pokale 
Date: Thu, 22 Jun 2017 11:08:38 +0900
Subject: [PATCH] WHENEVER statement DO CONTINUE support

---
 doc/src/sgml/ecpg.sgml |   12 ++
 src/interfaces/ecpg/preproc/ecpg.trailer   |6 +
 src/interfaces/ecpg/preproc/output.c   |3 +
 src/interfaces/ecpg/test/ecpg_schedule |1 +
 .../test/expected/preproc-whenever_do_continue.c   |  159 
 .../expected/preproc-whenever_do_continue.stderr   |  112 ++
 .../expected/preproc-whenever_do_continue.stdout   |2 +
 src/interfaces/ecpg/test/preproc/Makefile  |1 +
 .../ecpg/test/preproc/whenever_do_continue.pgc |   61 
 9 files changed, 357 insertions(+), 0 deletions(-)
 create mode 100644 src/interfaces/ecpg/test/expected/preproc-whenever_do_continue.c
 create mode 100644 src/interfaces/ecpg/test/expected/preproc-whenever_do_continue.stderr
 create mode 100644 src/interfaces/ecpg/test/expected/preproc-whenever_do_continue.stdout
 create mode 100644 src/interfaces/ecpg/test/preproc/whenever_do_continue.pgc

diff --git a/doc/src/sgml/ecpg.sgml b/doc/src/sgml/ecpg.sgml
index f13a0e9..3cb4001 100644
--- a/doc/src/sgml/ecpg.sgml
+++ b/doc/src/sgml/ecpg.sgml
@@ -4763,6 +4763,17 @@ EXEC SQL WHENEVER condition action
 
  
+  DO CONTINUE
+  
+   
+Execute the C statement continue.  This should
+only be used in loop statements.  If executed, it will cause the flow
+of control to return to the top of the loop.
+   
+  
+ 
+
+ 
   CALL name (args)
   DO name (args)
   
@@ -7799,6 +7810,7 @@ WHENEVER { NOT FOUND | SQLERROR | SQLWARNING } ac
 
 EXEC SQL WHENEVER NOT FOUND CONTINUE;
 EXEC SQL WHENEVER NOT FOUND DO BREAK;
+EXEC SQL WHENEVER NOT FOUND DO CONTINUE;
 EXEC SQL WHENEVER SQLWARNING SQLPRINT;
 EXEC SQL WHENEVER SQLWARNING DO warn();
 EXEC SQL WHENEVER SQLERROR sqlprint;
diff --git a/src/interfaces/ecpg/preproc/ecpg.trailer b/src/interfaces/ecpg/preproc/ecpg.trailer
index 1c10879..b42bca4 100644
--- a/src/interfaces/ecpg/preproc/ecpg.trailer
+++ b/src/interfaces/ecpg/preproc/ecpg.trailer
@@ -1454,6 +1454,12 @@ action : CONTINUE_P
 			$$.command = NULL;
 			$$.str = mm_strdup("break");
 		}
+		| DO CONTINUE_P
+		{
+			$$.code = W_CONTINUE;
+			$$.command = NULL;
+			$$.str = mm_strdup("continue");
+		}
 		| SQL_CALL name '(' c_args ')'
 		{
 			$$.code = W_DO;
diff --git a/src/interfaces/ecpg/preproc/output.c b/src/interfaces/ecpg/preproc/output.c
index 59d5d30..14d7066 100644
--- a/src/interfaces/ecpg/preproc/output.c
+++ b/src/interfaces/ecpg/preproc/output.c
@@ -51,6 +51,9 @@ print_action(struct when * w)
 		case W_BREAK:
 			fprintf(base_yyout, "break;");
 			break;
+		case W_CONTINUE:
+			fprintf(base_yyout, "continue;");
+			break;
 		default:
 			fprintf(base_yyout, "{/* %d not implemented yet */}", w->code);
 			break;
diff --git a/src/interfaces/ecpg/test/ecpg_schedule b/src/interfaces/ecpg/test/ecpg_schedule
index c3ec125..cff4eeb 100644
--- a/src/interfaces/ecpg/test/ecpg_schedule
+++ b/src/interfaces/ecpg/test/ecpg_schedule
@@ -28,6 +28,7 @@ test: preproc/type
 test: preproc/variable
 test: preproc/outofscope
 test: preproc/whenever
+test: preproc/whenever_do_continue
 test: sql/array
 test: sql/binary
 test: sql/code100
diff --git a/src/interfaces/ecpg/test/expected/preproc-whenever_do_continue.c b/src/interfaces/ecpg/test/expected/preproc-whenever_do_continue.c
new file mode 100644
index 000..22d98ca
--- /dev/null
+++ 

Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-18 Thread Simon Riggs
On 18 August 2017 at 07:30, Michael Paquier  wrote:
> On Thu, Aug 17, 2017 at 6:24 PM, Simon Riggs  wrote:
>> On 15 August 2017 at 15:37, Piotr Stefaniak  
>> wrote:
>>
>>> One thing I tried was a combination of recovery_target_action =
>>> 'shutdown' and recovery_target_time = 'now'. The result is surprising
>>
>> Indeed, special timestamp values were never considered in the design,
>> so I'm not surprised they don't work and have never been tested.
>
> We could always use a TRY/CATCH block and add an error in
> GetCurrentDateTime and GetCurrentTimeUsec if they are called out of a
> transaction context. Rather-safe-than-sorry.
>
>> Your suggestion of "furthest" is already the default behaviour.
>>
>> Why are you using 'now'? Why would you want to pick a randomly
>> selected end time?
>
> "now" is not much interesting, targets in the past are more, like
> 'yesterday'. This could create back an instance back to the beginning
> of the previous day, simplifying scripts creating recovery.conf a bit,
> even if that's not much simplification as we are talking about
> creating a timestamp string.

I can't see any value in allowing imprecise and effectively random timestamps.

ISTM if we care, it would be better to simply exclude the 7 named
timestamps prior to their being sent, as in the attached patch.
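
A rough sketch of the kind of check meant here (not necessarily what the
attached patch does; the function name, its placement in the recovery.conf
parsing code, and the error wording are all made up for illustration):

static const char *const special_time_values[] = {
	"epoch", "infinity", "-infinity", "now", "today", "tomorrow", "yesterday"
};

/* Reject the special timestamp inputs for recovery_target_time. */
static void
reject_special_recovery_target_time(const char *value)
{
	int			i;

	for (i = 0; i < lengthof(special_time_values); i++)
	{
		if (pg_strcasecmp(value, special_time_values[i]) == 0)
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
					 errmsg("recovery_target_time cannot be set to \"%s\"",
							value)));
	}
}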

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


exclude_special_values_in_recovery_target_time.v1.patch
Description: Binary data

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   tested, passed
Spec compliant:   tested, passed
Documentation:tested, passed

Oops, missed those checkboxes. Sorry for the noise. Here's the correct checklist: 
installcheck-world passes, documented as expected. The feature is there.

Best regards, Andrey Borodin.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Add support for tuple routing to foreign partitions

2017-08-18 Thread Etsuro Fujita

On 2017/08/17 23:48, David Fetter wrote:

On Thu, Aug 17, 2017 at 05:27:05PM +0900, Etsuro Fujita wrote:

On 2017/07/11 6:56, Robert Haas wrote:

On Thu, Jun 29, 2017 at 6:20 AM, Etsuro Fujita
 wrote:

So, I dropped the COPY part.


Ouch.  I think we should try to figure out how the COPY part will be
handled before we commit to a design.


I spent some time on this.  To handle that, I'd like to propose doing
something similar to \copy (frontend copy): submit a COPY query "COPY ...
FROM STDIN" to the remote server and route data from a file to the remote
server.  For that, I'd like to add new FDW APIs called during CopyFrom that
allow us to copy to foreign tables:

* BeginForeignCopyIn: this would be called after creating a ResultRelInfo
for the target table (or each leaf partition of the target partitioned
table) if it's a foreign table, and perform any initialization needed before
the remote copy can start.  In the postgres_fdw case, I think this function
would be a good place to send "COPY ... FROM STDIN" to the remote server.

* ExecForeignCopyInOneRow: this would be called instead of heap_insert if
the target is a foreign table, and route the tuple read from the file by
NextCopyFrom to the remote server.  In the postgres_fdw case, I think this
function would convert the tuple to text format for portability, and then
send the data to the remote server using PQputCopyData.

* EndForeignCopyIn: this would be called at the bottom of CopyFrom, and
release resources such as connections to the remote server.  In the
postgres_fdw case, this function would do PQputCopyEnd to terminate data
transfer.


These primitives look good.  I know it seems unlikely at first blush,
but do we know of bulk load APIs for non-PostgreSQL data stores that
this would be unable to serve?


Maybe I'm missing something, but I think these would allow the FDW to do 
the remote copy the way it would like; in ExecForeignCopyInOneRow, for 
example, the FDW could buffer tuples in memory and transmit the 
buffered data to the remote server if the data size exceeds a threshold. 
 Perhaps the naming is not so good?  Suggestions are welcome.
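
Just to make the buffering idea concrete, a rough postgres_fdw-side sketch
could look like the following (ExecForeignCopyInOneRow is the proposed
callback discussed above, not an existing API; PgFdwCopyState,
pgfdw_copy_send_row and the threshold are only illustrative names, not an
actual implementation):

#include "postgres.h"
#include "lib/stringinfo.h"
#include "libpq-fe.h"

#define FDW_COPY_FLUSH_THRESHOLD	(64 * 1024)

typedef struct PgFdwCopyState
{
	PGconn	   *conn;			/* connection to the remote server */
	StringInfoData buf;			/* pending COPY text-format rows */
} PgFdwCopyState;

/* Called once per routed tuple, e.g. from ExecForeignCopyInOneRow. */
static void
pgfdw_copy_send_row(PgFdwCopyState *cstate, const char *row_text)
{
	/* accumulate one text-format COPY row, newline-terminated */
	appendStringInfoString(&cstate->buf, row_text);
	appendStringInfoChar(&cstate->buf, '\n');

	/* flush to the remote server once enough data has been buffered */
	if (cstate->buf.len >= FDW_COPY_FLUSH_THRESHOLD)
	{
		if (PQputCopyData(cstate->conn, cstate->buf.data, cstate->buf.len) <= 0)
			ereport(ERROR,
					(errmsg("could not send COPY data to remote server: %s",
							PQerrorMessage(cstate->conn))));
		resetStringInfo(&cstate->buf);
	}
}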


Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PROPOSAL] Use SnapshotAny in get_actual_variable_range

2017-08-18 Thread Andrey Borodin
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

Hi! I've looked into the patch.

There is not much code. The code itself is pretty clear and well 
documented. I've found just one small typo, "transactionn", in the 
HeapTupleSatisfiesNonVacuumable() comments.

The chosen snapshot type is not a solution to the author's problem at hand, 
though the implemented SnapshotNonVacuumable is a more rational choice than 
the currently used SnapshotDirty for range sketching. As far as I can see, 
that is what get_actual_variable_range() is used for, despite the word 
"actual" in its name. The only place where get_actual_variable_range() is 
used for the actual range is preprocessed out in get_variable_range():
/*
 * XXX It's very tempting to try to use the actual column min and max, if
 * we can get them relatively-cheaply with an index probe.  However, since
 * this function is called many times during join planning, that could
 * have unpleasant effects on planning speed.  Need more investigation
 * before enabling this.
 */
#ifdef NOT_USED
if (get_actual_variable_range(root, vardata, sortop, min, max))
return true;
#endif

I think it would be good if we could have something like "give me the actual 
values, but only if they are on the first index page", like conditional locks. 
But I think this patch is a step in the right direction.

Best regards, Andrey Borodin.
-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] recovery_target_time = 'now' is not an error but still impractical setting

2017-08-18 Thread Michael Paquier
On Thu, Aug 17, 2017 at 6:24 PM, Simon Riggs  wrote:
> On 15 August 2017 at 15:37, Piotr Stefaniak  
> wrote:
>
>> One thing I tried was a combination of recovery_target_action =
>> 'shutdown' and recovery_target_time = 'now'. The result is surprising
>
> Indeed, special timestamp values were never considered in the design,
> so I'm not surprised they don't work and have never been tested.

We could always use a TRY/CATCH block and add an error in
GetCurrentDateTime and GetCurrentTimeUsec if they are called out of a
transaction context. Rather-safe-than-sorry.

> Your suggestion of "furthest" is already the default behaviour.
>
> Why are you using 'now'? Why would you want to pick a randomly
> selected end time?

"now" is not much interesting, targets in the past are more, like
'yesterday'. This could create back an instance back to the beginning
of the previous day, simplifying scripts creating recovery.conf a bit,
even if that's not much simplification as we are talking about
creating a timestamp string.

> recovery_target_time = 'allballs' sounds fun for recovering corrupted 
> databases

This one would not work :)
=# select timestamptz_in('allballs', 0, 0);
ERROR:  22007: invalid input syntax for type timestamp with time zone:
"allballs"
LOCATION:  DateTimeParseError, datetime.c:3800
Still that would be fun.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal : For Auto-Prewarm.

2017-08-18 Thread Mithun Cy
On Wed, Aug 16, 2017 at 2:08 AM, Robert Haas  wrote:
> On Fri, Jul 14, 2017 at 8:17 AM, Mithun Cy  wrote:
>> [ new patch ]
> It's quite possible that in making all of these changes I've
> introduced some bugs, so I think this needs some testing and review.
> It's also possible that some of the changes that I made are actually
> not improvements and should be reverted, but it's always hard to tell
> that about your own code.  Anyway, please see the attached version.

Sorry, Robert, I was on vacation so could not pick this up immediately. I
have been testing and reviewing the patch and I found the following
issues.

1. Hang in apw_detach_shmem.
+/*
+ * Clear our PID from autoprewarm shared state.
+ */
+static void
+apw_detach_shmem(int code, Datum arg)
+{
+   LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
+   if (apw_state->pid_using_dumpfile == MyProcPid)
+   apw_state->pid_using_dumpfile = InvalidPid;
+   if (apw_state->bgworker_pid == MyProcPid)
+   apw_state->bgworker_pid = InvalidPid;
+   LWLockRelease(&apw_state->lock);
+}

The reason is that we might already be under the apw_state->lock when
we error out and jump to apw_detach_shmem. So we should not be trying
to take the lock again. For example, in autoprewarm_dump_now(),
apw_dump_now() will error out under the lock if bgworker is already
using dump file.

===
+autoprewarm_dump_now(PG_FUNCTION_ARGS)
+{
+int64 num_blocks;
+
+apw_init_shmem();
+
+PG_TRY();
+{
+ num_blocks = apw_dump_now(false, true);
+}
+PG_CATCH();
+{
+ apw_detach_shmem(0, 0);
+ PG_RE_THROW();
+}
+PG_END_TRY();
+
+PG_RETURN_INT64(num_blocks);
+}

===
+ LWLockAcquire(&apw_state->lock, LW_EXCLUSIVE);
+ if (apw_state->pid_using_dumpfile == InvalidPid)
+ apw_state->pid_using_dumpfile = MyProcPid;
+ else
+ {
+ if (!is_bgworker)
+ ereport(ERROR,
+(errmsg("could not perform block dump because
dump file is being used by PID %d",
+ apw_state->pid_using_dumpfile)));

This attempt to take the lock again hangs the autoprewarm module. I think
there is no need to take the lock while we reset those variables, as we
reset them only if we have set them ourselves.
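
In other words, the presumed fix would make the detach callback look
roughly like this (a sketch of the suggestion only, not a committed
change):

/*
 * Clear our PID from autoprewarm shared state.  Per the point above, no
 * lock is taken here: we only reset values that we ourselves set.
 */
static void
apw_detach_shmem(int code, Datum arg)
{
	if (apw_state->pid_using_dumpfile == MyProcPid)
		apw_state->pid_using_dumpfile = InvalidPid;
	if (apw_state->bgworker_pid == MyProcPid)
		apw_state->bgworker_pid = InvalidPid;
}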

2) I also found one issue which was my own mistake in my previous patch 19.
In "apw_dump_now" I missed calling FreeFile() on the first write error,
whereas in other cases I am already calling it.
ret = fprintf(file, "<<" INT64_FORMAT ">>\n", num_blocks);
+ if (ret < 0)
+ {
+ int save_errno = errno;
+
+ unlink(transient_dump_file_path);
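
To be explicit, the error path with the missing call added would look
roughly like this (a sketch; the ereport wording is only illustrative):

	ret = fprintf(file, "<<" INT64_FORMAT ">>\n", num_blocks);
	if (ret < 0)
	{
		int			save_errno = errno;

		FreeFile(file);			/* the call that is currently missing */
		unlink(transient_dump_file_path);
		errno = save_errno;
		ereport(ERROR,
				(errcode_for_file_access(),
				 errmsg("could not write to file \"%s\": %m",
						transient_dump_file_path)));
	}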

Other than this, the patch is working as it did previously. If
you think my presumed fix (not taking the lock) for the hang issue is right, I
will produce a patch for it.

-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] expanding inheritance in partition bound order

2017-08-18 Thread Amit Khandekar
On 18 August 2017 at 10:55, Amit Langote  wrote:
> On 2017/08/18 4:54, Robert Haas wrote:
>> On Thu, Aug 17, 2017 at 8:39 AM, Ashutosh Bapat
>>  wrote:
>>> [2] had a patch with some changes to the original patch you posted. I
>>> didn't describe those changes in my mail, since they rearranged the
>>> comments. Those changes are not part of this patch and you haven't
>>> comments about those changes as well. If you have intentionally
>>> excluded those changes, it's fine. In case, you haven't reviewed them,
>>> please see if they are good to be incorporated.
>>
>> I took a quick look at your version but I think I like Amit's fine the
>> way it is, so committed that and back-patched it to v10.
>
> Thanks for committing.
>
>> I find 0002 pretty ugly as things stand.  We get a bunch of tuple maps
>> that we don't really need, only to turn around and free them.  We get
>> a bunch of tuple slots that we don't need, only to turn around and
>> drop them.  We don't really need the PartitionDispatch objects either,
>> except for the OIDs they contain.  There's a lot of extra stuff being
>> computed here that is really irrelevant for this purpose.  I think we
>> should try to clean that up somehow.
>
> One way to do that might be to reverse the order of the remaining patches
> and put the patch to refactor RelationGetPartitionDispatchInfo() first.
> With that refactoring, PartitionDispatch itself has become much simpler in
> that it does not contain a relcache reference to be closed eventually by
> the caller, the tuple map, and the tuple table slot.  Since those things
> are required for tuple-routing, the refactoring makes
> ExecSetupPartitionTupleRouting itself create them from the (minimal)
> information returned by RelationGetPartitionDispatchInfo and ultimately
> destroy when done using it.  I kept the indexes field in
> PartitionDispatchData though, because it's essentially free to create
> while we are walking the partition tree in
> RelationGetPartitionDispatchInfo() and it seems undesirable to make the
> caller compute that information (indexes) by traversing the partition tree
> all over again, if it doesn't otherwise have to.  I am still considering
> some counter-arguments raised by Amit Khandekar about this last assertion.
>
> Thoughts?

Another approach (that I have used in the update-partition-key patch)
is to *not* generate the OIDs beforehand, and instead call a
partition_walker_next() function to traverse the tree. Each
next() call would return a ChildInfo that includes the child OID,
parent OID, etc. This would guarantee a fixed order of OIDs for all
users. In the update-partition-key patch, I am opening and closing each
of the children, which of course we need to avoid.
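
Roughly, the intended usage is like the sketch below (ChildInfo and
partition_walker_next() are names used in that patch; the other
declarations here are made-up stand-ins for illustration only):

typedef struct ChildInfo
{
	Oid			relid;			/* this child (leaf or intermediate parent) */
	Oid			parent_relid;	/* its immediate parent */
} ChildInfo;

typedef struct PartitionWalker
{
	List	   *pending;		/* relations still to visit, in a fixed order */
} PartitionWalker;

extern void partition_walker_init(PartitionWalker *walker, Oid rootrelid);
extern ChildInfo *partition_walker_next(PartitionWalker *walker);

static void
visit_partition_tree(Oid rootrelid)
{
	PartitionWalker walker;
	ChildInfo  *child;

	partition_walker_init(&walker, rootrelid);
	while ((child = partition_walker_next(&walker)) != NULL)
	{
		/* every caller sees the children in the same, fixed order */
	}
}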




-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers