Re: track_planning causing performance regression

2020-07-05 Thread Fujii Masao




On 2020/07/04 12:22, Pavel Stehule wrote:



On Fri, Jul 3, 2020 at 13:02, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:



On 2020/07/03 16:02, Pavel Stehule wrote:
 >
 >
 >  > On Fri, Jul 3, 2020 at 8:57, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
 >
 >
 >
 >     On 2020/07/03 13:05, Pavel Stehule wrote:
 >      > Hi
 >      >
 >      > On Fri, Jul 3, 2020 at 4:39, Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
 >      >
 >      >
 >      >     On 2020/07/01 7:37, Peter Geoghegan wrote:
 >      >      > On Tue, Jun 30, 2020 at 6:40 AM Fujii Masao <masao.fu...@oss.nttdata.com> wrote:
 >      >      >> Ants and Andres suggested to replace the spinlock used in 
pgss_store() with
 >      >      >> LWLock. I agreed with them and posted the POC patch doing 
that. But I think
 >      >      >> the patch is an item for v14. The patch may address the 
reported performance
 >      >      >> issue, but may cause other performance issues in other 
workloads. We would
 >      >      >> need to measure how the patch affects the performance in 
various workloads.
 >      >      >> It seems too late to do that at this stage of v13. 
Thoughts?
 >      >      >
 >      >      > I agree that it's too late for v13.
 >      >
 >      >     Thanks for the comment!
 >      >
 >      >     So I pushed the patch and changed default of track_planning 
to off.
 >      >
 >      >
 >      > Maybe it could be documented that enabling this option can have a 
negative impact on performance.
 >
 >     Yes. What about adding either of the following into the doc?
 >
 >           Enabling this parameter may incur a noticeable performance 
penalty.
 >
 >     or
 >
 >           Enabling this parameter may incur a noticeable performance 
penalty,
 >           especially when fewer kinds of queries are executed on many
 >           concurrent connections.
 >
 >
 > This second variant looks perfect for this case.

Ok, so patch attached.


+1


Thanks for the review! Pushed.
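
As a quick illustration of the setting this note is about (assuming
pg_stat_statements is already loaded via shared_preload_libraries; the
planning columns shown are the ones added in v13):

    ALTER SYSTEM SET pg_stat_statements.track_planning = on;
    SELECT pg_reload_conf();

    -- planning counters then accumulate alongside the execution counters
    SELECT query, plans, total_plan_time, calls, total_exec_time
    FROM pg_stat_statements
    ORDER BY total_plan_time DESC
    LIMIT 5;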

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: archive status ".ready" files may be created too early

2020-07-05 Thread Kyotaro Horiguchi
Hello, Matsumura-san.

At Mon, 6 Jul 2020 04:02:23 +0000, "matsumura@fujitsu.com" wrote in
> Hello, Horiguchi-san
> 
> Thank you for your comment and patch.
> 
> At Thursday, June 25, 2020 3:36 PM (JST), "Kyotaro Horiguchi" wrote in
> > I think we don't need most of that shmem stuff.  XLogWrite is called
> 
> I wanted no more shmem stuff too, but other ideas need more lock
> that excludes inserter and writer each other.
> 
> after WAL buffer is filled up to the requested position. So when it
> crosses a segment boundary we know that all past cross-segment-boundary
> records are stable. That means all we need to remember is only the
> position of the latest cross-boundary record.
> 
> I could not agree. In the following case, it may not work well.
> - record-A and record-B (record-B is the newer one) are copied, and
> - lastSegContRecStart/End points to record-B's, and
> - FlushPtr has advanced to the middle of record-A.

IIUC, that means record-B is a cross segment-border record and we have
flushed beyond record-B. In that case crash recovery afterwards
can read the complete record-B and will finish recovery *after*
record-B. That's what we need here.

> In the above case, the writer should notify segments before record-A,
> but it notifies ones before record-B. If the writer notifies

If you mean that NotifyStableSegments notifies up to the previous
segment of the segment where record-A is placed, that's wrong. The
issue here is that crash recovery sees an incomplete record at a
segment border. So it is sufficient that crash recovery can read the
last record by looking at pg_wal.

> only when it flushes the latest record completely, it works well.

It confirms that "lastSegContRecEnd < LogwrtResult.Flush", which means
the last record (B) is completely flushed out, doesn't it? So it works
well.

> But the writer may not be able to notify any segment forever when
> WAL records crossing a segment boundary are inserted continuously.

No. As I mentioned in the previous mail, if we see a
cross-segment-boundary record, the previous cross-segment-boundary
record is flushed completely, and the segment containing the
first half of the previous cross-segment-boundary record has already
been flushed.  I didn't do that, but we can put an assertion in
XLogInsertRecord like this:

 +  /* Remember the range of the record if it spans over segments */
 +  XLByteToSeg(StartPos, startseg, wal_segment_size);
 +  XLByteToPrevSeg(EndPos, endseg, wal_segment_size);
 +
 +  if (startseg != endseg)
 +  {
++  /* we shouldn't have a record spanning over three or more segments */
++  Assert(endseg == startseg + 1);
 +  SpinLockAcquire(&XLogCtl->info_lck);
 +  if (XLogCtl->lastSegContRecEnd < StartPos)
 +  {
 +  XLogCtl->lastSegContRecStart = StartPos;
 +  XLogCtl->lastSegContRecEnd = EndPos;

> So I think that we must remember all such cross-segment-boundary records' 
> EndRecPtr in a buffer.
> 
> 
> > If we call XLogMarkEndRecPtrIfNeeded() there, the function is called
> > every time a record is written, most of which are wasteful.
> > XLogInsertRecord already has a code block executed only at every page
> > boundary.
> 
> I agree.
> XLogMarkEndRecPtrIfNeeded() is moved into the code block before updating
> LogwrtRqst.Write for avoiding passing-each-other with writer.
> 
> 
> > Now we can identify stable portion of WAL stream. It's enough to
> > prevent walsender from sending data that can be overwritten
> > afterwards. GetReplicationTargetRecPtr() in the attached does that.
> 
> I didn't notice it.
> I agree basically, but it is based on lastSegContRecStart/End.
> 
> So, first of all, we have to agree on what should be remembered.

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: proposal - function string_to_table

2020-07-05 Thread Pavel Stehule
On Sun, Jul 5, 2020 at 13:30, Pavel Stehule wrote:

>
>
> On Fri, Jun 5, 2020 at 13:55, Pavel Stehule wrote:
>
>> Hi
>>
>> On Thu, Jun 4, 2020 at 11:49, movead...@highgo.ca wrote:
>>
>>> +{ oid => '2228', descr => 'split delimited text',
>>> +  proname => 'string_to_table', prorows => '1000', proretset => 't',
>>> +  prorettype => 'text', proargtypes => 'text text',
>>> +  prosrc => 'text_to_table' },
>>> +{ oid => '2282', descr => 'split delimited text with null string',
>>> +  proname => 'string_to_table', prorows => '1000', proretset => 't',
>>> +  prorettype => 'text', proargtypes => 'text text text',
>>> +  prosrc => 'text_to_table_null' },
>>>
>>> I went through the patch, and everything looks good to me. But I do not
>>> know why it needs a 'text_to_table_null()'; it would be OK to put
>>> 'text_to_table' there, I think.
>>>
>>
>> It is a convention in Postgres - every unique SQL signature has its own
>> internal C function.
>>
>> I am sending a refreshed patch.
>>
>
> rebase
>

Two fresh fixes:

a) remove garbage from the patch that broke the docs

b) these functions should not be strict, to stay consistent with the
string_to_array functions (see the quick example below)
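
As a quick sanity check of the new function (the input comes from the example
in the patch's documentation; the exact display of the NULL row is
illustrative only):

    SELECT string_to_table('xx~^~yy~^~zz', '~^~', 'yy');
    -- expected three rows: 'xx', NULL, 'zz' - the third argument turns
    -- matching fields into NULL, mirroring string_to_array()

    SELECT string_to_table(NULL, ',');
    -- with the non-strict behavior from fix b) this is expected to return no
    -- rows rather than raise an error (an assumption based on the discussion
    -- above)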

Regards

Pavel


>
> Regards
>
> Pavel
>
>
>> Regards
>>
>> Pavel
>>
>>
>>
>>
>>>
>>> --
>>> Regards,
>>> Highgo Software (Canada/China/Pakistan)
>>> URL : www.highgo.ca
>>> EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
>>>
>>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f065856535..b686037524 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3458,6 +3458,27 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ string_to_table
+
+string_to_table ( string text, delimiter text  , nullstr text   )
+set of text
+   
+   
+splits string into table using supplied delimiter and
+optional null string.
+   
+   
+string_to_table('xx~^~yy~^~zz', '~^~', 'yy')
+
+xx
+yy,
+zz
+   
+  
+
   

 
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index df10bfb906..87980b0803 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -26,6 +26,7 @@
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/execnodes.h"
 #include "parser/scansup.h"
 #include "port/pg_bswap.h"
 #include "regex/regex.h"
@@ -35,6 +36,7 @@
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/sortsupport.h"
+#include "utils/tuplestore.h"
 #include "utils/varlena.h"
 
 
@@ -92,6 +94,16 @@ typedef struct
 	pg_locale_t locale;
 } VarStringSortSupport;
 
+/*
+ * Holds target metadata used for split string to array or to table.
+ */
+typedef struct
+{
+	ArrayBuildState	*astate;
+	Tuplestorestate *tupstore;
+	TupleDesc	tupdesc;
+} SplitStringTargetData;
+
 /*
  * This should be large enough that most strings will fit, but small enough
  * that we feel comfortable putting it on the stack
@@ -139,7 +151,7 @@ static bytea *bytea_substring(Datum str,
 			  bool length_not_specified);
 static bytea *bytea_overlay(bytea *t1, bytea *t2, int sp, int sl);
 static void appendStringInfoText(StringInfo str, const text *t);
-static Datum text_to_array_internal(PG_FUNCTION_ARGS);
+static bool text_to_array_internal(FunctionCallInfo fcinfo, SplitStringTargetData *tstate);
 static text *array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	const char *fldsep, const char *null_string);
 static StringInfo makeStringAggState(FunctionCallInfo fcinfo);
@@ -4679,7 +4691,19 @@ text_isequal(text *txt1, text *txt2, Oid collid)
 Datum
 text_to_array(PG_FUNCTION_ARGS)
 {
-	return text_to_array_internal(fcinfo);
+	SplitStringTargetData tstate;
+
+	/* reset tstate */
+	memset(&tstate, 0, sizeof(tstate));
+
+	if (!text_to_array_internal(fcinfo, &tstate))
+		PG_RETURN_NULL();
+
+	if (!tstate.astate)
+		PG_RETURN_ARRAYTYPE_P(construct_empty_array(TEXTOID));
+
+	PG_RETURN_ARRAYTYPE_P(makeArrayResult(tstate.astate,
+		  CurrentMemoryContext));
 }
 
 /*
@@ -4693,16 +4717,98 @@ text_to_array(PG_FUNCTION_ARGS)
 Datum
 text_to_array_null(PG_FUNCTION_ARGS)
 {
-	return text_to_array_internal(fcinfo);
+	return text_to_array(fcinfo);
+}
+
+/*
+ * text_to_table
+ * Parse input string and returns substrings as a table.
+ */
+Datum
+text_to_table(PG_FUNCTION_ARGS)
+{
+	ReturnSetInfo	   *rsi = (ReturnSetInfo *) fcinfo->resultinfo;
+	SplitStringTargetData tstate;
+	MemoryContext		old_cxt;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsi == NULL || !IsA(rsi, ReturnSetInfo))
+		ereport(ERROR,
+(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot accept a set")));
+
+	if (!(rsi->allowedModes & SFRM_Materialize))
+		ereport(ERROR,
+(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("materialize mode required, but it is not "
+		

"tuple concurrently updated" in pg_restore --jobs

2020-07-05 Thread Justin Pryzby
I hit this issue intermittently (roughly half the time) while working with a
patch David submitted, and finally found a recipe to reproduce it on an
unpatched v12 instance.

I was surprised to see pg_restore -j2 is restoring ACLs in pre-data in
parallel.  Note different session IDs and PIDs:

2020-07-05 23:31:27.448 
CDT,"pryzbyj","secondary_dump",24037,"[local]",5f02a91f.5de5,70,,LOG,0,"statement:
 REVOKE SELECT ON TABLE pg_catalog.pg_proc FROM PUBLIC; 
","pg_restore","client backend"
2020-07-05 23:31:27.448 
CDT,"pryzbyj","secondary_dump",24036,"[local]",5f02a91f.5de4,78,,LOG,0,"statement:
 GRANT SELECT(tableoid) ON TABLE pg_catalog.pg_proc TO PUBLIC; 
","pg_restore","client backend"
2020-07-05 23:31:27.450 
CDT,"pryzbyj","secondary_dump",24036,"[local]",5f02a91f.5de4,79,,LOG,0,"statement:
 GRANT SELECT(oid) ON TABLE pg_catalog.pg_proc TO PUBLIC; 
","pg_restore","client backend"
2020-07-05 23:31:27.450 
CDT,"pryzbyj","secondary_dump",24037,"[local]",5f02a91f.5de5,71,,ERROR,XX000,"tuple
 concurrently updated",,"REVOKE SELECT ON TABLE pg_catalog.pg_proc FROM 
PUBLIC;

postgres=# CREATE DATABASE pryzbyj;
postgres=# \c pryzbyj 
pryzbyj=# REVOKE ALL ON pg_proc FROM postgres;
pryzbyj=# GRANT SELECT (tableoid, oid, proname) ON pg_proc TO public;
pryzbyj=# \dp+ pg_catalog.pg_proc
   Schema   |  Name   | Type  | Access privileges | Column privileges | Policies 
------------+---------+-------+-------------------+-------------------+----------
 pg_catalog | pg_proc | table | =r/postgres       | tableoid:        +| 
            |         |       |                   |   =r/postgres    +| 
            |         |       |                   | oid:             +| 
            |         |       |                   |   =r/postgres    +| 
            |         |       |                   | proname:         +| 
            |         |       |                   |   =r/postgres     | 

[pryzbyj@database ~]$ pg_dump pryzbyj -Fc -f pg_dump.out
[pryzbyj@database ~]$ pg_restore pg_dump.out -j2 -d pryzbyj --clean -v
...
pg_restore: entering main parallel loop
pg_restore: launching item 3744 ACL TABLE pg_proc
pg_restore: launching item 3745 ACL COLUMN pg_proc.proname
pg_restore: creating ACL "pg_catalog.TABLE pg_proc"
pg_restore: creating ACL "pg_catalog.COLUMN pg_proc.proname"
pg_restore:pg_restore:  while PROCESSING TOC:
finished item 3745 ACL COLUMN pg_proc.proname
pg_restore: from TOC entry 3744; 0 0 ACL TABLE pg_proc postgres
pg_restore: error: could not execute query: ERROR:  tuple concurrently updated
Command was: REVOKE ALL ON TABLE pg_catalog.pg_proc FROM postgres;




Re: proposal: possibility to read dumped table's name from file

2020-07-05 Thread Pavel Stehule
On Sun, Jul 5, 2020 at 22:37, Pavel Stehule wrote:

>
>
> On Sun, Jul 5, 2020 at 22:31, Justin Pryzby wrote:
>
>> On Wed, Jul 01, 2020 at 04:24:52PM -0500, Justin Pryzby wrote:
>> > On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
>> > > On Wed, Jun 10, 2020 at 0:30, Justin Pryzby wrote:
>> > > > > + /* ignore empty
>> rows */
>> > > > > + if (*line != '\0')
>> > > >
>> > > > Maybe: if line=='\0': continue
>> > > > We should also support comments.
>> >
>> > Comment support is still missing but easily added :)
>>
>> Still missing from the latest patch.
>>
>
> I can implement comment support. But I am not sure about the format. The
> start could be "--" or the classic #.
>
> But "--" could be messy in this context.
>

Here is support for comment lines - the first character should be #. A small
example of the resulting file format follows.
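
For illustration only, a filter file in the format this patch proposes
(letters per the discussion downthread: t = table, n = schema, d = exclude
table data, f = include foreign data; the object names are made up):

    # comment lines start with '#', empty lines are ignored
    +t public.orders
    +t public.customers
    -n audit
    -d public.big_log_table
    +f remote_server_1

which would then be passed as pg_dump --filter=filterfile.txt, or
--filter=- to read the filters from stdin, as the patch allows.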

Regards

Pavel


>
>
>> With some added documentation, I think this can be RfC.
>>
>> --
>> Justin
>>
>
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2f0807e912..332f0c312f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -813,6 +813,15 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Read filters from file. Format: "(+|-)(tnfd) objectname"
+   
+  
+ 
+
  
   --load-via-partition-root
   
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a41a3db876..0b16d528c2 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -290,6 +290,8 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(TableInfo *tbinfo);
+static size_t pg_getline(char **lineptr, size_t *n, FILE *fp);
+static void exit_invalid_filter_format(FILE *fp, char *filename, char *message, char *line, int lineno);
 
 
 int
@@ -364,6 +366,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, _exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -603,6 +606,145 @@ main(int argc, char **argv)
 		  optarg);
 break;
 
+			case 12:			/* filter implementation */
+{
+	FILE	   *f;
+	char	   *line;
+	ssize_t		chars;
+	size_t		line_size = 1024;
+	int			lineno = 0;
+
+	/* use "-" as symbol for stdin */
+	if (strcmp(optarg, "-") != 0)
+	{
+		f = fopen(optarg, "r");
+		if (!f)
+			fatal("could not open the input file \"%s\": %m",
+  optarg);
+	}
+	else
+		f = stdin;
+
+	line = pg_malloc(line_size);
+
+	while ((chars = pg_getline(&line, &line_size, f)) != -1)
+	{
+		bool		is_include;
+		char		objecttype;
+		char	   *objectname;
+
+		lineno += 1;
+
+		if (line[chars - 1] == '\n')
+			line[chars - 1] = '\0';
+
+		/* ignore empty rows */
+		if (*line == '\0')
+			continue;
+
+		/* when first char is hash, ignore whole line */
+		if (*line == '#')
+			continue;
+
+		if (chars < 2)
+			exit_invalid_filter_format(f,
+	   optarg,
+	   "too short line",
+	   line,
+	   lineno);
+
+		if (line[0] == '+')
+			is_include = true;
+		else if (line[0] == '-')
+			is_include = false;
+		else
+			exit_invalid_filter_format(f,
+	   optarg,
+	   "invalid option type (use [+-])",
+	   line,
+	   lineno);
+
+		objecttype = line[1];
+		objectname = [2];
+
+		/* skip initial spaces */
+		while (isspace(*objectname))
+			objectname++;
+
+		if (*objectname == '\0')
+			exit_invalid_filter_format(f,
+	   optarg,
+	   "missing object name",
+	   line,
+	   lineno);
+
+		if (objecttype == 't')
+		{
+			if (is_include)
+			{
+simple_string_list_append(&table_include_patterns,
+		  objectname);
+dopt.include_everything = false;
+			}
+			else
+simple_string_list_append(&table_exclude_patterns,
+		  objectname);
+		}
+		else if (objecttype == 'n')
+		{
+			if (is_include)
+			{
+simple_string_list_append(&schema_include_patterns,
+		  objectname);
+dopt.include_everything = false;
+			}
+			else
+simple_string_list_append(&schema_exclude_patterns,
+		  objectname);
+		}
+		else if (objecttype == 'd')
+		{
+			if (is_include)
+exit_invalid_filter_format(f,
+		   optarg,
+		   "include filter is not 

Re: Cleanup - adjust the code crossing 80-column window limit

2020-07-05 Thread Bharath Rupireddy
>
> Is there a test case covering this part of the code(I'm not sure if
> one exists in the regression test suite)?
> If no, can we add one?
>

I observed that the code areas this patch is trying to modify are
pretty much generic and are being called from many places.
The code basically handles exit callbacks on proc exits, on or before
shared memory exits which is very generic and common code.
I'm sure these parts are covered by the existing regression test suites.

Since I have already run the regression tests, +1 for the patch from my end.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com




Re: Cache lookup errors with functions manipulation object addresses

2020-07-05 Thread Michael Paquier
On Fri, Jul 03, 2020 at 12:34:39PM +0200, Daniel Gustafsson wrote:
> No objections from me after skimming over the updated patchset.  I still
> maintain that the format_operator_extended comment should be amended to
> include that it can return NULL and not always a palloced string, but
> that's it.

Okay, 0001 and 0002 are now committed.  The same comment applies to
the procedure part, and I have noticed three more inconsistencies in
0002: the comments of format_procedure_extended had better match its 
operator sibling, the new declarations in regproc.h still used
type_oid for the first argument, and the description of the flags
still referred to a type name for both routines.  I am looking at
0003 with fresh eyes now, and I'll try to send a rebased version
shortly.
--
Michael




RE: archive status ".ready" files may be created too early

2020-07-05 Thread matsumura....@fujitsu.com
Hello, Horiguchi-san

Thank you for your comment and patch.

At Thursday, June 25, 2020 3:36 PM (JST), "Kyotaro Horiguchi" wrote in
> I think we don't need most of that shmem stuff.  XLogWrite is called

I wanted no more shmem stuff too, but other ideas need more lock
that excludes inserter and writer each other.

> after WAL buffer is filled up to the requested position. So when it
> crosses a segment boundary we know that all past cross-segment-boundary
> records are stable. That means all we need to remember is only the
> position of the latest cross-boundary record.

I could not agree. In the following case, it may not work well.
- record-A and record-B (record-B is the newer one) are copied, and
- lastSegContRecStart/End points to record-B's, and
- FlushPtr has advanced to the middle of record-A.

In the above case, the writer should notify segments before record-A,
but it notifies ones before record-B. If the writer notifies
only when it flushes the latest record completely, it works well.
But the writer may not be able to notify any segment forever when
WAL records crossing a segment boundary are inserted continuously.

So I think that we must remember all such cross-segment-boundary records' 
EndRecPtr in a buffer.


> If we call XLogMarkEndRecPtrIfNeeded() there, the function is called
> every time a record is written, most of which are wasteful.
> XLogInsertRecord already has a code block executed only at every page
> boundary.

I agree.
XLogMarkEndRecPtrIfNeeded() is moved into the code block before updating
LogwrtRqst.Write for avoiding passing-each-other with writer.


> Now we can identify stable portion of WAL stream. It's enough to
> prevent walsender from sending data that can be overwritten
> afterwards. GetReplicationTargetRecPtr() in the attached does that.

I didn't notice it.
I agree basically, but it is based on lastSegContRecStart/End.

So, first of all, we have to agree on what should be remembered.


Regards
Ryo Matsumura




Re: Cleanup - adjust the code crossing 80-column window limit

2020-07-05 Thread Amul Sul
On Fri, Jul 3, 2020 at 1:32 PM Peter Eisentraut
 wrote:
>
> On 2020-07-01 09:00, Amul Sul wrote:
> > Attached patch makes an adjustment to ipc.c code to be in the 80-column
> > window.
>
> I can see an argument that this makes the code a bit easier to read, but
> making code fit into 80 columns doesn't have to be a goal by itself.
>
I wouldn't disagree with that. I believe the 80-column rule has been
documented for code readability.

Regards,
Amul




Re: A patch for get origin from commit_ts.

2020-07-05 Thread movead...@highgo.ca

>+SELECT pg_replication_origin_create('test_commit_ts: get_origin_1');
>+SELECT pg_replication_origin_create('test_commit_ts: get_origin_2');
>+SELECT pg_replication_origin_create('test_commit_ts: get_origin_3');
>
>Why do you need three replication origins to test three times the same
>pattern?  Wouldn't one be enough and why don't you check after the
>timestamp?  I would also add two extra tests: one with a NULL input and an
>extra one where the data could not be found.
> 
>+   found = TransactionIdGetCommitTsData(xid, , );
>+
>+   if (!found)
>+   PG_RETURN_NULL();
> 
>This part also looks incorrect to me, I think that you should still
>return two tuples, both marked as NULL.  You can do that just by
>switching the nulls flags to true for the two values if nothing can be
>found.
Thanks for the points; I have followed them, new patch attached.



Regards,
Highgo Software (Canada/China/Pakistan) 
URL : www.highgo.ca 
EMAIL: mailto:movead(dot)li(at)highgo(dot)ca


get_origin_from_commit_ts_v4.patch
Description: Binary data


Re: Creating a function for exposing memory usage of backend process

2020-07-05 Thread torikoshia
On Fri, Jul 3, 2020 at 7:33 PM Fujii Masao  
wrote:


Thanks for your review!


I like more specific name like pg_backend_memory_contexts.


Agreed.

When I was trying to add this function as a statistics function,
I thought that naming it pg_stat_getbackend_memory_context()
might make people regard it as a "per-backend statistics
function" whose parameter is a backend ID number.
So I removed "backend", but now there is no necessity to do
so.


But I'd like to hear more opinions about the name from others.


I changed the name to pg_backend_memory_contexts for the time
being.



- function name: pg_get_memory_contexts()
- source file: mainly src/backend/utils/mmgr/mcxt.c



+       Identification information of the memory context. This field 
is truncated if the identification field is longer than 1024 
characters


"characters" should be "bytes"?


Fixed, but I used "characters" following the description of
pg_stat_activity.query in the manual, quoted below.

| By default the query text is truncated at 1024 characters;

It has nothing to do with this thread, but considering
multibyte characters, it may also be better to change that
description to "bytes".


Regarding the other comments, I revised the patch as you pointed out.
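
For reference, a minimal query against the view as defined in the attached
patch (column names taken from the patch; the output will obviously vary per
backend):

    SELECT name, parent, level, total_bytes, used_bytes
    FROM pg_backend_memory_contexts
    ORDER BY total_bytes DESC
    LIMIT 10;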


Regards,

--
Atsushi Torikoshi
NTT DATA CORPORATION

From 055af903a3dbf146d97dd3fb01a6a7d3d3bd2ae0 Mon Sep 17 00:00:00 2001
From: Atsushi Torikoshi 
Date: Mon, 6 Jul 2020 10:50:49 +0900
Subject: [PATCH] Add a function exposing memory usage of local backend.

Previously, the only way to examine the memory usage of a backend
process was to attach a debugger and call MemoryContextStats(), which
was not so convenient in some cases.

This patch implements a new SQL-callable function
pg_get_backend_memory_contexts which exposes memory usage of the
local backend.
It also adds a new view pg_backend_memory_contexts for exposing
local backend memory contexts.
---
 doc/src/sgml/catalogs.sgml   | 126 +
 src/backend/catalog/system_views.sql |   3 +
 src/backend/utils/mmgr/mcxt.c| 132 ++-
 src/include/catalog/pg_proc.dat  |   9 ++
 src/test/regress/expected/rules.out  |  10 ++
 5 files changed, 279 insertions(+), 1 deletion(-)

diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 49a881b262..abc097179f 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -9248,6 +9248,11 @@ SCRAM-SHA-256$iteration count:
   materialized views
  
 
+ 
+  pg_backend_memory_contexts
+  backend memory contexts
+ 
+
  
   pg_policies
   policies
@@ -10526,6 +10531,127 @@ SELECT * FROM pg_locks pl LEFT JOIN pg_prepared_xacts ppx
 
  
 
+ 
+  pg_backend_memory_contexts
+
+  
+   pg_backend_memory_contexts
+  
+
+  
+   backend memory contexts
+  
+
+  
+   The view pg_backend_memory_contexts displays all
+   the local backend memory contexts.
+  
+  
+   pg_backend_memory_contexts contains one row
+   for each memory context.
+  
+
+  
+   pg_backend_memory_contexts Columns
+   
+
+ 
+  
+   Column Type
+  
+  
+   Description
+  
+ 
+
+
+
+ 
+  
+   name text
+  
+  
+   Name of the memory context
+  
+ 
+
+ 
+  
+   ident text
+  
+  
+   Identification information of the memory context. This field is truncated at 1024 bytes
+  
+ 
+
+ 
+  
+   parent text
+  
+  
+   Name of the parent of this memory context
+  
+ 
+
+ 
+  
+   level int4
+  
+  
+   Distance from TopMemoryContext in context tree
+  
+ 
+
+ 
+  
+   total_bytes int8
+  
+  
+   Total bytes allocated for this memory context
+  
+ 
+
+ 
+  
+   total_nblocks int8
+  
+  
+   Total number of blocks allocated for this memory context
+  
+ 
+
+ 
+  
+   free_bytes int8
+  
+  
+   Free space in bytes
+  
+ 
+
+ 
+  
+   free_chunks int8
+  
+  
+   Total number of free chunks
+  
+ 
+
+ 
+  
+   used_bytes int8
+  
+  
+   Used space in bytes
+  
+ 
+
+   
+  
+
+ 
+
  
   pg_matviews
 
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index 5314e9348f..403954039b 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -554,6 +554,9 @@ CREATE VIEW pg_shmem_allocations AS
 REVOKE ALL ON pg_shmem_allocations FROM PUBLIC;
 REVOKE EXECUTE ON FUNCTION pg_get_shmem_allocations() FROM PUBLIC;
 
+CREATE VIEW pg_backend_memory_contexts AS
+SELECT * FROM pg_get_backend_memory_contexts();
+
 -- Statistics views
 
 CREATE VIEW pg_stat_all_tables AS
diff --git a/src/backend/utils/mmgr/mcxt.c b/src/backend/utils/mmgr/mcxt.c
index abda22fa57..b0732ec560 100644
--- a/src/backend/utils/mmgr/mcxt.c
+++ b/src/backend/utils/mmgr/mcxt.c
@@ 

Re: Include access method in listTables output

2020-07-05 Thread Michael Paquier
On Sun, Jul 05, 2020 at 07:13:10AM +0530, vignesh C wrote:
> I'm not sure if we should include showViews, I had seen that the
> access method was not getting selected for view.

+1.  These have no physical storage, so you are looking here for
relkinds that satisfy RELKIND_HAS_STORAGE().
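
As a rough illustration of the same distinction at the SQL level (not the
patch itself, just a sketch against the existing catalogs): relkinds with
storage carry an access method in pg_class.relam, while views and partitioned
relations do not:

    SELECT c.relname, c.relkind, a.amname
    FROM pg_class c
    LEFT JOIN pg_am a ON a.oid = c.relam
    WHERE c.relkind IN ('r', 'm', 'i', 't');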
--
Michael




Re: TAP tests and symlinks on Windows

2020-07-05 Thread Michael Paquier
On Sun, Jul 05, 2020 at 08:18:46AM -0400, Andrew Dunstan wrote:
> On 7/3/20 10:11 AM, Peter Eisentraut wrote:
>> I think all of this is still a bit too fragile; it needs further
>> consideration.

Indeed.  I would need a MSYS2 environment to dig into that.  This
looks trickier than what I am used to on Windows.

> I'll see what I can come up with.

Thanks, Andrew.
--
Michael




Re: Can I use extern "C" in an extension so I can use C++?

2020-07-05 Thread Isaac Morland
On Sun, 5 Jul 2020 at 18:49, Tom Lane  wrote:

>
> My example wrapped the Postgres #include's, the PG_MODULE_MAGIC call,
> and the PG_FUNCTION_INFO_V1 call(s) in extern "C" { ... }.  I'm pretty
> sure you need to do all three of those things to get a working result
> without mangled external function names.
>

I wrapped my entire file - #includes and all - in extern "C" { ... } and it
worked perfectly. Thanks very much for your assistance and patience.


Re: A patch for get origin from commit_ts.

2020-07-05 Thread Michael Paquier
On Sat, Jul 04, 2020 at 06:01:28PM +0800, movead...@highgo.ca wrote:
> I made a patch as Michael Paquier described that uses a new function to
> return the transaction id and origin, and I added an origin version to 
> pg_last_committed_xact() too; now it looks like below:

+SELECT pg_replication_origin_create('test_commit_ts: get_origin_1');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_2');
+SELECT pg_replication_origin_create('test_commit_ts: get_origin_3');

Why do you need three replication origins to test three times the same
pattern?  Wouldn't one be enough and why don't you check after the
timestamp?  I would also add two extra tests: one with a NULL input and an
extra one where the data could not be found.

+   found = TransactionIdGetCommitTsData(xid, , );
+
+   if (!found)
+   PG_RETURN_NULL();

This part also looks incorrect to me, I think that you should still
return two tuples, both marked as NULL.  You can do that just by
switching the nulls flags to true for the two values if nothing can be
found.
--
Michael




Re: Postgres Windows build system doesn't work with python installed in Program Files

2020-07-05 Thread Michael Paquier
On Sat, Jul 04, 2020 at 09:17:52PM +0200, Juan José Santamaría Flecha wrote:
> Going through the open items in the commitfest, I see that this patch has
> not been pushed. It still applies and solves the warning so, I am marking
> it as RFC.

Thanks, applied.  I was actually waiting to see if you had more
comments.

> Adding a help option is a new feature, that can have its own patch without
> delaying this one any further.

Yep.  And I am not sure if that's worth worrying either.
--
Michael




Re: Default setting for enable_hashagg_disk (hash_mem)

2020-07-05 Thread Peter Geoghegan
On Sat, Jul 4, 2020 at 1:54 PM Jeff Davis  wrote:
> I agree that it's good to wait for actual problems. But the challenge
> is that we can't backport an added GUC. Are there other, backportable
> changes we could potentially make if a lot of users have a problem with
> v13 after release?

I doubt that there are.

> I'm OK not having a GUC, but we need consensus around what our response
> will be if a user experiences a regression. If our only answer is
> "tweak X, Y, and Z; and if that doesn't work, wait for v14" then I'd
> like almost everyone to be on board with that.

I'm practically certain that there will be users that complain about
regressions. It's all but inevitable given that in general grouping
estimates are often wrong by orders of magnitude.

> Without some backportable potential solutions, I'm inclined to ship
> with either one or two escape-hatch GUCs, with warnings that they
> should be used as a last resort. Hopefully users will complain on the
> lists (so we understand the problem) before setting them.

Where does that leave the hash_mem idea (or some other similar proposal)?

I think that we should offer something like hash_mem that can work as
a multiple of work_mem, for the reason that Justin mentioned recently.
This can be justified as something that more or less maintains some
kind of continuity with the old design.

I think that it should affect hash join too, though I suppose that
that part might be controversial -- that is certainly more than an
escape hatch for this particular problem. Any thoughts on that?

> It's not very principled, and we may be stuck with some cruft, but it
> mitigates the risk a lot. There's a good chance we can remove them
> later, especially if it's part of a larger overhaul of
> work_mem/hash_mem (which might happen fairly soon, given the interest
> in this thread), or if we change something about HashAgg that makes the
> GUCs harder to maintain.

There are several reasons to get rid of work_mem entirely in the
medium to long term. Some relatively obvious, others less so.

An example in the latter category is "hash teams" [1]: a design that
teaches multiple hash operations (e.g. a hash join and a hash
aggregate that hash on the same columns) to cooperate in processing
their inputs. It's more or less the hashing equivalent of what are
sometimes called "interesting sort orders" (e.g. cases where the same
sort/sort order is used by both a merge join and a group aggregate).
The hash team controls spilling behavior for related hash nodes as a
whole. That's the most sensible way of thinking about the related hash
nodes, to enable a slew of optimizations. For example, I think that it
enables bushy plans with multiple hash joins that can have much lower
high watermark memory consumption.

This hash teams business seems quite important in general, but it is
fundamentally incompatible with the work_mem model, which supposes
that each node exists on its own in a vacuum. (I suspect you already
knew about this, Jeff, but not everyone will.)

[1] 
http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.114.3183=rep1=pdf
-- 
Peter Geoghegan




Re: Can I use extern "C" in an extension so I can use C++?

2020-07-05 Thread Tom Lane
Isaac Morland  writes:
> On Sun, 5 Jul 2020 at 18:07, Tom Lane  wrote:
>> Something like the attached works for me; what problem are you having
>> *exactly*?

> I've attached a .cpp file.

My example wrapped the Postgres #include's, the PG_MODULE_MAGIC call,
and the PG_FUNCTION_INFO_V1 call(s) in extern "C" { ... }.  I'm pretty
sure you need to do all three of those things to get a working result
without mangled external function names.

regards, tom lane




Re: Can I use extern "C" in an extension so I can use C++?

2020-07-05 Thread Isaac Morland
On Sun, 5 Jul 2020 at 18:07, Tom Lane  wrote:

> Isaac Morland  writes:
> > I'm writing a small extension, and I'm trying to use C++ constructs. I'm
> > not actually doing anything that needs C++, but I *really* like declaring
> > variables when I first initialize them (for example), and I also *really*
> > like warning-free compiles.
>
> Well, you could get that with -Wno-declaration-after-statement ...
> but yeah, this is supposed to work, modulo all the caveats on the
> page you already found.
>
> > The C++ compiler is mangling the names so they aren't visible to the
> > extension mechanism.
>
> Something like the attached works for me; what problem are you having
> *exactly*?
>

I've attached a .cpp file. When I run "make", I get:

g++ -Wall -Wpointer-arith -Wendif-labels -Wmissing-format-attribute
-Wformat-security -fno-strict-aliasing -fwrapv -g -g -O2
-fstack-protector-strong -Wformat -Werror=format-security -I. -I./
-I/usr/include/postgresql/12/server -I/usr/include/postgresql/internal
 -Wdate-time -D_FORTIFY_SOURCE=2 -D_GNU_SOURCE -I/usr/include/libxml2
 -I/usr/include/mit-krb5  -c -o hashblob.o hashblob.cpp
hashblob.cpp:9:18: error: conflicting declaration of ‘Datum
hashblob_touch(FunctionCallInfo)’ with ‘C’ linkage
9 | extern "C" Datum hashblob_touch (PG_FUNCTION_ARGS) {
  |  ^~
In file included from hashblob.cpp:2:
hashblob.cpp:6:21: note: previous declaration with ‘C++’ linkage
6 | PG_FUNCTION_INFO_V1(hashblob_touch);
  | ^~
/usr/include/postgresql/12/server/fmgr.h:405:14: note: in definition of
macro ‘PG_FUNCTION_INFO_V1’
  405 | extern Datum funcname(PG_FUNCTION_ARGS); \
  |  ^~~~
[... and then the same set of 3 errors again for the other function ...]

Without the extern "C" stuff it compiles fine but the names are mangled;
renaming to .c makes it compile fine with non-mangled names. It also
compiles if I get rid of the PG_FUNCTION_INFO_V1 invocations; but the same
documentation page is pretty clear that is supposed to be needed; and then
I think it C++-mangles the names of functions that get called behind the
scenes ("undefined symbol: _Z23pg_detoast_datum_packedP7varlena" when I try
to CREATE EXTENSION).

I should add I'm using a Makefile I've also attached; it uses PGXS with
nothing special, but on the other hand it means I don't really understand
everything that is going on (in particular, I didn't pick whether to say
g++ or gcc, nor did I explicitly choose any of the arguments to the
command).

$ g++ -Wall -fno-strict-aliasing -fwrapv -g -O2 -D_GNU_SOURCE -c
> -I/home/postgres/pgsql/src/include -o test.o test.cpp
> $ nm --ext --def test.o
>  T Pg_magic_func
> 0010 T pg_finfo_silly_func
> 0020 T silly_func
>


hashblob.cpp
Description: Binary data


Makefile
Description: Binary data


Re: Can I use extern "C" in an extension so I can use C++?

2020-07-05 Thread Tom Lane
Isaac Morland  writes:
> I'm writing a small extension, and I'm trying to use C++ constructs. I'm
> not actually doing anything that needs C++, but I *really* like declaring
> variables when I first initialize them (for example), and I also *really*
> like warning-free compiles.

Well, you could get that with -Wno-declaration-after-statement ...
but yeah, this is supposed to work, modulo all the caveats on the
page you already found.

> The C++ compiler is mangling the names so they aren't visible to the
> extension mechanism.

Something like the attached works for me; what problem are you having
*exactly*?

$ g++ -Wall -fno-strict-aliasing -fwrapv -g -O2 -D_GNU_SOURCE -c 
-I/home/postgres/pgsql/src/include -o test.o test.cpp
$ nm --ext --def test.o
 T Pg_magic_func
0010 T pg_finfo_silly_func
0020 T silly_func

regards, tom lane

extern "C" {

#include "postgres.h"

#include "fmgr.h"

PG_MODULE_MAGIC;

PG_FUNCTION_INFO_V1(silly_func);

}

extern "C" Datum
silly_func(PG_FUNCTION_ARGS)
{
	char	   *str = PG_GETARG_CSTRING(0);

	PG_RETURN_INT32(strlen(str));
}


Re: Binary support for pgoutput plugin

2020-07-05 Thread Daniel Gustafsson
> On 5 Jul 2020, at 23:11, Alvaro Herrera  wrote:
> 
> On 2020-Jul-05, Daniel Gustafsson wrote:
> 
>>> On 2 Jul 2020, at 18:41, Dave Cramer  wrote:
>>> 
>>> rebased
>> 
>> Thanks!  The new version of 0001 patch has a compiler warning due to mixed
>> declarations and code:
>> 
>> worker.c: In function ‘slot_store_data’:
>> worker.c:366:5: error: ISO C90 forbids mixed declarations and code 
>> [-Werror=declaration-after-statement]
> 
> AFAICS this is fixed in 0005.

Yes and no, 0005 fixes one such instance but the one failing the build is
another one in worker.c (the below being from 0008, which in turn changes the
row in question from the previous patches):

+   int cursor = tupleData->values[remoteattnum]->cursor;

>  I'm going to suggest to use "git rebase -i"

+1

cheers ./daniel



Re: Binary support for pgoutput plugin

2020-07-05 Thread Alvaro Herrera
On 2020-Jul-05, Daniel Gustafsson wrote:

> > On 2 Jul 2020, at 18:41, Dave Cramer  wrote:
> > 
> > rebased
> 
> Thanks!  The new version of 0001 patch has a compiler warning due to mixed
> declarations and code:
> 
> worker.c: In function ‘slot_store_data’:
> worker.c:366:5: error: ISO C90 forbids mixed declarations and code 
> [-Werror=declaration-after-statement]

AFAICS this is fixed in 0005.  I'm going to suggest using "git rebase
-i" so that fixes for bugs that earlier patches introduce are applied as
fix-ups in those patches; we don't need or want to see the submitter's
intermediate versions.  Ideally, each submitted patch should be free of
such problems, so that we can consider each individual patch in the
series in isolation.  Indeed, evidently the cfbot considers things that
way.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Online checksums verification in the backend

2020-07-05 Thread Daniel Gustafsson
> On 5 Apr 2020, at 13:17, Julien Rouhaud  wrote:
> On Sun, Apr 05, 2020 at 08:01:36PM +0900, Masahiko Sawada wrote:

>> Thank you for updating the patch! The patch looks good to me.
>> 
>> I've marked this patch as Ready for Committer. I hope this patch will
>> get committed to PG13.

> Thanks a lot!

This patch has been through quite thorough review, and skimming the thread, all
concerns raised have been addressed.  It still applies and tests are green in
the CF Patchtester.  The feature in itself certainly gets my +1 for inclusion;
it seems a good addition.

Is any committer who has taken part in the thread (or anyone else for that
matter) interested in seeing this to some form of closure in this CF?

cheers ./daniel



Can I use extern "C" in an extension so I can use C++?

2020-07-05 Thread Isaac Morland
I'm writing a small extension, and I'm trying to use C++ constructs. I'm
not actually doing anything that needs C++, but I *really* like declaring
variables when I first initialize them (for example), and I also *really*
like warning-free compiles.

The C++ compiler is mangling the names so they aren't visible to the
extension mechanism. The following page suggests using extern C (it doesn't
specify that this means extern "C", but I suppose anybody who actually knew
what they were doing would have known that immediately):

https://www.postgresql.org/docs/current/xfunc-c.html

So I'm writing my functions to start:

extern "C" Datum ...

The problem is that PG_FUNCTION_INFO_V1 generates its own function
declaration with a conflicting extern specification (just plain extern,
which I guess means "C++" in the context of C++ code).

I also tried wrapping everything - both the functions and
the PG_FUNCTION_INFO_V1 invocations - in extern "C" { ... }, but now a
variable declaration from fmgr.h conflicts with one from the macros.

Is there a simple fix I'm missing? Any hints much appreciated.


Re: vacuum verbose detail logs are unclear; log at *start* of each stage

2020-07-05 Thread Daniel Gustafsson
> On 24 Mar 2020, at 22:58, Tom Lane  wrote:

> The really fundamental problem with this is that you are trying to make
> the server do what is properly the client's job, namely format messages
> nicely.  Please read the message style guidelines [1], particularly
> the bit about "Formatting", which basically says "don't":

This thread has stalled since the last CF with Tom's raised issue unanswered,
and the patch no longer applies.  I'm closing this as Returned with Feedback,
if there is an updated patchset then please re-open the entry.

cheers ./daniel




Re: Binary support for pgoutput plugin

2020-07-05 Thread Daniel Gustafsson
> On 2 Jul 2020, at 18:41, Dave Cramer  wrote:
> 
> rebased

Thanks!  The new version of 0001 patch has a compiler warning due to mixed
declarations and code:

worker.c: In function ‘slot_store_data’:
worker.c:366:5: error: ISO C90 forbids mixed declarations and code 
[-Werror=declaration-after-statement]
 int cursor = tupleData->values[remoteattnum]->cursor;
 ^
worker.c: In function ‘slot_modify_data’:
worker.c:485:5: error: ISO C90 forbids mixed declarations and code 
[-Werror=declaration-after-statement]
 int cursor = tupleData->values[remoteattnum]->cursor;
 ^

I didn't investigate to see if it was new, but Travis is running with Werror
which fails this build.

cheers ./daniel



Re: proposal: possibility to read dumped table's name from file

2020-07-05 Thread Pavel Stehule
On Sun, Jul 5, 2020 at 22:31, Justin Pryzby wrote:

> On Wed, Jul 01, 2020 at 04:24:52PM -0500, Justin Pryzby wrote:
> > On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> > > On Wed, Jun 10, 2020 at 0:30, Justin Pryzby wrote:
> > > > > + /* ignore empty rows
> */
> > > > > + if (*line != '\0')
> > > >
> > > > Maybe: if line=='\0': continue
> > > > We should also support comments.
> >
> > Comment support is still missing but easily added :)
>
> Still missing from the latest patch.
>

I can implement comment support. But I am not sure about the format. The
start could be "--" or the classic #.

But "--" could be messy in this context.



> With some added documentation, I think this can be RfC.
>
> --
> Justin
>


Re: [Patch] Invalid permission check in pg_stats for functional indexes

2020-07-05 Thread Daniel Gustafsson
> On 5 Jul 2020, at 16:16, Tom Lane  wrote:
> 
> Daniel Gustafsson  writes:
>> This patch still hasn't progressed since Tom's draft patch, is anyone still
>> interested in pursuing it or should we close it for now?
> 
> It seems clearly reasonable to me to close the CF item as RWF,
> expecting that a new one can be made whenever somebody re-tackles
> this problem.

Thanks for confirmation, done.

>  The CF list is not a to-do list.

Yes. Multiple +1's.

cheers ./daniel



Re: proposal: possibility to read dumped table's name from file

2020-07-05 Thread Justin Pryzby
On Wed, Jul 01, 2020 at 04:24:52PM -0500, Justin Pryzby wrote:
> On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> > On Wed, Jun 10, 2020 at 0:30, Justin Pryzby wrote:
> > > > + /* ignore empty rows */
> > > > + if (*line != '\0')
> > >
> > > Maybe: if line=='\0': continue
> > > We should also support comments.
> 
> Comment support is still missing but easily added :)

Still missing from the latest patch.

With some added documentation, I think this can be RfC.

-- 
Justin




Re: Ideas about a better API for postgres_fdw remote estimates

2020-07-05 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> 2. Wedging this into EXPLAIN would be quite ugly, because (at least
>> as I envision it) the output would have just about nothing to do with
>> any existing EXPLAIN output.

> This is a better argument for not making it part of EXPLAIN, though I
> don't really feel like I've got a decent idea of what you are suggesting
> the output *would* look like, so it's difficult for me to agree (or
> disagree) about this particular point.

Per postgres_fdw's get_remote_estimate(), the only data we use right now
is the startup_cost, total_cost, rows and width estimates from the
top-level Plan node.  That's available immediately from the Plan tree,
meaning that basically *nothing* of the substantial display effort
expended by explain.c and ruleutils.c is of any value.  So the level-zero
implementation of this would be to run the parser and planner, format
those four numbers into a JSON object (which would require little more
infrastructure than sprintf), and return that.  Sure, we could make that
into some kind of early-exit path in explain.c, but I think it'd be a
pretty substantial wart, especially since it'd mean that none of the
other EXPLAIN options are sensible in combination with this one.
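
(To make that concrete: a remote-side helper -- using the
postgres_fdw_support(query text) name mentioned elsewhere in this thread
purely as a placeholder, nothing here is committed API -- would plan the given
query and hand back just those four numbers, e.g.

    SELECT postgres_fdw_support('SELECT * FROM some_remote_table WHERE id = 42');
    -- might return something like:
    --   {"startup_cost": 0.29, "total_cost": 8.30, "rows": 1, "width": 44}

so the client pays one round trip and no EXPLAIN formatting effort at all.)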

Further down the road, we might want to rethink the whole idea of
completely constructing a concrete Plan.  We could get the data we need
at the list-of-Paths stage.  Even more interesting, we could (with very
little more work) return data about multiple Paths, so that the client
could find out, for example, the costs of sorted and unsorted output
without paying two network round trips to discover that.  That'd
definitely require changes in the core planner, since it has no API to
stop at that point.  And it's even less within the charter of EXPLAIN.

I grant your point that there might be other users for this besides
postgres_fdw, but that doesn't mean it must be a core feature.

>> 3. We surely would not back-patch a core change like this.  OTOH, if
>> the added infrastructure is in an extension, somebody might want to
>> back-patch that (even if unofficially).

> Since postgres_fdw is part of core and core's release cycle, and the
> packagers manage the extensions from core in a way that they have to
> match up, this argument doesn't hold any weight with me.

Certainly only v14 (or whenever) and later postgres_fdw would be able
to *use* this data.  The scenario I'm imagining is that somebody wants
to be able to use that client against an older remote server, and is
willing to install some simple extension on the remote server to do so.
Perhaps this scenario is not worth troubling over, but I don't think
it's entirely far-fetched.

regards, tom lane




Re: proposal: possibility to read dumped table's name from file

2020-07-05 Thread Pavel Stehule
On Wed, Jul 1, 2020 at 23:24, Justin Pryzby wrote:

> On Thu, Jun 11, 2020 at 09:36:18AM +0200, Pavel Stehule wrote:
> > On Wed, Jun 10, 2020 at 0:30, Justin Pryzby wrote:
> > > > + /* ignore empty rows */
> > > > + if (*line != '\0')
> > >
> > > Maybe: if line=='\0': continue
> > > We should also support comments.
>
> Comment support is still missing but easily added :)
>
> I tried this patch and it works for my purposes.
>
> Also, your getline is dynamically re-allocating lines of arbitrary length.
> Possibly that's not needed.  We'll typically read "+t schema.relname",
> which is
> 132 chars.  Maybe it's sufficient to do
> char buf[1024];
> fgets(buf);
> if strchr(buf, '\n') == NULL: error();
> ret = pstrdup(buf);
>

63 bytes is the max effective identifier size, but it is not the max size of
identifiers. A buffer of 1024 bytes will very probably be enough for
everything, but I do not want to introduce a new magic limit, especially when
a dynamic implementation is not too hard.

Table names can be very long - sometimes the data names (table names) are
stored in external storage at full length, and it would not be practical to
require truncating them in the filter file.

For this case it is very effective, because the resized (increased) buffer is
reused for the following rows, so realloc should not happen often. So when I
have to choose between two implementations with similar complexity, I prefer
the more dynamic code without hardcoded limits. This dynamism has no real
overhead.


> In any case, you could have getline return a char* and (rather than
> following
> GNU) no need to take char**, int* parameters to conflate inputs and
> outputs.
>

No, it has a special benefit: it eliminates the short malloc/free cycle.
When some line is longer, the buffer is increased, and other rows of the same
or smaller size then need no realloc.


> I realized that --filter has an advantage over the previous implementation
> (with multiple --exclude-* and --include-*) in that it's possible to use
> stdin
> for includes *and* excludes.
>

yes, it looks like the better choice


> By chance, I had the opportunity yesterday to re-use with rsync a regex
> that
> I'd previously been using with pg_dump and grep.  What this patch calls
> "--filter" in rsync is called "--filter-from".  rsync's --filter-from
> rejects
> filters of length longer than max filename, so I had to split it up into
> multiple lines instead of using regex alternation ("|").  This option is a
> close parallel in pg_dump.
>

we can talk about option name - maybe "--filter-from" is better than just
"--filter"

Regards

Pavel




>
> --
> Justin
>


Re: proposal: possibility to read dumped table's name from file

2020-07-05 Thread Pavel Stehule
On Sat, Jun 27, 2020 at 14:55, vignesh C wrote:

> On Thu, Jun 11, 2020 at 1:07 PM Pavel Stehule 
> wrote:
> >
> > Thank you for comments, attached updated patch
> >
>
> Few comments:
> +invalid_filter_format(char *message, char *filename, char *line, int
> lineno)
> +{
> +   char   *displayname;
> +
> +   displayname = *filename == '-' ? "stdin" : filename;
> +
> +   pg_log_error("invalid format of filter file \"%s\": %s",
> +displayname,
> +message);
> +
> +   fprintf(stderr, "%d: %s\n", lineno, line);
> +   exit_nicely(1);
> +}
>
> I think fclose is missing here.

done


>
> +   if (line[chars - 1] ==
> '\n')
> +   line[chars - 1] =
> '\0';
> Should we check for '\r' also to avoid failures in some platforms.
>

I checked other usages of fgets in the Postgres source code, and everywhere
the test is on \n.

When I did some quick research:
https://stackoverflow.com/questions/12769289/carriage-return-by-fgets - the \r
in this case should be thrown away by libc on Microsoft platforms.

https://stackoverflow.com/questions/2061334/fgets-linux-vs-mac

\n has been used on Mac OS X since 2001 .. I am not sure if Mac OS 9 should
still be supported.




>
> + 
> +  --filter= class="parameter">filename
> +  
> +   
> +Read filters from file. Format "(+|-)(tnfd) objectname:
> +   
> +  
> + 
>
> I felt some documentation is missing here. We could mention that the
> options t, n, f, d control the table, schema, foreign server data and
> exclude-table-data patterns.
>

I plan to complete the documentation when the design is settled. It was not
clear whether people prefer long or short forms of the option names.


> Instead of using tnfd, if we could use the same options as existing
> pg_dump options it will be less confusing.
>

It is almost the same:

+/-t .. tables
+/-n .. schemas
-d .. exclude table data (there is no short option for --exclude-table-data)
+f .. include foreign data (there is no short option for
--include-foreign-data)

So there is still an open question whether to use the +-tnfd system or a
system based on long options:

table foo
exclude-table foo
schema xx
exclude-schema xx
include-foreign-data yyy
exclude-table-data zzz


Typically these files will be generated by scripts and processed via a pipe,
so I see just two arguments for and against:

short format - there is less chance of typos (but it is not fully consistent
with the pg_dump option names)
long format - it is self-documenting (and fully consistent with pg_dump)

In this case I prefer the short form .. it is more comfortable for users, and
there are only a few variants, so it is not necessary to use a too verbose
format. But I do not hold this opinion strongly and I'll accept any common
agreement.

Regards

Updated patch attached



> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
>
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 2f0807e912..332f0c312f 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -813,6 +813,15 @@ PostgreSQL documentation
   
  
 
+ 
+  --filter=filename
+  
+   
+Read filters from file. Format: "(+|-)(tnfd) objectname"
+   
+  
+ 
+
  
   --load-via-partition-root
   
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index a41a3db876..a3f8bebf52 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -290,6 +290,8 @@ static void appendReloptionsArrayAH(PQExpBuffer buffer, const char *reloptions,
 static char *get_synchronized_snapshot(Archive *fout);
 static void setupDumpWorker(Archive *AHX);
 static TableInfo *getRootTableInfo(TableInfo *tbinfo);
+static size_t pg_getline(char **lineptr, size_t *n, FILE *fp);
+static void exit_invalid_filter_format(FILE *fp, char *filename, char *message, char *line, int lineno);
 
 
 int
@@ -364,6 +366,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"extra-float-digits", required_argument, NULL, 8},
+		{"filter", required_argument, NULL, 12},
 		{"if-exists", no_argument, _exists, 1},
 		{"inserts", no_argument, NULL, 9},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -603,6 +606,141 @@ main(int argc, char **argv)
 		  optarg);
 break;
 
+			case 12:			/* filter implementation */
+{
+	FILE	   *f;
+	char	   *line;
+	ssize_t		chars;
+	size_t		line_size = 1024;
+	int			lineno = 0;
+
+	/* use "-" as symbol for stdin */
+	if (strcmp(optarg, "-") != 0)
+	{
+		f = fopen(optarg, "r");
+		if (!f)
+			fatal("could not open the input file \"%s\": %m",
+  optarg);
+	}
+	else
+		f = stdin;
+
+	line = pg_malloc(line_size);
+
+	while ((chars = pg_getline(&line, &line_size, f)) != 

Re: range_agg

2020-07-05 Thread Paul A Jungwirth
On Sun, Jul 5, 2020 at 12:11 PM Paul A Jungwirth
 wrote:
>
> Just knowing that arrays are
> something we do this for is enough to hunt for clues, but if anyone
> can point me more directly to code that will help me do it for
> multiranges, I'd be appreciative.

It looks like expandeddatum.h is where I should be looking. . . .

Paul




Re: Persist MVCC forever - retain history

2020-07-05 Thread Thomas Kellerer

Konstantin Knizhnik schrieb am 05.07.2020 um 19:31:

I am surprised that you are saying you didn't feel much interest. My
reading of the thread is the opposite, that there was quite some
interest, but that there are technical challenges to overcome. So you
gave up on that work?

No, I have not given up.
But...
There are well-known problems with the proposed approach:
1. It does not support schema changes.
2. It is not compatible with DROP/TRUNCATE.
3. The presence of a large number of aborted transactions can slow down data access.
4. The semantics of joining tables with different timestamps are obscure.


Oracle partially solved this (at least 1,3 and 4 - don't know about 3) by storing the old 
versions in a separate table that is automatically managed if you enable the feature. If 
a query uses the AS OF to go "back in time", it's rewritten to access the 
history table.
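
For example, a flashback-style query there looks roughly like this (the table
name is made up):

  SELECT * FROM orders AS OF TIMESTAMP (SYSTIMESTAMP - INTERVAL '1' HOUR);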

Thomas





Re: Ideas about a better API for postgres_fdw remote estimates

2020-07-05 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> * Rather than adding a core-server feature, the remote-end part of the new
> >> API should be a function installed by an extension (either postgres_fdw
> >> itself, or a new extension "postgres_fdw_remote" or the like).
> 
> > I'm trying to figure out why it makes more sense to use
> > 'postgres_fdw_support(query text)', which would still do parse/plan and
> > return EXPLAIN-like data, rather than having:
> 
> > EXPLAIN (FORMAT JSON, FDW true) query ...
> 
> I see a couple of reasons not to do it like that:
> 
> 1. This is specific to postgres_fdw.  Some other extension might want some
> other data, and different versions of postgres_fdw might want different
> data.  So putting it into core seems like the wrong thing.

Another extension or use-case might want exactly the same information
too though.  In a way, we'd be 'hiding' that information from other
potential users unless they want to install their own extension, which
is a pretty big leap.  Are we sure this information wouldn't be at all
interesting to pgAdmin4 or explain.depesz.com?

> 2. Wedging this into EXPLAIN would be quite ugly, because (at least
> as I envision it) the output would have just about nothing to do with
> any existing EXPLAIN output.

This is a better argument for not making it part of EXPLAIN, though I
don't really feel like I've got a decent idea of what you are suggesting
the output *would* look like, so it's difficult for me to agree (or
disagree) about this particular point.

> 3. We surely would not back-patch a core change like this.  OTOH, if
> the added infrastructure is in an extension, somebody might want to
> back-patch that (even if unofficially).  This argument falls to the
> ground of course if we're forced to make any core changes to be able
> to get at the data we need; but I'm not sure that will be needed.

Since postgres_fdw is part of core and core's release cycle, and the
packagers manage the extensions from core in a way that they have to
match up, this argument doesn't hold any weight with me.  For this to be
a viable argument, we would need to segregate extensions from core and
give them their own release cycle and clear indication of which
extension versions work with which PG major versions, etc.  I'm actually
generally in support of *that* idea- and with that, would agree with
your point 3 above, but that's not the reality of today.

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: Default gucs for EXPLAIN

2020-07-05 Thread Justin Pryzby
I think this got ample review, so I've set it to "Waiting".

-- 
Justin




Re: Ideas about a better API for postgres_fdw remote estimates

2020-07-05 Thread Tom Lane
Stephen Frost  writes:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> * Rather than adding a core-server feature, the remote-end part of the new
>> API should be a function installed by an extension (either postgres_fdw
>> itself, or a new extension "postgres_fdw_remote" or the like).

> I'm trying to figure out why it makes more sense to use
> 'postgres_fdw_support(query text)', which would still do parse/plan and
> return EXPLAIN-like data, rather than having:

> EXPLAIN (FORMAT JSON, FDW true) query ...

I see a couple of reasons not to do it like that:

1. This is specific to postgres_fdw.  Some other extension might want some
other data, and different versions of postgres_fdw might want different
data.  So putting it into core seems like the wrong thing.

2. Wedging this into EXPLAIN would be quite ugly, because (at least
as I envision it) the output would have just about nothing to do with
any existing EXPLAIN output.

3. We surely would not back-patch a core change like this.  OTOH, if
the added infrastructure is in an extension, somebody might want to
back-patch that (even if unofficially).  This argument falls to the
ground of course if we're forced to make any core changes to be able
to get at the data we need; but I'm not sure that will be needed.

regards, tom lane




Re: Persist MVCC forever - retain history

2020-07-05 Thread Konstantin Knizhnik




On 05.07.2020 08:48, Mitar wrote:

Hi!

On Fri, Jul 3, 2020 at 12:29 AM Konstantin Knizhnik
 wrote:

Did you read this thread:
https://www.postgresql.org/message-id/flat/78aadf6b-86d4-21b9-9c2a-51f1efb8a499%40postgrespro.ru
I have proposed a patch for supporting time travel (AS OF) queries.
But I didn't feel much interest in it from the community.

Oh, you went much further than me in this thinking. Awesome!

I am surprised that you are saying you didn't feel much interest. My
reading of the thread is the opposite, that there was quite some
interest, but that there are technical challenges to overcome. So you
gave up on that work?

No, I have not given up.
But...
There are well-known problems with the proposed approach:
1. It does not support schema changes.
2. It is not compatible with DROP/TRUNCATE.
3. The presence of a large number of aborted transactions can slow down data
access.
4. The semantics of joining tables with different timestamps are obscure.

I do not know how to address these issues. I am not sure how critical they
all are, and whether they make this approach unusable.
Also, there is a fairly common opinion that time travel should be done at the
application level and that we do not need to support it at the database kernel level.


I will be glad to continue working in this direction if there is some
interest in this topic and somebody is going to try/review this feature.
It is very difficult to find motivation for developing new features
if you are absolutely sure that they will never be accepted by the community.







Re: SQL/JSON: functions

2020-07-05 Thread Justin Pryzby
On Mon, Mar 23, 2020 at 08:28:52PM +0300, Nikita Glukhov wrote:
> Attached 47th version of the patches.

The patch checker/cfbot says this doesn't apply; could you send a rebased
version?

-- 
Justin




Re: Inlining of couple of functions in pl_exec.c improves performance

2020-07-05 Thread Tom Lane
I wrote:
> I attach your original 0003 here (it still applies, with some line
> offsets).  That's just so the cfbot doesn't get confused about what
> it's supposed to test now.

Pushed that part now, too.

BTW, the first test run I did on this (on x86_64) was actually several
percent *slower* than HEAD.  I couldn't reproduce that after restarting
the postmaster; all later tests concurred that there was a speedup.
So I suppose that was just some phase-of-the-moon effect, perhaps caused
by an ASLR-dependent collision of bits of code in processor cache.
Still, that illustrates the difficulty of getting useful, reproducible
improvements when doing this kind of hacking.  I tend to think that
most of the time we're better off leaving this to the compiler.

regards, tom lane




Re: range_agg

2020-07-05 Thread Justin Pryzby
On Sat, Apr 11, 2020 at 09:36:37AM -0700, Paul A Jungwirth wrote:
> On Fri, Apr 10, 2020 at 8:44 PM Paul A Jungwirth 
>  wrote:
> > Thanks, and thanks for your v17 also. Here is a patch building on that
> > and adding support for anycompatiblemultirange.
> 
> Here is a v19 that just moved the multirange tests to a new parallel
> group to avoid a max-20-tests error. Sorry about that!

This needs to be rebased; set cfbot to "waiting".

-- 
Justin




Re: SQL/JSON: JSON_TABLE

2020-07-05 Thread Justin Pryzby
On Mon, Mar 23, 2020 at 08:33:34PM +0300, Nikita Glukhov wrote:
> On 23.03.2020 19:24, Pavel Stehule wrote:
> > This patch needs rebase
> 
> Attached 43rd version of the patches based on the latest (v47) SQL/JSON
> functions patches.

It looks like this needs to be additionally rebased - I will set cfbot to
"Waiting".

-- 
Justin




Re: Ideas about a better API for postgres_fdw remote estimates

2020-07-05 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> In <1116564.1593813...@sss.pgh.pa.us> I wrote:
> > I wonder whether someday we ought to invent a new API that's more
> > suited to postgres_fdw's needs than EXPLAIN is.  It's not like the
> > remote planner doesn't know the number we want; it just fails to
> > include it in EXPLAIN.
> 
> I've been thinking about this a little more, and I'd like to get some
> ideas down on electrons before they vanish.
> 
> The current method for postgres_fdw to obtain remote estimates is to
> issue an EXPLAIN command to the remote server and then decipher the
> result.  This has just one big advantage, which is that it works
> against existing, even very old, remote PG versions.  In every other
> way it's pretty awful: it involves a lot of cycles on the far end
> to create output details we don't really care about, it requires a
> fair amount of logic to parse that output, and we can't get some
> details that we *do* care about (such as the total size of the foreign
> table, as per the other discussion).
> 
> We can do better.  I don't propose removing the existing logic, because
> being able to work against old remote PG versions seems pretty useful.
> But we could probe at connection start for whether the remote server
> has support for a better way, and then use that way if available.

I agree we can, and should, try to do better here.

> What should the better way look like?  I suggest the following:
> 
> * Rather than adding a core-server feature, the remote-end part of the new
> API should be a function installed by an extension (either postgres_fdw
> itself, or a new extension "postgres_fdw_remote" or the like).  One
> attraction of this approach is that it'd be conceivable to back-port the
> new code into existing PG releases by updating the extension.  Also
> there'd be room for multiple versions of the support.  The
> connection-start probe could be of the form "does this function exist
> in pg_proc?".
> 
> * I'm imagining the function being of the form
> 
> function pg_catalog.postgres_fdw_support(query text) returns something
> 
> where the input is still the text of a query we're considering issuing,
> and the output is some structure that contains the items of EXPLAIN-like
> data we need, but not the items we don't.  The implementation of the
> function would run the query through parse/plan, then pick out the
> data we want and return that.
> 
> * We could do a lot worse than to have the "structure" be JSON.
> This'd allow structured, labeled data to be returned; it would not be
> too difficult to construct, even in PG server versions predating the
> addition of JSON logic to the core; and the receiving postgres_fdw
> extension could use the core's JSON logic to parse the data.

I also tend to agree with using JSON for this.
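
For illustration only, a hypothetical call and result shape (the field names
here are invented, not a proposal for the actual format):

  SELECT postgres_fdw_support('SELECT a, b FROM remote_tab WHERE a > 42');
  -- might return something like:
  --   {"rows": 1250, "width": 16,
  --    "startup_cost": 0.0, "total_cost": 187.5,
  --    "relpages": 393, "reltuples": 100000}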

> * The contents of the structure need to be designed with forethought
> for extensibility, but this doesn't seem hard if it's all a collection
> of labeled fields.  We can just say that the recipient must ignore
> fields it doesn't recognize.  Once a given field has been defined, we
> can't change its contents, but we can introduce new fields as needed.
> Note that I would not be in favor of putting an overall version number
> within the structure; that's way too coarse-grained.

This also makes sense to me.

> I'm not planning to do anything about these ideas myself, at least
> not in the short term.  But perhaps somebody else would like to
> run with them.

I'm trying to figure out why it makes more sense to use
'postgres_fdw_support(query text)', which would still do parse/plan and
return EXPLAIN-like data, rather than having:

EXPLAIN (FORMAT JSON, FDW true) query ...

(Or, perhaps better, individual boolean options for whatever stuff we
want to ask for, or to exclude if we don't want it, so that other tools
could use this...).

Thanks,

Stephen


signature.asc
Description: PGP signature


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-07-05 Thread Dilip Kumar
On Tue, Jun 30, 2020 at 10:13 AM Dilip Kumar  wrote:
>
> On Tue, Jun 30, 2020 at 9:20 AM Amit Kapila  wrote:
> >
> > On Mon, Jun 29, 2020 at 4:24 PM Dilip Kumar  wrote:
> > >
> > > On Mon, Jun 22, 2020 at 4:30 PM Amit Kapila  
> > > wrote:
> > > >
> > > >
> > > > Other than above tests, can we somehow verify that the invalidations
> > > > generated at commit time are the same as what we do with this patch?
> > > > We have verified with individual commands but it would be great if we
> > > > can verify for the regression tests.
> > >
> > > I have verified this using a few random test cases.  For verifying
> > > this I have made some temporary code changes with an assert as shown
> > > below.  Basically, on DecodeCommit we call
> > > ReorderBufferAddInvalidations function only for an assert checking.
> > >
> > > -void
> > >  ReorderBufferAddInvalidations(ReorderBuffer *rb, TransactionId xid,
> > >   XLogRecPtr lsn, Size nmsgs,
> > > -  SharedInvalidationMessage *msgs)
> > > +  SharedInvalidationMessage *msgs, bool commit)
> > >  {
> > > ReorderBufferTXN *txn;
> > >
> > > txn = ReorderBufferTXNByXid(rb, xid, true, NULL, lsn, true);
> > > -
> > > +   if (commit)
> > > +   {
> > > +   Assert(txn->ninvalidations == nmsgs);
> > > +   return;
> > > +   }
> > >
> > > The result is that for a normal local test it works fine.  But with the
> > > regression suite, it hits an assert at many places because if a
> > > rollback of a subtransaction is involved then at commit time the
> > > invalidation messages are not logged, whereas with command-time
> > > invalidation they are logged.
> > >
> >
> > Yeah, somehow, we need to ignore rollback to savepoint tests and
> > verify for others.
>
> Yeah, I have run the regression suite and I can see a lot of failures;
> maybe we can somehow see the diff and confirm that all the failures
> are due to rollback to savepoint only.  I will work on this.

I have compared the changes logged at command end vs. those logged at commit
time.  I have ignored the invalidations for transactions that have any
aborted subtransaction in them.  While testing this I found one issue: if
some invalidations are generated between the last command counter increment
and the commit of the transaction, then those were not logged.  I have fixed
the issue by logging the pending invalidations in RecordTransactionCommit.
I will include the changes in the next patch set.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com


v32-0002-WAL-Log-invalidations-at-command-end-with-wal_le.patch
Description: Binary data


Re: [Patch] Invalid permission check in pg_stats for functional indexes

2020-07-05 Thread Tom Lane
Daniel Gustafsson  writes:
> This patch still hasn't progressed since Tom's draft patch; is anyone still
> interested in pursuing it, or should we close it for now?

It seems clearly reasonable to me to close the CF item as RWF,
expecting that a new one can be made whenever somebody re-tackles
this problem.  The CF list is not a to-do list.

regards, tom lane




Re: Built-in connection pooler

2020-07-05 Thread Konstantin Knizhnik

Thank you for your help.

On 05.07.2020 07:17, Jaime Casanova wrote:
You should also allow cursors without the WITH HOLD option, or is there
something I'm missing?


Yes, good point.

a few questions about tainted backends:
- why the use of check_primary_key() and check_foreign_key() in
contrib/spi/refint.c make the backend tainted?


I think this is because without it the contrib test does not pass with the
connection pooler.
This extension uses static variables which are assumed to be session-specific,
but in the case of the connection pooler they are shared by all backends.
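
As a generic, self-contained illustration of the hazard (this is not the
actual refint.c code):

#include <stdio.h>

/* The extension treats this as "my session's cached state", but it is
 * really per-process state. */
static int cached_setting = 0;

static void
handle_request(const char *session, int session_value)
{
	if (cached_setting == 0)
		cached_setting = session_value;	/* initialized once per process */
	printf("%s sees %d\n", session, cached_setting);
}

int
main(void)
{
	/*
	 * With a connection pooler, two different client sessions can be
	 * scheduled onto the same backend process, so the second session
	 * silently inherits the first session's cached state.
	 */
	handle_request("session A", 42);
	handle_request("session B", 7);		/* prints 42, not 7 */
	return 0;
}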



- the comment in src/backend/commands/sequence.c needs some fixes, it
seems just quickly typed


Sorry, done.



some usability problems:
- I compiled this on a Debian machine with "--enable-debug
--enable-cassert --with-pgport=54313", so nothing fancy
- then make, make install, and initdb: so far so good

configuration:
listen_addresses = '*'
connection_proxies = 20

and I got this:

"""
jcasanov@DangerBox:/opt/var/pgdg/14dev$ /opt/var/pgdg/14dev/bin/psql
-h 127.0.0.1 -p 6543 postgres
psql: error: could not connect to server: no se pudo conectar con el
servidor: No existe el fichero o el directorio
¿Está el servidor en ejecución localmente y aceptando
conexiones en el socket de dominio Unix «/var/run/postgresql/.s.PGSQL.54313»?
"""

but connecting to the postgres port works well
"""
jcasanov@DangerBox:/opt/var/pgdg/14dev$ /opt/var/pgdg/14dev/bin/psql
-h 127.0.0.1 -p 54313 postgres
psql (14devel)
Type "help" for help.

postgres=# \q
jcasanov@DangerBox:/opt/var/pgdg/14dev$ /opt/var/pgdg/14dev/bin/psql
-p 54313 postgres
psql (14devel)
Type "help" for help.

postgres=#
"""

PS: unix_socket_directories is set to /tmp, and because I'm not
doing this as the postgres user I cannot use /var/run/postgresql

It looks like for some reason your Postgres was not configured to accept
TCP connections.

It accepts only local connections through Unix sockets.
But the pooler does not listen on Unix sockets, because there is no point
in pooling local connections.


I have done the same steps as you and have no problem accessing the pooler:

knizhnik@xps:~/postgresql.vanilla$ psql postgres -h 127.0.0.1 -p 6543
psql (14devel)
Type "help" for help.

postgres=# \q



Please note that if I specify a nonexistent port, then I get an error
message which is different from yours:


knizhnik@xps:~/postgresql.vanilla$ psql postgres -h 127.0.0.1 -p 65433
psql: error: could not connect to server: could not connect to server: 
Connection refused

    Is the server running on host "127.0.0.1" and accepting
    TCP/IP connections on port 65433?


So Postgres does not mention the Unix socket path in this case. This makes me
think that your server is not accepting TCP connections at all (despite

listen_addresses = '*'

)





Re: proposal: schema variables

2020-07-05 Thread Pavel Stehule
čt 21. 5. 2020 v 14:49 odesílatel Pavel Stehule 
napsal:

>
>
> čt 21. 5. 2020 v 13:34 odesílatel Amit Kapila 
> napsal:
>
>> On Thu, May 21, 2020 at 3:41 PM Pavel Stehule 
>> wrote:
>> >
>> > Hi
>> >
>> > just rebase without any other changes
>> >
>>
>> You seem to forget attaching the rebased patch.
>>
>
> I am sorry
>
> attached.
>

fresh rebase

Regards

Pavel


> Pavel
>
>
>> --
>> With Regards,
>> Amit Kapila.
>> EnterpriseDB: http://www.enterprisedb.com
>>
>


schema-variables-20200705.patch.gz
Description: application/gzip


Re: TAP tests and symlinks on Windows

2020-07-05 Thread Andrew Dunstan


On 7/3/20 10:11 AM, Peter Eisentraut wrote:
> On 2020-06-30 14:13, Michael Paquier wrote:
>> Attached is an updated patch, where I have tried to use a better
>> wording in all the code paths involved.
>
> This new patch doesn't work for me on MSYS2 yet.
>
> It fails right now in 010_pg_basebackup.pl at
>
>     my $realTsDir  = TestLib::perl2host("$shorter_tempdir/tblspc1");
>
> with chdir: No such file or directory.  Because perl2host requires the
> parent directory of the argument to exist, but here it doesn't.
>
> If I add
>
>     mkdir $shorter_tempdir;
>
> above it, then it proceeds past that point, but then the CREATE
> TABLESPACE command fails with No such file or directory.  I think the
> call
>
>     symlink "$tempdir", $shorter_tempdir;
>
> creates a directory inside $shorter_tempdir, since it now exists, per
> my above change, rather than in place of $shorter_tempdir.
>
> I think all of this is still a bit too fragile it needs further
> consideration.



I'll see what I can come up with.


cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Online verification of checksums

2020-07-05 Thread Daniel Gustafsson
> On 6 Apr 2020, at 23:15, Michael Banck  wrote:

> Probably we need to take a step back;

This patch has been Waiting on Author since the last commitfest (and no longer
applies as well), and by the sounds of the thread there are some open issues
with it.  Should it be Returned with Feedback to be re-opened with a fresh take
on it?

cheers ./daniel



Re: [Patch] Invalid permission check in pg_stats for functional indexes

2020-07-05 Thread Daniel Gustafsson
> On 25 Mar 2020, at 15:52, David Steele  wrote:

> On 12/26/19 6:46 PM, Tom Lane wrote:
>> Awhile back I wrote:
>>> Actually ... maybe we don't need to change the view definition at all,
>>> but instead just make has_column_privilege() do something different
>>> for indexes than it does for other relation types.  It's dubious that
>>> applying that function to an index yields anything meaningful today,
>>> so we could redefine what it returns without (probably) breaking
>>> anything.  That would at least give us an option to back-patch, too,
>>> though the end result might be complex enough that we don't care to
>>> risk it.
>> In hopes of resurrecting this thread, here's a draft patch that does
>> it like that (and also fixes row_security_active(), as otherwise this
>> probably creates a security hole in pg_stats).
> 
> Do you know when you will have a chance to look at this patch?
> 
> Tom made a suggestion up-thread about where the regression tests could go.

This patch still hasn't progressed since Tom's draft patch; is anyone still
interested in pursuing it, or should we close it for now?

cheers ./daniel




Re: psql - improve test coverage from 41% to 88%

2020-07-05 Thread Daniel Gustafsson
> On 24 Mar 2020, at 15:47, David Steele  wrote:

> This patch no longer applies: http://cfbot.cputube.org/patch_27_2262.log
> 
> CF entry has been updated to Waiting on Author.

This patch hasn't been updated and still doesn't apply; do you intend to rebase
it during this commitfest, or should we move it to Returned with Feedback?  It
can always be re-opened at a later date.

cheers ./daniel



Re: doc: vacuum full, fillfactor, and "extra space"

2020-07-05 Thread Daniel Gustafsson
> On 28 Mar 2020, at 11:23, Amit Kapila  wrote:
> 
> On Wed, Jan 29, 2020 at 9:10 PM Peter Eisentraut
>  wrote:
>> 
>> On 2020-01-20 06:30, Justin Pryzby wrote:
>> 
>> About your patch, I don't think this is clearer.  The fillfactor stuff
>> is valid to be mentioned, but the way it's being proposed makes it sound
>> like the main purpose of VACUUM FULL is to bloat the table to make
>> fillfactor room.  The "no extra space" wording made sense to me, with
>> the fillfactor business perhaps worth being put into a parenthetical note.
> 
> Justin, would you like to address this comment of Peter E.?

This patch has been Waiting on Author since April; will you have time to
address the questions during this commitfest, or should it be moved to Returned
with Feedback?

cheers ./daniel



Re: proposal - function string_to_table

2020-07-05 Thread Pavel Stehule
pá 5. 6. 2020 v 13:55 odesílatel Pavel Stehule 
napsal:

> Hi
>
> čt 4. 6. 2020 v 11:49 odesílatel movead...@highgo.ca 
> napsal:
>
>> +{ oid => '2228', descr => 'split delimited text',
>> +  proname => 'string_to_table', prorows => '1000', proretset => 't',
>> +  prorettype => 'text', proargtypes => 'text text',
>> +  prosrc => 'text_to_table' },
>> +{ oid => '2282', descr => 'split delimited text with null string',
>> +  proname => 'string_to_table', prorows => '1000', proretset => 't',
>> +  prorettype => 'text', proargtypes => 'text text text',
>> +  prosrc => 'text_to_table_null' },
>>
>> I went through the patch, and everything looks good to me. But I do not know
>> why it needs a 'text_to_table_null()'; it's OK to put a 'text_to_table'
>> there, I think.
>>
>
> It is a convention in Postgres that every unique SQL signature has its own
> unique internal C function.
>
> I am sending a refreshed patch.
>

rebase

Regards

Pavel


> Regards
>
> Pavel
>
>
>
>
>>
>> --
>> Regards,
>> Highgo Software (Canada/China/Pakistan)
>> URL : www.highgo.ca
>> EMAIL: mailto:movead(dot)li(at)highgo(dot)ca
>>
>
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index f065856535..e9f10c162f 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -3458,6 +3458,27 @@ repeat('Pg', 4) PgPgPgPg

   
 
+  
+   
+
+ string_to_table
+
+string_to_table ( string text, delimiter text  , nullstr text   )
+set of text
+   
+   
+splits string into table using supplied delimiter and
+optional null string.
+   
+   
+string_to_table('xx~^~yy~^~zz', '~^~', 'yy')
+
+xx
+yy,
+zz
+   
+  
+
   

 
@@ -16747,21 +16768,6 @@ strict $.track.segments[*].location

   
 
-  
-   
-string starts with string
-boolean
-   
-   
-Tests whether the second operand is an initial substring of the first
-operand.
-   
-   
-jsonb_path_query('["John Smith", "Mary Stone", "Bob Johnson"]', '$[*] ? (@ starts with "John")')
-"John Smith"
-   
-  
-
   

 exists ( path_expression )
diff --git a/src/backend/utils/adt/varlena.c b/src/backend/utils/adt/varlena.c
index df10bfb906..87980b0803 100644
--- a/src/backend/utils/adt/varlena.c
+++ b/src/backend/utils/adt/varlena.c
@@ -26,6 +26,7 @@
 #include "lib/hyperloglog.h"
 #include "libpq/pqformat.h"
 #include "miscadmin.h"
+#include "nodes/execnodes.h"
 #include "parser/scansup.h"
 #include "port/pg_bswap.h"
 #include "regex/regex.h"
@@ -35,6 +36,7 @@
 #include "utils/memutils.h"
 #include "utils/pg_locale.h"
 #include "utils/sortsupport.h"
+#include "utils/tuplestore.h"
 #include "utils/varlena.h"
 
 
@@ -92,6 +94,16 @@ typedef struct
 	pg_locale_t locale;
 } VarStringSortSupport;
 
+/*
+ * Holds target metadata used for split string to array or to table.
+ */
+typedef struct
+{
+	ArrayBuildState	*astate;
+	Tuplestorestate *tupstore;
+	TupleDesc	tupdesc;
+} SplitStringTargetData;
+
 /*
  * This should be large enough that most strings will fit, but small enough
  * that we feel comfortable putting it on the stack
@@ -139,7 +151,7 @@ static bytea *bytea_substring(Datum str,
 			  bool length_not_specified);
 static bytea *bytea_overlay(bytea *t1, bytea *t2, int sp, int sl);
 static void appendStringInfoText(StringInfo str, const text *t);
-static Datum text_to_array_internal(PG_FUNCTION_ARGS);
+static bool text_to_array_internal(FunctionCallInfo fcinfo, SplitStringTargetData *tstate);
 static text *array_to_text_internal(FunctionCallInfo fcinfo, ArrayType *v,
 	const char *fldsep, const char *null_string);
 static StringInfo makeStringAggState(FunctionCallInfo fcinfo);
@@ -4679,7 +4691,19 @@ text_isequal(text *txt1, text *txt2, Oid collid)
 Datum
 text_to_array(PG_FUNCTION_ARGS)
 {
-	return text_to_array_internal(fcinfo);
+	SplitStringTargetData tstate;
+
+	/* reset tstate */
+	memset(&tstate, 0, sizeof(tstate));
+
+	if (!text_to_array_internal(fcinfo, &tstate))
+		PG_RETURN_NULL();
+
+	if (!tstate.astate)
+		PG_RETURN_ARRAYTYPE_P(construct_empty_array(TEXTOID));
+
+	PG_RETURN_ARRAYTYPE_P(makeArrayResult(tstate.astate,
+		  CurrentMemoryContext));
 }
 
 /*
@@ -4693,16 +4717,98 @@ text_to_array(PG_FUNCTION_ARGS)
 Datum
 text_to_array_null(PG_FUNCTION_ARGS)
 {
-	return text_to_array_internal(fcinfo);
+	return text_to_array(fcinfo);
+}
+
+/*
+ * text_to_table
+ * Parse input string and returns substrings as a table.
+ */
+Datum
+text_to_table(PG_FUNCTION_ARGS)
+{
+	ReturnSetInfo	   *rsi = (ReturnSetInfo *) fcinfo->resultinfo;
+	SplitStringTargetData tstate;
+	MemoryContext		old_cxt;
+
+	/* check to see if caller supports us returning a tuplestore */
+	if (rsi == NULL || !IsA(rsi, ReturnSetInfo))
+		ereport(ERROR,
+(errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
+ errmsg("set-valued function called in context that cannot 

Re: Parallell hashjoin sometimes ignores temp_tablespaces

2020-07-05 Thread Magnus Hagander
On Fri, Jul 3, 2020 at 11:02 PM Tom Lane  wrote:

> Magnus Hagander  writes:
> > On Fri, Jul 3, 2020 at 7:06 PM Tom Lane  wrote:
> >> Shouldn't that be inserting MyDatabaseTableSpace?  I see no other places
> >> anywhere that are forcing temp stuff into pg_default like this.
>
> > Yeah, looking at it again, I think it should. I can't see any reason why
> it
> > should enforce pg_default.
>
> OK, pushed with that additional correction.
>

LGTM, thanks!

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-07-05 Thread Dilip Kumar
On Sat, Jul 4, 2020 at 11:35 AM Amit Kapila  wrote:
>
> On Tue, Jun 30, 2020 at 5:20 PM Amit Kapila  wrote:
> >
> > Let me know what you think about the above changes.
> >
>
> I went ahead and made few changes in
> 0005-Implement-streaming-mode-in-ReorderBuffer which are explained
> below.  I have few questions and suggestions for the patch as well
> which are also covered in below points.
>
> 1.
> + if (prev_lsn == InvalidXLogRecPtr)
> + {
> + if (streaming)
> + rb->stream_start(rb, txn, change->lsn);
> + else
> + rb->begin(rb, txn);
> + stream_started = true;
> + }
>
> I don't think we want to move begin callback here that will change the
> existing semantics, so it is better to move begin at its original
> position. I have made the required changes in the attached patch.

Looks good to me.

> 2.
> ReorderBufferTruncateTXN()
> {
> ..
+ dlist_foreach_modify(iter, &txn->changes)
> + {
> + ReorderBufferChange *change;
> +
> + change = dlist_container(ReorderBufferChange, node, iter.cur);
> +
> + /* remove the change from it's containing list */
+ dlist_delete(&change->node);
> +
> + ReorderBufferReturnChange(rb, change);
> + }
> ..
> }
>
> I think here we can add an Assert that we're not mixing changes from
> different transactions.  See the changes in the patch.

Looks fine.

> 3.
> SetupCheckXidLive()
> {
> ..
> + /*
> + * setup CheckXidAlive if it's not committed yet. We don't check if the xid
> + * aborted. That will happen during catalog access.  Also, reset the
> + * bsysscan flag.
> + */
> + if (!TransactionIdDidCommit(xid))
> + {
> + CheckXidAlive = xid;
> + bsysscan = false;
> ..
> }
>
> What is the need to reset bsysscan flag here if we are already
> resetting on error (like in the previous patch sent by me)?

Yeah, now we don't need this.

> 4.
> ReorderBufferProcessTXN()
> {
> ..
> ..
> + /* Reset the CheckXidAlive */
> + if (streaming)
> + CheckXidAlive = InvalidTransactionId;
> ..
> }
>
> Similar to the previous point, we don't need this as well because
> AbortCurrentTransaction would have taken care of this.

Right

> 5.
> + * XXX Do we need to check if the transaction has some changes to stream
> + * (maybe it got streamed right before the commit, which attempts to
> + * stream it again before the commit)?
> + */
> +static void
> +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
>
> The above comment doesn't make much sense to me, so I have removed it.
> Basically, if there are no changes before commit, we still need to
> send commit and anyway if there are no more changes
> ReorderBufferProcessTXN will not do anything.

ok

> 6.
> +ReorderBufferStreamTXN(ReorderBuffer *rb, ReorderBufferTXN *txn)
> {
> ..
> if (txn->snapshot_now == NULL)
> + {
> + dlist_iter subxact_i;
> +
> + /* make sure this transaction is streamed for the first time */
> + Assert(!rbtxn_is_streamed(txn));
> +
> + /* at the beginning we should have invalid command ID */
> + Assert(txn->command_id == InvalidCommandId);
> +
+ dlist_foreach(subxact_i, &txn->subtxns)
> + {
> + ReorderBufferTXN *subtxn;
> +
> + subtxn = dlist_container(ReorderBufferTXN, node, subxact_i.cur);
> + ReorderBufferTransferSnapToParent(txn, subtxn);
> + }
> ..
> }
>
> Here, it is possible that there is no base_snapshot for txn, so we
> need a check for that similar to ReorderBufferCommit.
>
> 7.  Apart from the above, I made few changes in comments and ran pgindent.

Ok

> 8. We can't stream the transaction before we reach the
> SNAPBUILD_CONSISTENT state because some other output plugin can apply
> those changes unlike what we do with pgoutput plugin (which writes to
> file). And, I think applying the transactions without reaching a
> consistent state would be anyway wrong.  So, we should avoid that and
> if do that then we should have an Assert for streamed txns rather than
> sending abort for them in ReorderBufferForget.

I will work on this point.

> 9.
> +ReorderBufferHandleConcurrentAbort(ReorderBuffer *rb, ReorderBufferTXN *txn,
> {
> ..
> + ReorderBufferToastReset(rb, txn);
> + if (specinsert != NULL)
> + ReorderBufferReturnChange(rb, specinsert);
> ..
> }
>
> Why do we need to do these here when we wouldn't have been done for
> any exception other than ERRCODE_TRANSACTION_ROLLBACK?

Because we are handling the "ERRCODE_TRANSACTION_ROLLBACK" exception
gracefully and continuing with further decoding, we need to return
this change.

> 10.  I have got the below failure once.  I have not investigated this
> in detail as the patch is still under progress.  See, if you have any
> idea?
> #   Failed test 'check extra columns contain local defaults'
> #   at t/013_stream_subxact_ddl_abort.pl line 81.
> #  got: '2|0'
> # expected: '1000|500'
> # Looks like you failed 1 test of 2.
> make[2]: *** [check] Error 1
> make[1]: *** [check-subscription-recurse] Error 2
> make[1]: *** Waiting for unfinished jobs
> make: *** [check-world-src/test-recurse] Error 2

I also got the failure once, and after that it did not 

Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2020-07-05 Thread Dilip Kumar
On Tue, Jun 30, 2020 at 5:20 PM Amit Kapila  wrote:
>
> On Tue, Jun 30, 2020 at 10:13 AM Dilip Kumar  wrote:
> >
> > On Tue, Jun 30, 2020 at 9:20 AM Amit Kapila  wrote:
> > >
> > > Can't we name the last parameter as 'commit_lsn' as that is how
> > > documentation in the patch spells it and it sounds more appropriate?
> >
> > You are right commit_lsn seems more appropriate here.
> >
> > > Also, is there a reason for assigning report_location and
> > > write_location differently than what we do in commit_cb_wrapper?
> > > Basically, assign those as txn->final_lsn and txn->end_lsn
> > > respectively.
> >
> > Yes, I think it should be handled in same way as commit_cb_wrapper.
> > Because before calling ReorderBufferStreamCommit in
> > ReorderBufferCommit, we are properly updating the final_lsn as well as
> > the end_lsn.
> >
>
> Okay, I have made these changes in the attached patch and there are
> few more changes in
> 0003-Extend-the-output-plugin-API-with-stream-methods.
> 1. In pg_decode_stream_message, for transactional messages, we were
> displaying message contents which is different from other streaming
> APIs.  I have changed it so that streaming API doesn't display message
> contents for transactional messages.

Ok, makes sense.

> 2.
> + /* in streaming mode, stream_change_cb is required */
> + if (ctx->callbacks.stream_change_cb == NULL)
> + ereport(ERROR,
> + (errmsg("Output plugin supports streaming, but has not registered "
> + "stream_change_cb callback.")));
>
> The error messages seem a bit weird.  (a) doesn't include error code,
> (b) not in PG style. I have changed all the error messages to fix
> these two issues and change the message as well

ok

> 3. Rearranged the functions stream_* so that the optional functions
> are at the end and also arranged other functions in a way that looks
> more logical to me.

Makes sense to me.

> 4. Updated comments, commit message, and edited docs in the patch.
>
> I have made a few changes in
> 0004-Gracefully-handle-concurrent-aborts-of-transacti as well.
> 1. The variable bsysscan was not being reset in case of error.  I have
> introduced a new function to reset both bsysscan and CheckXidAlive
> during transaction abort.  Also, snapmgr.c doesn't seem right place
> for these variables, so I moved them to xact.c.  I think this will
> make the initialization of CheckXidAlive during catch in
> ReorderBufferProcessTXN redundant.

That looks better.

> 2. Updated comments and commit message.
>
> Let me know what you think about the above changes.

All the above changes look good to me and I will include them in the next version.

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: suggest to rename enable_incrementalsort

2020-07-05 Thread Peter Eisentraut

On 2020-07-02 17:25, James Coleman wrote:

I think the change makes a lot of sense. The only reason I had it as
enable_incrementalsort in the first place was trying to broadly
follow the existing GUC names, but as has already been pointed out,
there's a lot of variation there, and my version of the patch already
changed it to be more readable (at one point it was
enable_incsort...which is short...but does not have an obvious
meaning).

I've attached a patch to make the change,


committed

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services