Re: pgbench - allow to create partitioned tables

2019-09-09 Thread Amit Kapila
On Mon, Aug 26, 2019 at 11:04 PM Fabien COELHO  wrote:
>
>
> > Just one suggestion: the --partition-method option should be made dependent
> > on --partitions, because it has no use unless used with --partitions.
> > What do you think?
>

Some comments:
*
+ case 11: /* partitions */
+ initialization_option_set = true;
+ partitions = atoi(optarg);
+ if (partitions < 0)
+ {
+ fprintf(stderr, "invalid number of partitions: \"%s\"\n",
+ optarg);
+ exit(1);
+ }
+ break;

Is there a reason why we treat "partitions = 0" as a valid value?
Also, shouldn't we keep some max limit for this parameter as well?
For example, how realistic will it be if the user gives a value of
partitions equal to or greater than the number of rows in the
pgbench_accounts table?  I understand it is not sensible to give such
a value, but I guess the API should behave sanely in such cases as
well.  I am not sure what a good max value for it would be, but I
think there should be one.  Does anyone else have better suggestions
for this?
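
To illustrate, the check could be tightened along these lines (purely a
sketch; MAX_PARTITIONS is an invented name and value, and whether a maximum
makes sense is exactly the question above):

/* Sketch only: stricter validation of --partitions (hypothetical limit) */
#define MAX_PARTITIONS 100000

partitions = atoi(optarg);
if (partitions < 1 || partitions > MAX_PARTITIONS)
{
    fprintf(stderr,
            "invalid number of partitions: \"%s\" (must be between 1 and %d)\n",
            optarg, MAX_PARTITIONS);
    exit(1);
}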

*
@@ -3625,6 +3644,7 @@ initCreateTables(PGconn *con)
  const char *bigcols; /* column decls if accountIDs are 64 bits */
  int declare_fillfactor;
  };
+
  static const struct ddlinfo DDLs[] = {

Spurious line change.

*
+"  --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"
+"  --partition-method=(range|hash)\n"
+"   partition account table with this
method (default: range)\n"

Refer to the complete table name, pgbench_accounts, instead of just
"account".  That will be clearer and in sync with what we display in some
other options like --skip-some-updates.

*
+"  --partitions=NUM partition account table in NUM parts
(defaults: 0)\n"

/defaults/default.

*
I think we should print the information about partitions in
printResults.  It can help users while analyzing results.
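
For example, something along these lines could be added there (hypothetical
wording and placement, just to illustrate the suggestion):

/* Sketch only: report the partitioning setup in printResults() */
if (partitions > 0)
    printf("partition method: %s\npartitions: %d\n",
           partition_method == PART_RANGE ? "range" : "hash",
           partitions);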

*
+enum { PART_NONE, PART_RANGE, PART_HASH }
+ partition_method = PART_NONE;
+

I think it is better to follow the style of the QueryMode enum by using a
typedef here; that will make the code look in sync with nearby code.
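
That is, something like the following (a sketch following the QueryMode
style; the typedef name is made up here):

/* Sketch only: declare the partition method the way QueryMode is declared */
typedef enum
{
    PART_NONE,      /* no partitioning (default) */
    PART_RANGE,     /* PARTITION BY RANGE */
    PART_HASH       /* PARTITION BY HASH */
} partition_method_t;

static partition_method_t partition_method = PART_NONE;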

*
- int i;

  fprintf(stderr, "creating tables...\n");

- for (i = 0; i < lengthof(DDLs); i++)
+ for (int i = 0; i < lengthof(DDLs); i++)

This is an unnecessary change as far as this patch is concerned.  I
understand there is no problem in writing it either way, but let's not
change the coding pattern here as part of this patch.

*
+ if (partitions >= 1)
+ {
+ int64 part_size = (naccounts * (int64) scale + partitions - 1) / partitions;
+ char ff[64];
+ ff[0] = '\0';
+ append_fillfactor(ff, sizeof(ff));
+
+ fprintf(stderr, "creating %d partitions...\n", partitions);
+
+ for (int p = 1; p <= partitions; p++)
+ {
+ char query[256];
+
+ if (partition_method == PART_RANGE)
+ {

part_size can be defined inside "if (partition_method == PART_RANGE)"
as it is only used there.  In general, this part of the code could use
some comments.
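
For instance, the loop could be arranged roughly like this (only a sketch of
the suggestion, with the MINVALUE/MAXVALUE handling of the first and last
partitions left out):

for (int p = 1; p <= partitions; p++)
{
    char        query[256];

    if (partition_method == PART_RANGE)
    {
        /*
         * part_size is only needed here: each range partition covers an
         * equal slice of the aid key space.
         */
        int64       part_size =
            (naccounts * (int64) scale + partitions - 1) / partitions;

        snprintf(query, sizeof(query),
                 "create table pgbench_accounts_%d partition of pgbench_accounts "
                 "for values from (" INT64_FORMAT ") to (" INT64_FORMAT ")%s",
                 p, (p - 1) * part_size + 1, p * part_size + 1, ff);
    }
    else
    {
        /* hash partitioning only needs the modulus and the remainder */
        snprintf(query, sizeof(query),
                 "create table pgbench_accounts_%d partition of pgbench_accounts "
                 "for values with (modulus %d, remainder %d)%s",
                 p, partitions, p - 1, ff);
    }

    executeStatement(con, query);
}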


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Change atoi to strtol in same place

2019-09-09 Thread Michael Paquier
On Mon, Sep 09, 2019 at 11:02:49PM -0500, Joe Nelson wrote:
> Alvaro Herrera from 2ndQuadrant wrote:
> > Okay, so who is submitting a new version here?  Surafel, Joe?
> 
> I've attached a revision that uses pg_strtoint64 from str2int-10.patch
> (posted in ). My patch is
> based off that one and c5bc7050af.
> 
> It covers the same front-end utilities as atoi-to-strtol-v2.patch. Other
> utilities in the pg codebase still have atoi inside, but I thought I'd
> share my progress to see if people like the direction. If so, I can
> update the rest of the utils.
> 
> I added a helper function to a new file in src/fe_utils, but had to
> modify Makefiles in ways that might not be too clean. Maybe there's a
> better place for the function.

Using a wrapper in src/fe_utils makes sense.  I would have used
options.c for the new file, or would that be too generic?

The code indentation is weird, particularly the switch/case in
pg_strtoint64_range().

The error handling is awkward.  I think that you should just call
pg_log_error in pg_strtoint64_range instead of returning an error
string as you do.  You could do that by passing down the option name
to the routine and generating a new set of error messages using that.
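
That is, something with a shape like this (a sketch only, reusing the
pg_strtoint_status / PG_STRTOINT_OK API from the str2int patchset discussed
here; the exact messages and signature are open for debate):

#include "postgres_fe.h"
#include "common/logging.h"
#include "common/string.h"

/* Sketch: let the helper log the error itself, tagged with the option name */
bool
pg_strtoint64_range(const char *str, int64 *result,
                    int64 min, int64 max, const char *optname)
{
    if (pg_strtoint64(str, result) != PG_STRTOINT_OK)
    {
        pg_log_error("invalid integer value \"%s\" for option %s", str, optname);
        return false;
    }
    if (*result < min || *result > max)
    {
        pg_log_error("value of option %s must be between %lld and %lld",
                     optname, (long long) min, (long long) max);
        return false;
    }
    return true;
}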

Why do you need to update common/Makefile?

The builds on Windows are broken.  You are adding one file into
fe_utils, but forgot to update @pgfeutilsfiles in Mkvcbuild.pm. 
--
Michael


signature.asc
Description: PGP signature


Re: PostgreSQL 12 Beta 4

2019-09-09 Thread Michael Paquier
On Tue, Sep 10, 2019 at 09:34:34AM +0530, Sandeep Thakkar wrote:
> What's the date for PostgreSQL 12 GA?

This is not decided yet.
--
Michael


signature.asc
Description: PGP signature


Re: PostgreSQL 12 Beta 4

2019-09-09 Thread Sandeep Thakkar
Hi

What's the date for PostgreSQL 12 GA?

On Fri, Sep 6, 2019 at 1:57 AM Jonathan S. Katz 
wrote:

> Hi,
>
> PostgreSQL 12 Beta 4 will be released on 2019-09-12. Please make sure
> that fixes for bugs and other open items[1] are committed by the end of
> the weekend.
>
> Thanks for all of your efforts in getting PostgreSQL 12 ready for
> general availability!
>
> Jonathan
>
> [1] https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items
>
>

-- 
Sandeep Thakkar


Re: Change atoi to strtol in same place

2019-09-09 Thread Joe Nelson
Alvaro Herrera from 2ndQuadrant wrote:
> Okay, so who is submitting a new version here?  Surafel, Joe?

I've attached a revision that uses pg_strtoint64 from str2int-10.patch
(posted in ). My patch is
based off that one and c5bc7050af.

It covers the same front-end utilities as atoi-to-strtol-v2.patch. Other
utilities in the pg codebase still have atoi inside, but I thought I'd
share my progress to see if people like the direction. If so, I can
update the rest of the utils.

I added a helper function to a new file in src/fe_utils, but had to
modify Makefiles in ways that might not be too clean. Maybe there's a
better place for the function.

-- 
Joe Nelson  https://begriffs.com
diff --git a/contrib/pg_standby/Makefile b/contrib/pg_standby/Makefile
index 0bca2f8e9e..cb9292d0f4 100644
--- a/contrib/pg_standby/Makefile
+++ b/contrib/pg_standby/Makefile
@@ -6,6 +6,8 @@ PGAPPICON = win32
 PROGRAM = pg_standby
 OBJS	= pg_standby.o $(WIN32RES)
 
+PG_LDFLAGS += -L$(top_builddir)/src/fe_utils -lpgfeutils
+
 ifdef USE_PGXS
 PG_CONFIG = pg_config
 PGXS := $(shell $(PG_CONFIG) --pgxs)
diff --git a/contrib/pg_standby/pg_standby.c b/contrib/pg_standby/pg_standby.c
index 031b1b5cd5..ce2735f3ae 100644
--- a/contrib/pg_standby/pg_standby.c
+++ b/contrib/pg_standby/pg_standby.c
@@ -33,6 +33,7 @@
 #include "pg_getopt.h"
 
 #include "access/xlog_internal.h"
+#include "fe_utils/arg_utils.h"
 
 const char *progname;
 
@@ -678,6 +679,10 @@ main(int argc, char **argv)
 
 	while ((c = getopt(argc, argv, "cdk:lr:s:t:w:")) != -1)
 	{
+		pg_strtoint_status s;
+		int64		parsed;
+		char	   *parse_error;
+
 		switch (c)
 		{
 			case 'c':			/* Use copy */
@@ -687,12 +692,15 @@ main(int argc, char **argv)
 debug = true;
 break;
 			case 'k':			/* keepfiles */
-keepfiles = atoi(optarg);
-if (keepfiles < 0)
+s = pg_strtoint64_range(optarg, &parsed,
+		0, INT_MAX, &parse_error);
+if (s != PG_STRTOINT_OK)
 {
-	fprintf(stderr, "%s: -k keepfiles must be >= 0\n", progname);
+	fprintf(stderr, "%s: -k keepfiles %s\n",
+			progname, parse_error);
 	exit(2);
 }
+keepfiles = parsed;
 break;
 			case 'l':			/* Use link */
 
@@ -706,31 +714,39 @@ main(int argc, char **argv)
 #endif
 break;
 			case 'r':			/* Retries */
-maxretries = atoi(optarg);
-if (maxretries < 0)
+s = pg_strtoint64_range(optarg, &parsed,
+		0, INT_MAX, &parse_error);
+if (s != PG_STRTOINT_OK)
 {
-	fprintf(stderr, "%s: -r maxretries must be >= 0\n", progname);
+	fprintf(stderr, "%s: -r maxretries %s\n",
+			progname, parse_error);
 	exit(2);
 }
+maxretries = parsed;
 break;
 			case 's':			/* Sleep time */
-sleeptime = atoi(optarg);
-if (sleeptime <= 0 || sleeptime > 60)
+s = pg_strtoint64_range(optarg, &parsed, 1, 60, &parse_error);
+if (s != PG_STRTOINT_OK)
 {
-	fprintf(stderr, "%s: -s sleeptime incorrectly set\n", progname);
+	fprintf(stderr, "%s: -s sleeptime %s\n",
+			progname, parse_error);
 	exit(2);
 }
+sleeptime = parsed;
 break;
 			case 't':			/* Trigger file */
 triggerPath = pg_strdup(optarg);
 break;
 			case 'w':			/* Max wait time */
-maxwaittime = atoi(optarg);
-if (maxwaittime < 0)
+s = pg_strtoint64_range(optarg, &parsed,
+		0, INT_MAX, &parse_error);
+if (s != PG_STRTOINT_OK)
 {
-	fprintf(stderr, "%s: -w maxwaittime incorrectly set\n", progname);
+	fprintf(stderr, "%s: -w maxwaittime %s\n",
+			progname, parse_error);
 	exit(2);
 }
+maxwaittime = parsed;
 break;
 			default:
 fprintf(stderr, "Try \"%s --help\" for more information.\n", progname);
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 7986872f10..45ec25c6e0 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -31,6 +31,7 @@
 #include "common/file_utils.h"
 #include "common/logging.h"
 #include "common/string.h"
+#include "fe_utils/arg_utils.h"
 #include "fe_utils/string_utils.h"
 #include "getopt_long.h"
 #include "libpq-fe.h"
@@ -2229,6 +2230,10 @@ main(int argc, char **argv)
 	while ((c = getopt_long(argc, argv, "CD:F:r:RS:T:X:l:nNzZ:d:c:h:p:U:s:wWkvP",
			long_options, &option_index)) != -1)
 	{
+		pg_strtoint_status s;
+		int64		parsed;
+		char	   *parse_error;
+
 		switch (c)
 		{
 			case 'C':
@@ -2313,12 +2318,13 @@ main(int argc, char **argv)
 #endif
 break;
 			case 'Z':
-compresslevel = atoi(optarg);
-if (compresslevel < 0 || compresslevel > 9)
+s = pg_strtoint64_range(optarg, &parsed, 0, 9, &parse_error);
+if (s != PG_STRTOINT_OK)
 {
-	pg_log_error("invalid compression level \"%s\"", optarg);
+	pg_log_error("invalid compression level: %s", parse_error);
 	exit(1);
 }
+compresslevel = parsed;
 break;
 			case 'c':
 if (pg_strcasecmp(optarg, "fast") == 0)
@@ -2351,12 +2357,14 @@ main(int argc, char **argv)
 dbgetpassword = 1;
 break;
 			case 's':
-

Re: refactoring - share str2*int64 functions

2019-09-09 Thread Michael Paquier
On Mon, Sep 09, 2019 at 05:27:04AM -0700, Andres Freund wrote:
> On 2019-09-09 20:57:46 +0900, Michael Paquier wrote:
> But ISTM all of them ought to just use the C types, rather than the SQL
> types however. Since in the above proposal the caller determines the
> type names, if you want a different type - like the SQL input routines -
> can just invoke pg_strtoint_error() themselves (or just have it open
> coded).

Yep, that was my line of thoughts.

>> And for errors which should never happen we could just use
>> elog().  For the input functions of int2/4/8 we still need the
>> existing errors of course.
> 
> Right, there it makes sense to continue to refer the SQL level types.

Actually, I found your suggestion of using a noreturn function for the
error reporting to be a very clean alternative.  I didn't know, though,
that gcc is not able to detect that a function does not return if you
don't have a default in the switch for all the status codes, and that
this holds even if all the values of the enum for the switch are listed.
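
For anyone curious, the effect can be reproduced with a toy example like this
(not from the patch, just an illustration of the gcc behavior described above):

/*
 * gcc does not treat a switch over an enum as exhaustive for -Wreturn-type
 * purposes: without the default branch below it can warn that control
 * reaches the end of a non-void function, even though every enum value is
 * listed and the error function never returns.
 */
#include <stdlib.h>

typedef enum
{
    STR_OK,
    STR_RANGE_ERROR,
    STR_SYNTAX_ERROR
} str_status;

static void str_error(str_status s) __attribute__((noreturn));

static void
str_error(str_status s)
{
    (void) s;
    exit(1);
}

int
value_or_die(str_status s, int value)
{
    switch (s)
    {
        case STR_OK:
            return value;
        case STR_RANGE_ERROR:
        case STR_SYNTAX_ERROR:
            str_error(s);
        default:            /* without this, gcc may warn that control
                             * reaches the end of a non-void function */
            str_error(s);
    }
}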

> I'm saying that we shouldn't need the whole logic of trying to parse the
> string as an int, and then fail to float if it's not that. But that it's
> not this patchset's task to fix this.

Ah, sure.  Agreed.

>> Not sure about that.  I would keep the scope of the patch simple as of
>> now, where we make sure that we have the right interface for
>> everything.  There are a couple of extra improvements which could be
>> done afterwards, and if we move everything in the same place that
>> should be easier to move on with more improvements.  Hopefully.
> 
> The only reason for thinking about it now is that we'd then avoid
> changing the API twice.

What I think we would be looking for here is an extra argument for the
low-level routines to control the behavior of the function in an
extensible way, say a bits16 for a set of flags, with one flag to
ignore checks for trailing and leading whitespace.  This feels a bit
over-engineered though for this purpose.
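
Concretely, the idea would be something like this (a sketch, not part of the
attached patch; the flag and function names are made up):

/* Sketch only: an extensible flags argument for the low-level routines */
#define PG_STRTOINT_SKIP_WHITESPACE   0x0001    /* allow leading/trailing spaces */

extern pg_strtoint_status pg_strtoint64_ext(const char *str, int64 *result,
                                            bits16 flags);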

Attached is an updated patch.  How does it look?  I have left the
parts of readfuncs.c alone for now, as there are more issues behind that
than doing a single switch: short reads are one, long reads a second.  And
the patch already does a lot.  There could also be an argument for
having extra _check wrappers for the unsigned portions, but these would
be mostly unused in the backend code, so I have left that out on
purpose.

After all that stuff, there are still some issues which need more
care, in short:
- the T_Float conversion.
- removal of strtoint()
- the part for readfuncs.c
--
Michael

diff --git a/contrib/pg_stat_statements/pg_stat_statements.c b/contrib/pg_stat_statements/pg_stat_statements.c
index 221b47298c..1ed03aae02 100644
--- a/contrib/pg_stat_statements/pg_stat_statements.c
+++ b/contrib/pg_stat_statements/pg_stat_statements.c
@@ -62,6 +62,7 @@
 #include 
 
 #include "catalog/pg_authid.h"
+#include "common/string.h"
 #include "executor/instrument.h"
 #include "funcapi.h"
 #include "mb/pg_wchar.h"
@@ -1022,7 +1023,7 @@ pgss_ProcessUtility(PlannedStmt *pstmt, const char *queryString,
 		/* parse command tag to retrieve the number of affected rows. */
 		if (completionTag &&
 			strncmp(completionTag, "COPY ", 5) == 0)
-			rows = pg_strtouint64(completionTag + 5, NULL, 10);
+			(void) pg_strtouint64(completionTag + 5, );
 		else
 			rows = 0;
 
diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490f85..b063f1e122 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -306,7 +306,7 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: too short %d (< 5) list of arguments", nargs);
 
-	nrefs = pg_strtoint32(args[0]);
+	nrefs = pg_strtoint32_check(args[0]);
 	if (nrefs < 1)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: %d (< 1) number of references specified", nrefs);
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 2c0ae395ba..8e75d52b06 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -21,6 +21,7 @@
 #include "catalog/heap.h"
 #include "catalog/pg_type.h"
 #include "commands/trigger.h"
+#include "common/string.h"
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
@@ -2338,8 +2339,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 	CreateTableAsStmt *ctastmt = (CreateTableAsStmt *) stmt->utilityStmt;
 
 	if (strncmp(completionTag, "SELECT ", 7) == 0)
-		_SPI_current->processed =
-			pg_strtouint64(completionTag + 7, NULL, 10);
+		(void) pg_strtouint64(completionTag + 7, &_SPI_current->processed);
 	else
 	{
 		/*
@@ -2361,8 +2361,7 @@ _SPI_execute_plan(SPIPlanPtr plan, ParamListInfo paramLI,
 else if (IsA(stmt->utilityStmt, CopyStmt))
 {
 	Assert(strncmp(completionTag, "COPY ", 5) == 0);
-	_SPI_current->processed = pg_strtouint64(completionTag + 5,
-			 NULL, 10);
+	(void) 

Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-09 Thread Masahiko Sawada
On Tue, Sep 10, 2019 at 10:21 AM Amit Kapila  wrote:
>
> On Mon, Sep 9, 2019 at 6:22 PM Alvaro Herrera from 2ndQuadrant
>  wrote:
> >
> > On 2019-Sep-08, Amit Kapila wrote:
> >
> > > On Thu, Sep 5, 2019 at 2:17 PM Masahiko Sawada  
> > > wrote:
> > > >
> > > > Thanks. I hope the attached new patch fixes this issue.
> > > *
> > > +-- decode infomask flags as human readable flag names
> > > +CREATE FUNCTION heap_infomask_flags(
> > > +   infomask integer,
> > > +   infomask2 integer,
> > > +   decode_combined boolean DEFAULT false)
> > > +RETURNS text[]
> > > +AS 'MODULE_PATHNAME', 'heap_infomask_flags'
> > >
> > > Isn't it better to name this function as tuple_infomask_flags or
> > > something like that?  The other functions that start their name with
> > > heap take page as input.  Also, I think the index-related functions
> > > that start with index name take blk/page as input.
> >
> > I think that other table AMs are not necessarily going to use the same
> > infomask flags, so I think we should keep a name that is somehow
> > heapam-specific.  Maybe "heapam_infomask_flags" would be okay?
> >
>
> It will look a bit strange to use heapam as a prefix for this function
> when all others use heap.  I guess if we want to keep it AM-specific,
> then the proposed name (heap_infomask_flags) is better, or
> alternatively we can consider heap_tuple_infomask_flags?

+1 for heap_tuple_infomask_flags. And do we need to change
tuple_data_split to heap_tuple_data_split as well because it's also a
heap specific function?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-09 Thread Amit Kapila
On Tue, Sep 10, 2019 at 8:03 AM Masahiko Sawada  wrote:
>
> On Tue, Sep 10, 2019 at 10:21 AM Amit Kapila  wrote:
> >
> > On Mon, Sep 9, 2019 at 6:22 PM Alvaro Herrera from 2ndQuadrant
> >  wrote:
> > > I think that other table AMs are not necessarily going to use the same
> > > infomask flags, so I think we should keep a name that is somehow
> > > heapam-specific.  Maybe "heapam_infomask_flags" would be okay?
> > >
> >
> > It will look a bit strange to use heapam as a prefix for this function
> > when all others use heap.  I guess if we want to keep it AM-specific,
> > then the proposed name (heap_infomask_flags) is better, or
> > alternatively we can consider heap_tuple_infomask_flags?
>
> +1 for heap_tuple_infomask_flags. And do we need to change
> tuple_data_split to heap_tuple_data_split as well because it's also a
> heap specific function?
>

Good thought, but I think even if we want to change the name of
tuple_data_split, it might be better done separately.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-09 Thread Michael Paquier
On Tue, Sep 10, 2019 at 07:51:08AM +0530, Amit Kapila wrote:
> It will look a bit strange to use heapam as a prefix for this function
> when all others use heap.  I guess if we want to keep it AM-specific,
> then the proposed name (heap_infomask_flags) is better, or
> alternatively we can consider heap_tuple_infomask_flags?

Using "heap_" as prefix of the function looks like the best choice to
me and that's more consistent with the other functions we have
already.  Using "tuple" looks sensible as well so the last name you are
proposing sounds like a good alternative.

> At the beginning of the pageinspect documentation page, we have the line
> "All of these functions may be used only by superusers.".  We need to
> change that and then maybe give some explanation of why this
> particular function will be allowed to non-superusers.  BTW, do you
> have any use case in mind for that?  We need superuser privileges to get
> the page contents anyway, and I think this function can't be used
> independently.

I would still keep it as superuser-restricted, to avoid any risks with
people playing with the internals of this function.  pageinspect is
sensitive enough.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-09 Thread Amit Kapila
On Mon, Sep 9, 2019 at 6:22 PM Alvaro Herrera from 2ndQuadrant
 wrote:
>
> On 2019-Sep-08, Amit Kapila wrote:
>
> > On Thu, Sep 5, 2019 at 2:17 PM Masahiko Sawada  
> > wrote:
> > >
> > > Thanks. I hope the attached new patch fixes this issue.
> > *
> > +-- decode infomask flags as human readable flag names
> > +CREATE FUNCTION heap_infomask_flags(
> > +   infomask integer,
> > +   infomask2 integer,
> > +   decode_combined boolean DEFAULT false)
> > +RETURNS text[]
> > +AS 'MODULE_PATHNAME', 'heap_infomask_flags'
> >
> > Isn't it better to name this function as tuple_infomask_flags or
> > something like that?  The other functions that start their name with
> > heap take page as input.  Also, I think the index-related functions
> > that start with index name take blk/page as input.
>
> I think that other table AMs are not necessarily going to use the same
> infomask flags, so I think we should keep a name that is somehow
> heapam-specific.  Maybe "heapam_infomask_flags" would be okay?
>

It will look a bit strange to use heapam as a prefix for this function
when all others use heap.  I guess if we want to keep it AM-specific,
then the proposed name (heap_infomask_flags) is better, or
alternatively we can consider heap_tuple_infomask_flags?

>
> > Some more comments:
> > *
> > +SELECT t_infomask, t_infomask2, flags
> > +FROM heap_page_items
> > (get_raw_page('test1', 0)),
> > + LATERAL heap_infomask_flags(t_infomask,
> > t_infomask2, true) m(flags);
> >
> > Do we really need a LATERAL clause in the above kind of queries?
> > AFAIU, the functions can reference the columns that exist in the FROM
> > list, but I might be missing some point.
>
> I think the spec allows the LATERAL keyword to be implicit in the case
> of functions, so this seems a matter of style.  Putting LATERAL
> explicitly seems slightly clearer, to me.
>

No problem.

> > +Datum
> > +heap_infomask_flags(PG_FUNCTION_ARGS)
> > +{
> > + uint16 t_infomask = PG_GETARG_INT16(0);
> > + uint16 t_infomask2 = PG_GETARG_INT16(1);
> > + bool decode_combined = PG_GETARG_BOOL(2);
>
> > All the functions in this file are allowed only for superusers and
> > there is an explanation for the same as mentioned in the file header
> > comments.  Is there a reason for this function to be different?
>
> The other functions can crash if fed arbitrary input.  I don't think
> this one can crash, so it seems okay for it not to be superuser-only.
>

At the beginning of the pageinspect documentation page, we have the line
"All of these functions may be used only by superusers.".  We need to
change that and then maybe give some explanation of why this
particular function will be allowed to non-superusers.  BTW, do you
have any use case in mind for that?  We need superuser privileges to get
the page contents anyway, and I think this function can't be used
independently.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: refactoring - share str2*int64 functions

2019-09-09 Thread Michael Paquier
On Mon, Sep 09, 2019 at 03:17:38AM -0700, Andres Freund wrote:
> On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
>> @@ -80,7 +81,7 @@
>>  #define READ_UINT64_FIELD(fldname) \
>>  token = pg_strtok(&length); /* skip :fldname */ \
>>  token = pg_strtok(&length); /* get field value */ \
>> -local_node->fldname = pg_strtouint64(token, NULL, 10)
>> +(void) pg_strtouint64(token, &local_node->fldname)
> 
> Seems like these actually could just ought to use the error-checked
> variants. And I think it ought to change all of
> READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one
> of them to the new routines.

Okay for these changes, except for READ_INT_FIELD where we have short
variables using it as well (for example StrategyNumber) so this
generates a justified warning.  I think that a correct solution
here would be to add a new READ_SHORT_FIELD which uses pg_strtoint16.
I am not adding that for now.
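
For reference, such a macro would presumably just mirror the existing ones,
roughly like this (a sketch, not committed code):

/* Sketch: a short-sized counterpart of READ_INT_FIELD using pg_strtoint16() */
#define READ_SHORT_FIELD(fldname) \
	token = pg_strtok(&length);		/* skip :fldname */ \
	token = pg_strtok(&length);		/* get field value */ \
	local_node->fldname = pg_strtoint16(token)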
--
Michael


signature.asc
Description: PGP signature


Re: Should we add xid_current() or a int8->xid cast?

2019-09-09 Thread Tom Lane
Thomas Munro  writes:
> On Sun, Sep 1, 2019 at 5:04 PM Thomas Munro  wrote:
>> Adding to CF.

> Rebased.  An OID clashed so re-roll the dice.  Also spotted a typo.

FWIW, I'd move *all* the OIDs added by this patch up to >= 8000.
I don't feel a strong need to fill in the gaps in the low-numbered
OIDs, and people who do try that are likely to hit problems of the
sort you just did.

regards, tom lane




Re: Should we add xid_current() or a int8->xid cast?

2019-09-09 Thread Thomas Munro
On Sun, Sep 1, 2019 at 5:04 PM Thomas Munro  wrote:
> Adding to CF.

Rebased.  An OID clashed so re-roll the dice.  Also spotted a typo.

-- 
Thomas Munro
https://enterprisedb.com


0001-Add-SQL-type-xid8-to-expose-FullTransactionId-to--v2.patch
Description: Binary data


0002-Introduce-xid8-variants-of-the-txid_XXX-fmgr-func-v2.patch
Description: Binary data


Re: Consolidate 'unique array values' logic into a reusable function?

2019-09-09 Thread Thomas Munro
On Fri, Aug 30, 2019 at 3:34 PM Thomas Munro  wrote:
> Adding to CF.

Rebased due to bitrot.  Spotted one more place to use this, in
src/backend/utils/adt/txid.c.

-- 
Thomas Munro
https://enterprisedb.com


0001-Consolidate-code-that-makes-a-sorted-array-unique-v2.patch
Description: Binary data


Re: [HACKERS] CLUSTER command progress monitor

2019-09-09 Thread Peter Geoghegan
On Fri, Sep 6, 2019 at 5:11 AM Robert Haas  wrote:
> On Fri, Sep 6, 2019 at 1:44 AM Michael Paquier  wrote:
> > I don't see exactly why we could not switch to a fixed number of
> > slots, say 8, with one code path to start a progress which adds an
> > extra report on the stack, one to remove one entry from the stack, and
> > a new one to reset the whole thing for a backend.  This would not need
> > much restructuration of course.
>
> You could do that, but I don't think it's probably that great of an
> idea.  Now you've built something which is significantly more complex
> than the original design of this feature, but still not good enough to
> report on the progress of a query tree.  I tend to think we should
> confine ourselves to the progress reporting that can reasonably be
> done within the current infrastructure until somebody invents a really
> general mechanism that can handle, essentially, an EXPLAIN-on-the-fly
> of a current query tree.

+1. Let's not complicate the progress reporting infrastructure for an
uncertain benefit.

CLUSTER/VACUUM FULL is fundamentally an awkward utility command to
target with progress reporting infrastructure. I think that it's okay
to redefine how progress reporting works with CLUSTER now, in order to
fix the REINDEX/CLUSTER state clobbering bug.

-- 
Peter Geoghegan




Re: Shared Memory: How to use SYSV rather than MMAP ?

2019-09-09 Thread Thomas Munro
On Wed, Sep 4, 2019 at 10:30 AM Alvaro Herrera  wrote:
> On 2019-Feb-03, Thomas Munro wrote:
> > On Sat, Feb 2, 2019 at 12:49 AM Thomas Munro
> >  wrote:
> > > I am planning to commit the 0001 patch shortly, unless there are
> > > objections.  I attach a new version, which improves the documentation
> > > a bit (cross-referencing the new GUC and the section on sysctl
> > > settings).  That will give us shared_memory_type = sysv.
> >
> > Committed 0001.
>
> So can you please rebase what remains?

Here's a quick rebase.  It needs testing, review and (probably)
adjustment from AIX users.  I'm not going to be able to do anything
with it on my own due to lack of access, though I'm happy to help get
this committed eventually.  If we don't get any traction in this CF,
I'll withdraw this submission for now.  For consistency, I think we
should eventually also do the same thing for Linux when using sysv
(it's pretty similar, it just uses different flag names; it may also
be necessary to query the page size and round up the requested size,
on one or both of those OSes; I'm not sure).
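
For the record, the Linux side would presumably boil down to something like
this (an assumption, not the attached patch: SHM_HUGETLB is the Linux-specific
flag, and the request is rounded up to the huge page size first):

#include <stddef.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Sketch only: create a SysV segment backed by huge pages on Linux */
int
create_huge_sysv_segment(key_t key, size_t size, size_t hugepagesize)
{
    /* shmget() with SHM_HUGETLB wants a multiple of the huge page size */
    size_t  rounded = ((size + hugepagesize - 1) / hugepagesize) * hugepagesize;

    return shmget(key, rounded, IPC_CREAT | IPC_EXCL | SHM_HUGETLB | 0600);
}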

-- 
Thomas Munro
https://enterprisedb.com


0001-Add-huge-page-support-for-AIX.patch
Description: Binary data


Re: [PATCH] Incremental sort (was: PoC: Partial sort)

2019-09-09 Thread Tomas Vondra

On Wed, Sep 04, 2019 at 09:17:10PM +0200, Tomas Vondra wrote:

On Wed, Sep 04, 2019 at 11:37:48AM +0200, Rafia Sabih wrote:

On Tue, 30 Jul 2019 at 02:17, Tomas Vondra 
wrote:


On Sun, Jul 21, 2019 at 01:34:22PM +0200, Tomas Vondra wrote:


...

I wonder if we're approaching this wrong. Maybe we should not reverse
engineer queries for the various places, but just start with a set of
queries that we want to optimize, and then identify which places in the
planner need to be modified.



I've decided to do a couple of experiments, trying to make up my mind about
which modified places matter to different queries. But instead of trying
to reverse engineer the queries, I've taken a different approach - I've
compiled a list of queries that I think are sensible and relevant, and
then planned them with incremental sort enabled in different places.

I don't have any clear conclusions at this point - it does show that some of
the places don't change the plan for any of the queries, although there may
be some additional query where it'd make a difference.

But I'm posting this mostly because it might be useful. I've initially
planned to move changes that add incremental sort paths to separate
patches, and then apply/skip different subsets of those patches. But
then I realized there's a better way to do this - I've added a bunch of
GUCs, one for each such place. This allows doing this testing without
having to rebuild repeatedly.

I'm not going to post the patch(es) with extra GUCs here, because it'd
just confuse the patch tester, but it's available here:

 https://github.com/tvondra/postgres/tree/incremental-sort-20190730

There are 10 GUCs, one for each place in planner where incremental sort
paths are constructed. By default all those are set to 'false' so no
incremental sort paths are built. If you do

 SET devel_create_ordered_paths = on;

it'll start creating the paths in non-parallel in create_ordered_paths.
Then you may enable devel_create_ordered_paths_parallel to also consider
parallel paths, etc.

The list of queries (synthetic, but hopefully sufficiently realistic)
and a couple of scripts to collect the plans is in this repository:

 https://github.com/tvondra/incremental-sort-tests-2

There's also a spreadsheet with a summary of results, with a visual
representation of which GUCs affect which queries.


Wow, that sounds like an elaborate experiment. But where is this
spreadsheet you mentioned?



It seems I forgot to push the commit containing the spreadsheet with
results. I'll fix that tomorrow.



OK, I've pushed the commit with the spreadsheet. The single sheet lists
the synthetic queries, and hashes of plans with different flags enabled
(parallel query, force incremental sort, and the new developer GUCs
mentioned before). Only a single developer flag is set to true (or none
of them).

The columns at the end simply say whether the plan differs from the plan
generated by master (no patches). TRUE means "same as master" while
FALSE means "different plan".

The "patched" column means all developer GUCs disabled, so it's expected
to produce the same plan as master (and it is). And then there's one
column for each developer GUC. If the column is just TRUE it means the
GUC does not affect any of the synthetic queries. There are 4 of them:

- devel_add_paths_to_grouping_rel_parallel
- devel_create_partial_grouping_paths
- devel_gather_grouping_paths
- devel_standard_join_search

The places controlled by those GUCs are either useless, or the query
affected by them is not included in the list of queries.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Replication & recovery_min_apply_delay

2019-09-09 Thread Alexander Korotkov
On Wed, Sep 4, 2019 at 4:37 PM Konstantin Knizhnik
 wrote:
> receivedUpto is just a static variable in xlog.c, maintained by the WAL receiver.
> But as I mentioned above, the WAL receiver is not started at the moment when
> we need to know the LSN of the last record.
>
> Certainly it should be possible to somehow persist receivedUpto, so we
> do not need to scan WAL to determine the last LSN at the next start.
> But persisting the last LSN introduces a lot of questions and problems.
> For example, when does it need to be flushed to disk?  If it is done
> after each received transaction, then performance can suffer
> significantly.
> If it is done more or less asynchronously, then there is a risk that we
> request streaming from the wrong position.
> In any case it will significantly complicate the patch and make it more
> sensitive to various errors.

I think we don't necessarily need the exact value of receivedUpto.  But it
could be some place to start scanning WAL from.  We currently call
UpdateControlFile() in a lot of places.  In particular we call it each
checkpoint.  Even if we started scanning WAL from a value of
receivedUpto one checkpoint back, we could still save a lot of resources.

> I wonder what is wrong with determining the LSN of the last record by just
> scanning WAL?
> Certainly it is not the most efficient way, but I do not expect that
> somebody will have hundreds or thousands of megabytes of WAL.
> Michael, do you see some other problems with the GetLastLSN() function
> except the time of its execution?

As I get it, this patch fixes a problem with very large recovery apply
delay.  In this case, the amount of accumulated WAL corresponding to that
delay could also be huge.  Scanning all that WAL could be costly, and it
would be nice to avoid that.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [PATCH] ltree, lquery, and ltxtquery binary protocol support

2019-09-09 Thread Alexander Korotkov
Hi!

On Sun, Aug 18, 2019 at 6:53 PM Nino Floris  wrote:
> Attached is a patch to support send/recv on ltree, lquery and ltxtquery.
> I'm a contributor to the Npgsql .NET PostgreSQL driver and we'll be
> the first to have official support once those ltree changes have been
> released.
> You can find the driver support work here, the tests verify a
> roundtrip for each of the types is succesful.
> https://github.com/NinoFloris/npgsql/tree/label-tree-support

Thank you for the patch.

My notes are following:
 * Your patch doesn't follow the coding conventions.  In particular, it
uses spaces for indentation instead of tabs.  Moreover, it changes
tabs to spaces in places it doesn't touch.  As a result, the patch is
"dirty".  It looks like your IDE isn't properly set up.  Please check that your
patch doesn't contain unintended changes and follows the coding conventions.
It's also good practice to run pgindent over changed files before
sending a patch.
 * We currently don't add a new extension SQL script for a new extension
version (i.e. ltree--1.2.sql).  Instead, we provide just a migration
script (i.e. ltree--1.1--1.2.sql).  This simplifies testing of
extension migration, since plain extension creation implies migration.
 * What is the motivation for a binary format for lquery and ltxtquery?  One
could transfer large datasets of ltree values, but is that the case for lquery
and ltxtquery, which are used just for querying?
 * Just sending the binary representation of the datatype is not OK.  PostgreSQL
supports a lot of platforms with different byte ordering, alignment
and so on.  You basically need to send each particular field using
the pq_send*() functions.
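
For example, a send/recv pair is normally built field by field like this (a
generic sketch for a made-up two-field type, not the actual ltree layout):

#include "postgres.h"
#include "fmgr.h"
#include "libpq/pqformat.h"

/* hypothetical fixed-size type, for illustration only */
typedef struct demo_pair
{
    int32       count;
    float8      value;
} demo_pair;

PG_FUNCTION_INFO_V1(demo_pair_send);
PG_FUNCTION_INFO_V1(demo_pair_recv);

Datum
demo_pair_send(PG_FUNCTION_ARGS)
{
    demo_pair  *in = (demo_pair *) PG_GETARG_POINTER(0);
    StringInfoData buf;

    pq_begintypsend(&buf);
    /* each field goes through pq_send*(), so byte order and padding never leak */
    pq_sendint32(&buf, in->count);
    pq_sendfloat8(&buf, in->value);
    PG_RETURN_BYTEA_P(pq_endtypsend(&buf));
}

Datum
demo_pair_recv(PG_FUNCTION_ARGS)
{
    StringInfo  buf = (StringInfo) PG_GETARG_POINTER(0);
    demo_pair  *out = (demo_pair *) palloc(sizeof(demo_pair));

    out->count = pq_getmsgint(buf, sizeof(int32));
    out->value = pq_getmsgfloat8(buf);
    PG_RETURN_POINTER(out);
}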

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Rethinking opclass member checks and dependency strength

2019-09-09 Thread Alexander Korotkov
On Sun, Aug 18, 2019 at 10:00 PM Tom Lane  wrote:
> Here's a preliminary patch along these lines.  It adds an AM callback
> that can adjust the dependency types before they're entered into
> pg_depend.  There's a lot of stuff that's open for debate and/or
> remains to be done:
>
> * Is the parameter list of amcheckmembers() sufficient, or should we
> pass more info (if so, what)?  In principle, the AM can always look
> up anything else it needs to know from the provided OIDs, but that
> would be cumbersome if many AMs need the same info.

Looks sufficient to me.  I can't yet think of anything else that would be useful.

> * Do we need any more flexibility in the set of ways that the pg_depend
> entries can be set up than what I've provided here?

Flexibility also looks sufficient to me.

> * Are the specific ways that the entries are getting set up appropriate?
> Note in particular that I left btree/hash alone, feeling that the default
> (historical) behavior was designed for them and is not unreasonable; but
> maybe we should switch them to the cross-type-vs-not-cross-type behavior
> proposed above.  Also I didn't touch BRIN because I don't know enough
> about it to be sure what would be best, and I didn't touch contrib/bloom
> because I don't care too much about it.

I think we need the ability to remove the GiST fetch proc.  The presence of
this procedure is used to determine whether a GiST index supports index-only
scans (IOS).  We need to be able to remove this proc to drop IOS
support.

> * I refactored things a little bit in opclasscmds.c, mostly by adding
> an is_func boolean to OpFamilyMember and getting rid of parameters
> equivalent to that.  This is based on the thought that AMs might prefer
> to process the structs based on such a flag rather than by keeping them
> in separate lists.  We could go further and merge the operator and
> function structs into one list, forcing the is_func flag to be used;
> but I'm not sure whether that'd be an improvement.

I'm also not sure about this.  Two lists look OK to me.

> * I'm not at all impressed with the name, location, or concept of
> opfam_internal.h.  I think we should get rid of that header and put
> the OpFamilyMember struct somewhere else.  Given that this patch
> makes it part of the AM API, it wouldn't be unreasonable to move it
> to amapi.h.  But I've not done that here.

+1

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [PATCH] kNN for btree

2019-09-09 Thread Alexander Korotkov
On Sun, Sep 8, 2019 at 11:35 PM Alexander Korotkov
 wrote:
> I'm going to push 0001 changing "attno >= 1" to assert.

Pushed.  A rebased patchset is attached.  I propose to limit
consideration during this commitfest to this set of 7 remaining
patches.  The rest of the patches could be considered later.  I made some
minor edits in preparation to commit.  But I have a couple
more notes for Nikita.

 * 0003 extracts some of the fields from the BTScanOpaqueData struct into a new
BTScanStateData struct.  However, there is a big comment regarding
BTScanOpaqueData just before the definition of BTScanPosItem.  It needs to
be revised.
 * 0004 adds a "knnState" field to BTScanOpaqueData in addition to
the "state" field.  However, "knnState" might be unused during a knn scan if
it can be done in one direction.  This seems counter-intuitive.  Could
we rework this to have "forwardState" and "backwardState" fields
instead of "state" and "knnState"?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


0002-Enable-ORDER-BY-operator-scans-on-ordered-indexe-v15.patch.gz
Description: GNU Zip compressed data


0001-Introduce-ammorderbyopfirstcol-v15.patch.gz
Description: GNU Zip compressed data


0003-Extract-structure-BTScanState-v15.patch.gz
Description: GNU Zip compressed data


0004-Add-kNN-support-to-btree-v15.patch.gz
Description: GNU Zip compressed data


0005-Add-btree-distance-operators-v15.patch.gz
Description: GNU Zip compressed data


0006-Remove-distance-operators-from-btree_gist-v15.patch.gz
Description: GNU Zip compressed data


0007-Add-regression-tests-for-kNN-btree-v15.patch.gz
Description: GNU Zip compressed data


Re: Bug in GiST paring heap comparator

2019-09-09 Thread Alexander Korotkov
On Mon, Sep 9, 2019 at 8:32 PM Nikita Glukhov  wrote:
> On 08.09.2019 22:32, Alexander Korotkov wrote:
>
> > On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov
> >  wrote:
> >> I'm going to push both if no objections.
> > So, pushed!
>
> Two years ago there was a similar patch for this issue:
> https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru
>
> Sorry that I looked at this thread too late.

I see, thanks.

Your patch seems a bit less cumbersome thanks to using the GISTDistance
struct.  You don't have to introduce a separate array with null flags.
But it's harder to keep index_store_float8_orderby_distances() used
by both GiST and SP-GiST.

What do you think?  You could rebase your patch and propose that as a refactoring.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: add a MAC check for TRUNCATE

2019-09-09 Thread Yuli Khodorkovskiy
On Fri, Sep 6, 2019 at 9:09 PM Joe Conway  wrote:
>
> On 9/6/19 8:07 PM, Tom Lane wrote:
> > Joe Conway  writes:
> >> On 9/6/19 2:18 PM, Tom Lane wrote:
> >>> sepgsql hasn't worked on RHEL6 in a long time, if ever; it requires
> >>> a newer version of libselinux than what ships in RHEL6.  So I'm not
> >>> concerned about that.  We do need to worry about RHEL7, and whatever
> >>> is the oldest version of Fedora that is running the sepgsql tests
> >>> in the buildfarm.
> >
> >> I could be wrong, but as far as I know rhinoceros is the only buildfarm
> >> animal running sepgsql tests.
> >
> > It seems reasonable to define RHEL7 as the oldest SELinux version we
> > still care about.  But it'd be a good idea for somebody to be running
> > a fairly bleeding-edge Fedora animal with sepgsql enabled, so we get
> > coverage of the other end of the scale.
>
>
> Yeah -- I was planning to eventually register a RHEL8 animal, but I
> should probably do one for Fedora as well. I'll bump the priority for
> that on my personal TODO.
>
> Joe
> --
> Crunchy Data - http://crunchydata.com
> PostgreSQL Support for Secure Enterprises
> Consulting, Training, & Open Source Development

Hello,

I have included an updated version of the sepgsql patch. The
Truncate-Hook patch is unchanged from the last version.

The sepgsql changes now check if the db_table:{ truncate } permission
exists in the loaded SELinux policy before running the truncate
regression test. If the permission does not exist, then the new
regression test will not run.

Testing the TRUNCATE regression test can be done by manually adding
the permission with CIL:

```
sudo semodule -cE base
sudo sed -i -E 's/(class db_table.*?) \)/\1 truncate\)/' base.cil
sudo semodule -i base.cil
```

Thanks,

Yuli


Truncate-Hook.patch
Description: Binary data


v2-Sepgsql-Truncate.patch
Description: Binary data


Re: Does PostgreSQL support debian Linux on Arm CPU Platform?

2019-09-09 Thread Andrew Dunstan


On 9/9/19 11:07 AM, gc...@sina.cn wrote:
> Hi, I just want to know: does PostgreSQL support Debian Linux on an ARM CPU
> platform?  Thank you!


See



cheers


andrew


-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: pg_regress restart cluster?

2019-09-09 Thread Tom Lane
Jeremy Finzel  writes:
> I am using the basic extension building infrastructure with sql and
> expected files, but what I want to test is changing a config setting and
> then restarting the cluster with shared_preload_libraries in place. Is
> there a canonical way to do this or does anyone have any examples of this?
> I appreciate it very much!

pg_regress doesn't have any support for that, but you can do it pretty
easily in the context of a TAP test.  There are a bunch of examples
in the existing TAP tests --- look around for uses of append_conf().

regards, tom lane




Sloppy port assignment in src/test/ldap/

2019-09-09 Thread Tom Lane
I wondered about this transient failure:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=crake&dt=2019-09-09%2008%3A48%3A25

The only information that was captured was

# Running: /usr/sbin/slapd -f 
/home/bf/bfr/root/HEAD/pgsql.build/src/test/ldap/tmp_check/slapd.conf -h 
ldap://localhost:57797 ldaps://localhost:57798
Bail out!  system /usr/sbin/slapd failed

So, slapd failed without writing anything to stderr *or* its configured
log file, which is pretty unhelpful.  But, casting about for a possible
explanation, I noticed the port setup code earlier in the script:

my $ldap_port = get_free_port();
my $ldaps_port = $ldap_port + 1;

Translated: we're finding one free port and just assuming that
the next one will be free (or even exists :-().

Sure enough, I can reproduce the failure seen on crake if I force
the $ldaps_port to be a port number that something has in use.
Presumably that transient failure occurred because something was
using 57798.

So this code needs to be

my $ldap_port = get_free_port();
my $ldaps_port = get_free_port();

I'll go fix that in a moment.

regards, tom lane




Re: Bug in GiST paring heap comparator

2019-09-09 Thread Nikita Glukhov

On 08.09.2019 22:32, Alexander Korotkov wrote:


On Fri, Sep 6, 2019 at 5:44 PM Alexander Korotkov
 wrote:

I'm going to push both if no objections.

So, pushed!


Two years ago there was a similar patch for this issue:
https://www.postgresql.org/message-id/1499c9d0-075a-3014-d2aa-ba59121b3728%40postgrespro.ru

Sorry that I looked at this thread too late.

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Libpq support to connect to standby server as priority

2019-09-09 Thread Alvaro Herrera from 2ndQuadrant
Question about 0001.  In the CONNECTION_SETENV state, you end by falling
through to the CONNECTION_CHECK_TARGET case; but in that switch it seems
a bit unnatural to do that.  I think doing "goto keep_trying" is another
way of doing the same thing, but more in line with what every other
piece of code does.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 30fbb4a913697fe35a195dd41ddd5bcfeec53c0e Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Fri, 6 Sep 2019 16:26:01 -0400
Subject: [PATCH] Restructure the code to remove duplicate code

The duplicate code logic of checking for the server version
before issuing the SHOW transaction_read_only to find out whether
the server is read-write or not is restructured under a new
connection status, so that duplicate code is removed. This is
required for the next set of patches

Author: Hari Babu Kommi
Discussion: https://postgr.es/m/cajrrpge_qgdbbn+ybgevpd+ylhxxjtruzk6rmtmhqrfig+3...@mail.gmail.com
---
 src/interfaces/libpq/fe-connect.c | 87 ---
 src/interfaces/libpq/libpq-fe.h   |  3 +-
 2 files changed, 34 insertions(+), 56 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index f03d43376c..91316709a5 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -3434,6 +3434,13 @@ keep_going:		/* We will come back to here until there is
 	return PGRES_POLLING_WRITING;
 }
 
+/* Almost there now ... */
+conn->status = CONNECTION_CHECK_TARGET;
+goto keep_going;
+			}
+
+		case CONNECTION_CHECK_TARGET:
+			{
 /*
  * If a read-write connection is required, see if we have one.
  *
@@ -3476,67 +3483,37 @@ keep_going:		/* We will come back to here until there is
 			}
 
 		case CONNECTION_SETENV:
-
-			/*
-			 * Do post-connection housekeeping (only needed in protocol 2.0).
-			 *
-			 * We pretend that the connection is OK for the duration of these
-			 * queries.
-			 */
-			conn->status = CONNECTION_OK;
-
-			switch (pqSetenvPoll(conn))
 			{
-case PGRES_POLLING_OK:	/* Success */
-	break;
-
-case PGRES_POLLING_READING: /* Still going */
-	conn->status = CONNECTION_SETENV;
-	return PGRES_POLLING_READING;
-
-case PGRES_POLLING_WRITING: /* Still going */
-	conn->status = CONNECTION_SETENV;
-	return PGRES_POLLING_WRITING;
-
-default:
-	goto error_return;
-			}
-
-			/*
-			 * If a read-write connection is required, see if we have one.
-			 * (This should match the stanza in the CONNECTION_AUTH_OK case
-			 * above.)
-			 *
-			 * Servers before 7.4 lack the transaction_read_only GUC, but by
-			 * the same token they don't have any read-only mode, so we may
-			 * just skip the test in that case.
-			 */
-			if (conn->sversion >= 70400 &&
-conn->target_session_attrs != NULL &&
-strcmp(conn->target_session_attrs, "read-write") == 0)
-			{
-if (!saveErrorMessage(conn, &savedMessage))
-	goto error_return;
-
+/*
+ * Do post-connection housekeeping (only needed in protocol 2.0).
+ *
+ * We pretend that the connection is OK for the duration of these
+ * queries.
+ */
 conn->status = CONNECTION_OK;
-if (!PQsendQuery(conn,
- "SHOW transaction_read_only"))
+
+switch (pqSetenvPoll(conn))
 {
-	restoreErrorMessage(conn, &savedMessage);
-	goto error_return;
+	case PGRES_POLLING_OK:	/* Success */
+		break;
+
+	case PGRES_POLLING_READING: /* Still going */
+		conn->status = CONNECTION_SETENV;
+		return PGRES_POLLING_READING;
+
+	case PGRES_POLLING_WRITING: /* Still going */
+		conn->status = CONNECTION_SETENV;
+		return PGRES_POLLING_WRITING;
+
+	default:
+		goto error_return;
 }
-conn->status = CONNECTION_CHECK_WRITABLE;
-restoreErrorMessage(conn, &savedMessage);
-return PGRES_POLLING_READING;
+
+/* Almost there now ... */
+conn->status = CONNECTION_CHECK_TARGET;
+goto keep_going;
 			}
 
-			/* We can release the address list now. */
-			release_conn_addrinfo(conn);
-
-			/* We are open for business! */
-			conn->status = CONNECTION_OK;
-			return PGRES_POLLING_OK;
-
 		case CONNECTION_CONSUME:
 			{
 conn->status = CONNECTION_OK;
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 22c4954f2b..5f65db30e4 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -67,7 +67,8 @@ typedef enum
  * connection. */
 	CONNECTION_CONSUME,			/* Wait for any pending message and consume
  * them. */
-	CONNECTION_GSS_STARTUP		/* Negotiating GSSAPI. */
+	CONNECTION_GSS_STARTUP,		/* Negotiating GSSAPI. */
+	CONNECTION_CHECK_TARGET		/* Check if we have a proper target connection */
 } ConnStatusType;
 
 typedef enum
-- 
2.17.1



Re: having issues with PG12 debian packaging repository

2019-09-09 Thread Magnus Hagander
On Mon, Sep 9, 2019 at 6:03 PM Murat Tuncer  wrote:

>
> On Sun, Sep 8, 2019, 21:43 Magnus Hagander  wrote:
>
>> On Wed, Sep 4, 2019 at 5:43 PM Murat Tuncer 
>> wrote:
>>
>>> Hello hackers
>>>
>>> I am getting sporadic errors when I tried to use PG12 bionic debian
>>> repository.
>>>
>>> Here is the error message that is result of apt-get update.
>>> ---
>>> Failed to fetch
>>> http://apt.postgresql.org/pub/repos/apt/dists/bionic-pgdg/main/binary-i386/Packages.gz
>>> File has unexpected size (260865 != 260866). Mirror sync in progress? [IP:
>>> 34.96.81.152 80]
>>> Hashes of expected file:
>>>  - Filesize:260866 [weak]
>>>  -
>>> SHA256:433bef097d8a54a9899350c182d0074c1a13f62c8e7e9987cc6c63cd11242abc
>>>  - SHA1:1be55e080a1dd277929f095690ae9b9cf01e971f [weak]
>>>  - MD5Sum:08189bf54aa297f53b9656bc3c529c62 [weak]
>>>  Release file created at: Mon, 02 Sep 2019 10:25:33 +
>>>  Failed to fetch
>>> http://apt.postgresql.org/pub/repos/apt/dists/bionic-pgdg/main/binary-amd64/Packages.gz
>>>  Some index files failed to download. They have been ignored, or old
>>> ones used instead.
>>> --
>>>
>>> It usually succeeds when I run it again.  Is there a way to avoid this?
>>>
>>>
>> This sounds very similar to an issue people ran into on travis, which I
>> believe was tracked down to travis putting a cache in between themselves
>> and apt.postgresql.org, which broke the order of downloads. Any chance
>> you also have a cache sitting there somewhere?
>>
>> //Magnus
>>
>
>
> Thanks.
>
> I should have added I was seeing this in travis. I thought it was a general
> issue.
>
> Is there a workaround for this in travis ?
>
>
I thought it was fixed by now. Travis has a thread at
https://travis-ci.community/t/sometimes-build-fails-when-apt-is-updating-postgresql-apt-repository/4872

//Magnus


Does PostgreSQL support debian Linux on Arm CPU Platform?

2019-09-09 Thread gc_11
Hi, I just want to know: does PostgreSQL support Debian Linux on an ARM CPU
platform?  Thank you!

pg_upgrade issues

2019-09-09 Thread Dave Cramer
pgjdbc has a bug report which is as follows:

The database has a table that has a description and a constraint.
The constraint also has a description.

Somehow the constraint and the table end up with the same OIDs after
pg_upgrade.

My understanding of pg_upgrade suggests that shouldn't happen? I realize
OIDs are not guaranteed to be unique, but this seems to be quite a
coincidence.


Dave Cramer


Re: having issues with PG12 debian packaging repository

2019-09-09 Thread Murat Tuncer
On Sun, Sep 8, 2019, 21:43 Magnus Hagander  wrote:

> On Wed, Sep 4, 2019 at 5:43 PM Murat Tuncer  wrote:
>
>> Hello hackers
>>
>> I am getting sporadic errors when I tried to use PG12 bionic debian
>> repository.
>>
>> Here is the error message that is result of apt-get update.
>> ---
>> Failed to fetch
>> http://apt.postgresql.org/pub/repos/apt/dists/bionic-pgdg/main/binary-i386/Packages.gz
>> File has unexpected size (260865 != 260866). Mirror sync in progress? [IP:
>> 34.96.81.152 80]
>> Hashes of expected file:
>>  - Filesize:260866 [weak]
>>  - SHA256:433bef097d8a54a9899350c182d0074c1a13f62c8e7e9987cc6c63cd11242abc
>>  - SHA1:1be55e080a1dd277929f095690ae9b9cf01e971f [weak]
>>  - MD5Sum:08189bf54aa297f53b9656bc3c529c62 [weak]
>>  Release file created at: Mon, 02 Sep 2019 10:25:33 +
>>  Failed to fetch
>> http://apt.postgresql.org/pub/repos/apt/dists/bionic-pgdg/main/binary-amd64/Packages.gz
>>  Some index files failed to download. They have been ignored, or old ones
>> used instead.
>> --
>>
>> It usually succeeds when I run it again.  Is there a way to avoid this?
>>
>>
> This sounds very similar to an issue people ran into on travis, which I
> believe was tracked down to travis putting a cache in between themselves
> and apt.postgresql.org, which broke the order of downloads. Any chance
> you also have a cache sitting there somewhere?
>
> //Magnus
>


Thanks.

I should have added I was seeing this in travis. I thought it was a general
issue.

Is there a workaround for this in travis ?

Murat

>


Re: using explicit_bzero

2019-09-09 Thread Andres Freund
Hi,

On 2019-09-05 08:38:36 +0200, Peter Eisentraut wrote:
> On 2019-09-05 04:12, Michael Paquier wrote:
> > On Wed, Sep 04, 2019 at 04:38:21PM -0400, Alvaro Herrera wrote:
> >> Marked RfC.  Can we get on with this?
> > 
> > FWIW, I have been able to test this one on Windows with MSVC and
> > things are handled correctly.
> 
> committed

I still think this change is done wrongly, by providing an
implementation for a library function implemented in various
projects. If you e.g. dynamically load a library that implements its own
version of bzero, ours will replace it in many cases.

I think all this implementation actually guarantees is that bzero2 is
read, but not that the copy is not elided. In practice that's *probably*
good enough, but a compiler could just check whether bzero_p points to
memset.

http://cseweb.ucsd.edu/~lerner/papers/dse-usenix2017.pdf
https://boringssl-review.googlesource.com/c/boringssl/+/1339/
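
For context, the fallback in question is built around the volatile function
pointer idiom, roughly like this (approximate shape only; see
src/port/explicit_bzero.c for the authoritative version):

#include <string.h>

static void
bzero2(void *buf, size_t len)
{
    memset(buf, 0, len);
}

/* the indirect call is meant to stop the compiler from proving the store dead */
static void (*volatile bzero_p) (void *, size_t) = bzero2;

void
explicit_bzero(void *buf, size_t len)
{
    bzero_p(buf, len);
}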

I think we have absolutely no business possibly intercepting / replacing
actually securely coded implementations of sensitive functions.

Greetings,

Andres Freund




Re: Built-in connection pooler

2019-09-09 Thread Konstantin Knizhnik



On 06.09.2019 19:41, Konstantin Knizhnik wrote:



On 06.09.2019 1:01, Jaime Casanova wrote:


Sadly i got a lot of FAILED tests, i'm attaching the diffs on
regression with installcheck and installcheck-parallel.
btw, after make installcheck-parallel i wanted to do a new test but
wasn't able to drop regression database because there is still a
subscription, so i tried to drop it and got a core file (i was
connected trough the pool_worker), i'm attaching the backtrace of the
crash too.



Sorry, I failed to reproduce the crash.
So if you are able to find some scenario to reproduce it, I
will be very pleased to receive it.


I was able to reproduce the crash.
A patch is attached. I also added proxying of the RESET command.
Unfortunately it is still not enough to pass the regression tests with
"proxying_gucs=on",
mostly because error messages don't match after prepending "set local"
commands.



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/spi/refint.c b/contrib/spi/refint.c
index adf0490..5c2095f 100644
--- a/contrib/spi/refint.c
+++ b/contrib/spi/refint.c
@@ -11,6 +11,7 @@
 
 #include "commands/trigger.h"
 #include "executor/spi.h"
+#include "storage/proc.h"
 #include "utils/builtins.h"
 #include "utils/rel.h"
 
@@ -93,6 +94,8 @@ check_primary_key(PG_FUNCTION_ARGS)
 	else
 		tuple = trigdata->tg_newtuple;
 
+	MyProc->is_tainted = true;
+
 	trigger = trigdata->tg_trigger;
 	nargs = trigger->tgnargs;
 	args = trigger->tgargs;
@@ -284,6 +287,8 @@ check_foreign_key(PG_FUNCTION_ARGS)
 		/* internal error */
 		elog(ERROR, "check_foreign_key: cannot process INSERT events");
 
+	MyProc->is_tainted = true;
+
 	/* Have to check tg_trigtuple - tuple being deleted */
 	trigtuple = trigdata->tg_trigtuple;
 
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c91e3e1..df0bcaf 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -719,6 +719,169 @@ include_dir 'conf.d'
   
  
 
+ 
+  max_sessions (integer)
+  
+   max_sessions configuration parameter
+  
+  
+  
+
+  The maximum number of client sessions that can be handled by
+  one connection proxy when session pooling is enabled.
+  This parameter does not add any memory or CPU overhead, so
+  specifying a large max_sessions value
+  does not affect performance.
+  If the max_sessions limit is reached new connections are not accepted.
+
+
+  The default value is 1000. This parameter can only be set at server start.
+   
+  
+ 
+
+ 
+  session_pool_size (integer)
+  
+   session_pool_size configuration parameter
+  
+  
+  
+
+  Enables session pooling and defines the maximum number of
+  backends that can be used by client sessions for each database/user combination.
+  Launched non-tainted backends are never terminated even if there are no active sessions.
+  A backend is considered tainted if the client updates GUCs, creates a temporary table, or creates prepared statements.
+  A tainted backend can serve only one client.
+
+
+  The default value is 10, so up to 10 backends will serve each database/user combination.
+
+  
+ 
+
+ 
+  proxy_port (integer)
+  
+   proxy_port configuration parameter
+  
+  
+  
+
+  Sets the TCP port for the connection pooler.
+  Clients connected to main "port" will be assigned dedicated backends,
+  while client connected to proxy port will be connected to backends through proxy which
+  performs transaction level scheduling. 
+   
+
+  The default value is 6543.
+
+  
+ 
+
+ 
+  connection_proxies (integer)
+  
+   connection_proxies configuration parameter
+  
+  
+  
+
+  Sets the number of connection proxies.
+  The postmaster spawns a separate worker process for each proxy and scatters connections between proxies using one of the scheduling policies (round-robin, random, load-balancing).
+  Each proxy launches its own subset of backends.
+  So the maximal number of non-tainted backends is session_pool_size * connection_proxies * databases * roles.
+   
+
+  The default value is 0, so session pooling is disabled.
+
+  
+ 
+
+ 
+  session_schedule (enum)
+  
+   session_schedule configuration parameter
+  
+  
+  
+
+  Specifies scheduling policy for assigning session to proxies in case of
+  connection pooling. Default policy is round-robin.
+
+
+  With the round-robin policy the postmaster cyclically scatters sessions between proxies.
+
+
+  With the random policy the postmaster randomly chooses a proxy for each new session.
+
+

Re: Hypothetical indexes using BRIN broken since pg10

2019-09-09 Thread Tom Lane
Alvaro Herrera from 2ndQuadrant  writes:
> On 2019-Sep-02, Tom Lane wrote:
>> The right answer IMO is basically for the brinGetStats call to go
>> away from brincostestimate and instead happen during plancat.c's
>> building of the IndexOptInfo.  In the case of a hypothetical index,
>> it'd fall to the get_relation_info_hook to fill in suitable fake
>> data.

> So I'm not clear on what the suggested strategy is, here.  Do we want
> that design change to occur in the bugfix that would be backpatched, or
> do we want the backbranches to use the patch as posted and then we apply
> the above design on master only?

The API change I'm proposing is surely not back-patchable.

Whether we should bother back-patching a less capable stopgap fix
is unclear to me.  Yeah, it's a bug that an index adviser can't
try a hypothetical BRIN index; but given that nobody noticed till
now, it doesn't seem like there's much field demand for it.
And I'm not sure that extension authors would want to deal with
testing minor-release versions to see if the fix is in, so
even if there were a back-patch, it might go unused.

regards, tom lane




Re: Hypothetical indexes using BRIN broken since pg10

2019-09-09 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Sep-02, Tom Lane wrote:

> Julien Rouhaud  writes:
> > On Fri, Jul 26, 2019 at 1:34 PM Heikki Linnakangas  wrote:
> >> The patch assumes the default pages_per_range setting, but looking at
> >> the code at https://github.com/HypoPG/hypopg, the extension actually
> >> takes pages_per_range into account when it estimates the index size. I
> >> guess there's no easy way to pass the pages_per_range setting down to
> >> brincostestimate(). I'm not sure what we should do about that, but seems
> >> that just using BRIN_DEFAULT_PAGES_PER_RANGE here is not very accurate.
> 
> > Yes, hypopg can use a custom pages_per_range as it intercepts it when
> > the hypothetical index is created.  I didn't find any way to get that
> > information in brincostestimate(), especially for something that could
> > backpatched.  I don't like it, but I don't see how to do better than
> > just using BRIN_DEFAULT_PAGES_PER_RANGE :(
> 
> I can tell you what I think ought to happen, but making it happen might
> be more work than this patch should take on.
> 
> The right answer IMO is basically for the brinGetStats call to go
> away from brincostestimate and instead happen during plancat.c's
> building of the IndexOptInfo.  In the case of a hypothetical index,
> it'd fall to the get_relation_info_hook to fill in suitable fake
> data.

So I'm not clear on what the suggested strategy is, here.  Do we want
that design change to occur in the bugfix that would be backpatched, or
do we want the backbranches to use the patch as posted and then we apply
the above design on master only?

If the former -- is Julien interested in trying to develop that?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Set of header files for Ryu floating-point stuff in src/common/

2019-09-09 Thread Andrew Gierth
> "Michael" == Michael Paquier  writes:

 Michael> Hi all,
 Michael> (Andrew G. in CC)

 Michael> We have the following set of header files in src/common/:
 Michael> digit_table.h
 Michael> d2s_full_table.h
 Michael> d2s_intrinsics.h
 Michael> ryu_common.h

 Michael> Shouldn't all these files be in src/include/common/ instead?

No.

a) They are implementation, not interface.

b) Most of them are just data tables.

c) The ones that define inline functions have some specializations (e.g.
limits on ranges or shift counts) that make it unwise to expose more
generally.

They are kept as separate files primarily because upstream had them that
way (and having the data tables out of the way makes the code more
readable). But it's explicitly not a good idea for them to be installed
anywhere or to have any additional code depending on them, since it is
conceivable that they might have to change without warning or disappear
in the event that we choose to track some upstream change (or replace
Ryu entirely).

-- 
Andrew (irc:RhodiumToad)




Re: Two pg_rewind patches (auto generate recovery conf and ensure clean shutdown)

2019-09-09 Thread Paul Guo
>
> Thank for rebasing.
>
> I didn't like 0001 very much.
>
> * It seems now would be the time to stop pretending we're using a file
> called recovery.conf; I know we still support older server versions that
> use that file, but it sounds like we should take the opportunity to
> rename the function to be less misleading once those versions vanish out
> of existence.
>

How about renaming the function names to
GenerateRecoveryConf -> GenerateRecoveryConfContents
WriteRecoveryConf -> WriteRecoveryConfInfo   <- it writes standby.signal
on pg12+, so function name WriteRecoveryConfContents is not accurate.
and
variable writerecoveryconf -> write_recovery_conf_info?


> * disconnect_and_exit seems a bit out of place compared to the other
> parts of this new module.  I think you only put it there so that the
> 'conn' can be a global, and that you can stop passing 'conn' as a
> variable to GenerateRecoveryConf.  It seems more modular to me to keep
> it as a separate variable in each program and have it passed down to the
> routine that writes the file.
>
> * From modularity also seems better to me to avoid a global variable
> 'recoveryconfcontents' and instead return the string from
> GenerateRecoveryConf to pass as a param to WriteRecoveryConf.
> (In fact, I wonder why the current structure is like it is, namely to
> have ReceiveAndUnpackTarFile write the file; why wouldn't be its caller
> be responsible for writing it?)
>

It is reasonable to make the common code include fewer variables. I can try modifying
the patches to remove the previously added variables below in the common
code.

+/* Contents of configuration file to be generated */
+extern PQExpBuffer recoveryconfcontents;
+
+extern bool writerecoveryconf;
+extern char *replication_slot;
+PGconn*conn;


>
> I wonder about putting this new file in src/fe_utils instead of keeping
> it in pg_basebackup and symlinking to pg_rewind.  Maybe if we make it a
> true module (recovery_config_gen.c) it makes more sense there.
>
I thought some about where to put the common code also. It seems pg_rewind
and pg_basebackup are the only consumers of the small common code.  I doubt
it deserves a separate file under src/fe_utils.


>
> 0003:
>
> I still don't understand why we need a command-line option to do this.
> Why should it not be the default behavior?
>

This was discussed, but frankly speaking I do not know how other Postgres
users or enterprise providers handle this (probably some have their own scripts?).
I could easily remove the option code if more people agree on that,
or at least we could turn it on by default?

Thanks


Re: Commit fest 2019-09

2019-09-09 Thread Alvaro Herrera from 2ndQuadrant
On this CF's second week, numbers are:

      statusstring      │ week1 │ week2
────────────────────────┼───────┼───────
 Needs review           │   165 │   138
 Waiting on Author      │    30 │    44
 Ready for Committer    │    11 │     5
 Returned with Feedback │     1 │     4
 Moved to next CF       │     2 │     4
 Committed              │    14 │    23
 Rejected               │     1 │     1
 Withdrawn              │     4 │     9
(8 rows)

One of the "withdrawn" ones was a duplicate entry (2087) for a bugfix patch
(1970).

We cleared a large fraction of the patches that were waiting on
committers.  It seems that we now need people (including committers) to
focus on patches that are "Needs Review".  I marked a bunch of those as
Waiting-on-Author; patch authors have a lot of pending work.

Here's the set of patches Ready-for-Committer.  Note the majority is
pgbench-related:

 patch_id │ name
──────────┼────────────────────────────────────────────────────────────────
     2217 │ pgbench - allow to create partitioned "account" tables
     2069 │ Expose queryid in pg_stat_activity in log_line_prefix
     2086 │ pgbench - extend initialization phase control
     1736 │ pgbench - add pseudo-random permutation function
     2091 │ pgbench - add \aset to store results of combined queries (\;)

I am inclined to think that 2217 is clearly something we want; 1736 is
something we probably want; and 2086 and 2091 are on the verge of being
unwanted things.

We have a number of apparently actionable bugfixes still:

 patch_id │ statusstring │ name
──────────┼──────────────┼──────────────────────────────────────────────────────────────
     2224 │ Needs review │ Race condition in logical walsender causes long postgresql shutdown delay
     2026 │ Needs review │ Spurious "apparent wraparound" via SimpleLruTruncate() rounding (data loss)
     2179 │ Needs review │ Fix support for hypothetical indexes using BRIN access method
     2187 │ Needs review │ Duplicated LSN in ReorderBuffer
     2221 │ Needs review │ Add mandatory access control for TRUNCATE
     2044 │ Needs review │ propagating replica identity to partitions
      528 │ Needs review │ Fix the optimization to skip WAL-logging on table created in same transaction
     2025 │ Needs review │ SimpleLruTruncate() mutual exclusion (data loss from lack thereof)
     1995 │ Needs review │ Replica with large recovery_min_apply_delay is not receiving changes from master after restart until this delay is expired
     2215 │ Needs review │ pg_upgrade fails with non-standard ACL
     2266 │ Needs review │ Fix up partitionwise join on how equi-join conditions between the partition keys are identified
     2050 │ Needs review │ Fix unique join costings
     2062 │ Needs review │ Unaccent extension python script Issue in Windows
     2205 │ Needs review │ CVE-2017-7484-induced bugs, or, btree cmp functions are not leakproof?
     2054 │ Needs review │ fix for BUG #3720: wrong results at using ltree

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Fix Proposal - Deadlock Issue in Single User Mode When IO Failure Occurs

2019-09-09 Thread Amit Kapila
On Sat, Jul 27, 2019 at 6:22 AM Chengchao Yu  wrote:
>
> Thus, I have updated the patch v3 according to your suggestions. Could you 
> help to review again?
> Please let me know should you have more suggestions or feedbacks.
>

I have tried to look into this patch and I don't think it fixes the
problem.  Basically, I have tried the commands suggested by you in
single-user mode, create table; insert and then checkpoint.  Now, what
I see is almost the same behavior as explained by you in one of the
above emails with a slight difference which makes me think that the
fix you are proposing is not correct.  Below is what you told:

"The second type is in Step #4. At the time when “checkpoint” SQL
command is being executed, PG has already set up the before_shmem_exit
callback ShutdownPostgres(), which releases all lw-locks given
transaction or sub-transaction is on-going. So after the first IO
error, the buffer page’s lw-lock gets released successfully. However,
later ShutdownXLOG() is invoked, and PG tries to flush buffer pages
again, which results in the second IO error. Different from the first
time, this time, all the previous executed before/on_shmem_exit
callbacks are not invoked again due to the decrease of the indexes. So
lw-locks for buffer pages are not released when PG tries to get the
same buffer lock in AbortBufferIO(), and then PG process gets stuck."

The only difference is in the last line where for me it gives
an assertion failure when trying to do ReleaseAuxProcessResources.  Below
is the callstack:

  postgres.exe!ExceptionalCondition(const char *
conditionName=0x00db0c78, const char * errorType=0x00db0c68, const
char * fileName=0x00db0c18, int lineNumber=1722)  Line 55 C
  postgres.exe!UnpinBuffer(BufferDesc * buf=0x052a104c, bool
fixOwner=true)  Line 1722 + 0x2f bytes C
  postgres.exe!ReleaseBuffer(int buffer=96)  Line 3367 + 0x17 bytes C
  postgres.exe!ResourceOwnerReleaseInternal(ResourceOwnerData *
owner=0x0141f6e8, 
phase=RESOURCE_RELEASE_BEFORE_LOCKS, bool isCommit=false, bool
isTopLevel=true)  Line 526 + 0x9 bytes C
  postgres.exe!ResourceOwnerRelease(ResourceOwnerData *
owner=0x0141f6e8, 
phase=RESOURCE_RELEASE_BEFORE_LOCKS, bool isCommit=false, bool
isTopLevel=true)  Line 484 + 0x17 bytes C
  postgres.exe!ReleaseAuxProcessResources(bool isCommit=false)  Line
861 + 0x15 bytes C
> postgres.exe!ReleaseAuxProcessResourcesCallback(int code=1, unsigned int 
> arg=0)  Line 881 + 0xa bytes C
  postgres.exe!shmem_exit(int code=1)  Line 272 + 0x1f bytes C
  postgres.exe!proc_exit_prepare(int code=1)  Line 194 + 0x9 bytes C
  postgres.exe!proc_exit(int code=1)  Line 107 + 0x9 bytes C
  postgres.exe!errfinish(int dummy=0, ...)  Line 538 + 0x7 bytes C
  postgres.exe!mdwrite(SMgrRelationData * reln=0x0147e140, ForkNumber
forknum=MAIN_FORKNUM, unsigned int blocknum=7, char *
buffer=0x0542dd00, bool skipFsync=false)  Line 713 + 0x4c bytes C
  postgres.exe!smgrwrite(SMgrRelationData * reln=0x0147e140,
ForkNumber forknum=MAIN_FORKNUM, unsigned int blocknum=7, char *
buffer=0x0542dd00, bool skipFsync=false)  Line 587 + 0x24 bytes C
  postgres.exe!FlushBuffer(BufferDesc * buf=0x052a104c,
SMgrRelationData * reln=0x0147e140)  Line 2759 + 0x1d bytes C
  postgres.exe!SyncOneBuffer(int buf_id=95, bool
skip_recently_used=false, WritebackContext * wb_context=0x012ccea0)
Line 2402 + 0xb bytes C
  postgres.exe!BufferSync(int flags=5)  Line 1992 + 0x15 bytes C
  postgres.exe!CheckPointBuffers(int flags=5)  Line 2586 + 0x9 bytes C
  postgres.exe!CheckPointGuts(unsigned __int64
checkPointRedo=22933176, int flags=5)  Line 8991 + 0x9 bytes C
  postgres.exe!CreateCheckPoint(int flags=5)  Line 8780 + 0x11 bytes C
  postgres.exe!ShutdownXLOG(int code=1, unsigned int arg=0)  Line 8333
+ 0x7 bytes C
  postgres.exe!shmem_exit(int code=1)  Line 272 + 0x1f bytes C
  postgres.exe!proc_exit_prepare(int code=1)  Line 194 + 0x9 bytes C
  postgres.exe!proc_exit(int code=1)  Line 107 + 0x9 bytes C
  postgres.exe!errfinish(int dummy=0, ...)  Line 538 + 0x7 bytes C
  postgres.exe!mdwrite(SMgrRelationData * reln=0x0147e140, ForkNumber
forknum=MAIN_FORKNUM, unsigned int blocknum=7, char *
buffer=0x0542dd00, bool skipFsync=false)  Line 713 + 0x4c bytes C
  postgres.exe!smgrwrite(SMgrRelationData * reln=0x0147e140,
ForkNumber forknum=MAIN_FORKNUM, unsigned int blocknum=7, char *
buffer=0x0542dd00, bool skipFsync=false)  Line 587 + 0x24 bytes C
  postgres.exe!FlushBuffer(BufferDesc * buf=0x052a104c,
SMgrRelationData * reln=0x0147e140)  Line 2759 + 0x1d bytes C
  postgres.exe!SyncOneBuffer(int buf_id=95, bool
skip_recently_used=false, WritebackContext * wb_context=0x012ce580)
Line 2402 + 0xb bytes C
  postgres.exe!BufferSync(int flags=44)  Line 1992 + 0x15 bytes C
  postgres.exe!CheckPointBuffers(int flags=44)  Line 2586 + 0x9 bytes C
  postgres.exe!CheckPointGuts(unsigned __int64
checkPointRedo=22933176, int flags=44)  Line 8991 + 0x9 bytes C
  postgres.exe!CreateCheckPoint(int flags=44)  Line 8780 + 0x11 bytes C
  

Re: [HACKERS] [PATCH] pageinspect function to decode infomasks

2019-09-09 Thread Alvaro Herrera from 2ndQuadrant
On 2019-Sep-08, Amit Kapila wrote:

> On Thu, Sep 5, 2019 at 2:17 PM Masahiko Sawada  wrote:
> >
> > Thanks. I hope the attached new patch fixes this issue.
> *
> +-- decode infomask flags as human readable flag names
> +CREATE FUNCTION heap_infomask_flags(
> +   infomask integer,
> +   infomask2 integer,
> +   decode_combined boolean DEFAULT false)
> +RETURNS text[]
> +AS 'MODULE_PATHNAME', 'heap_infomask_flags'
> 
> Isn't it better to name this function as tuple_infomask_flags or
> something like that?  The other functions that start their name with
> heap take page as input.  Also, I think the index-related functions
> that start with index name take blk/page as input.

I think that other table AMs are not necessarily going to use the same
infomask flags, so I think we should keep a name that is somehow
heapam-specific.  Maybe "heapam_infomask_flags" would be okay?

> + heap_infomask_flags(infomask integer, infomask2
> integer, decode_combined bool) returns text[]

> I think it is better to use one example for this function as we do for
> other functions in this doc.

Agreed.

> + 
> +  If decode_combined is set, combination flags like
> +  HEAP_XMIN_FROZEN are output instead of raw
> +  flags, HEAP_XMIN_COMMITTED and
> +  HEAP_XMIN_INVALID. Default value is
> +  false.
> + 
> 
> decode_combined should use parameter marker (like
> decode_combined).  Instead of saying
> "decode_combined" is set, you can use true/false terminology as that
> is what we use for this parameter.

Agreed.
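
As an aside, the decode_combined behaviour described a bit above amounts to
roughly this (sketch only; the helper name is invented, not from the patch):

#include "postgres.h"
#include "access/htup_details.h"

/*
 * Illustrative helper: report the xmin hint bits either as raw flags or,
 * when decode_combined is true, as the combined HEAP_XMIN_FROZEN flag
 * (which is defined as HEAP_XMIN_COMMITTED | HEAP_XMIN_INVALID).
 */
static const char *
xmin_flags_label(uint16 infomask, bool decode_combined)
{
	if (decode_combined &&
		(infomask & HEAP_XMIN_FROZEN) == HEAP_XMIN_FROZEN)
		return "HEAP_XMIN_FROZEN";
	if (infomask & HEAP_XMIN_COMMITTED)
		return "HEAP_XMIN_COMMITTED";
	if (infomask & HEAP_XMIN_INVALID)
		return "HEAP_XMIN_INVALID";
	return "";
}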

> Some more comments:
> *
> +SELECT t_infomask, t_infomask2, flags
> +FROM heap_page_items
> (get_raw_page('test1', 0)),
> + LATERAL heap_infomask_flags(t_infomask,
> t_infomask2, true) m(flags);
> 
> Do we really need a LATERAL clause in the above kind of queries?
> AFAIU, the functions can reference the columns that exist in the FROM
> list, but I might be missing some point.

I think the spec allows the LATERAL keyword to be implicit in the case
of functions, so this seems a matter of style.  Putting LATERAL
explicitly seems slightly clearer, to me.

> +Datum
> +heap_infomask_flags(PG_FUNCTION_ARGS)
> +{
> + uint16 t_infomask = PG_GETARG_INT16(0);
> + uint16 t_infomask2 = PG_GETARG_INT16(1);
> + bool decode_combined = PG_GETARG_BOOL(2);

> All the functions in this file are allowed only for superusers and
> there is an explanation for the same as mentioned in the file header
> comments.  Is there a reason for this function to be different?

The other functions can crash if fed arbitrary input.  I don't think
this one can crash, so it seems okay for it not to be superuser-only.

> In any case, if you think that this function needs to behave
> differently w.r.t superuser privileges, then it is better to add some
> comments in the function header to explain the same.

Yeah.

> *
> +Datum
> +heap_infomask_flags(PG_FUNCTION_ARGS)
> {
> ..
> +
> + d = (Datum *) palloc0(sizeof(Datum) * bitcnt);
> 
> It seems we don't free this memory before leaving this function.  I
> think it shouldn't be a big problem as this will be normally allocated
> in ExprContext and shouldn't last for long, but if there is no strong
> reason, I think it is better to free it.

Agreed.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
On Fri, Aug 30, 2019 at 6:52 PM Jeevan Ladhe 
wrote:

> Here are some comments:
> Or maybe we can just say:
> "cannot verify checksum in file \"%s\"" if checksum requested, disable the
> checksum and leave it to the following message:
>
> +   ereport(WARNING,
> +   (errmsg("file size (%d) not in multiple of page size
> (%d), sending whole file",
> +   (int) cnt, BLCKSZ)));
>
>
Opted for the above suggestion.


>
> I think we should give the user hint from where he should be reading the
> input
> lsn for incremental backup in the --help option as well as documentation?
> Something like - "To take an incremental backup, please provide value of
> "--lsn"
> as the "START WAL LOCATION" of previously taken full backup or incremental
> backup from backup_label file.
>

Added this in the documentation. In the help output, it would be too crowded.


> pg_combinebackup:
>
> +static bool made_new_outputdata = false;
> +static bool found_existing_outputdata = false;
>
> Both of these are global, I understand that we need them global so that
> they are
> accessible in cleanup_directories_atexit(). But they are passed to
> verify_dir_is_empty_or_create() as parameters, which I think is not needed.
> Instead verify_dir_is_empty_or_create() can directly change the globals.
>

After adding support for tablespaces, these two arguments take different
values depending upon the context.


> The current logic assumes the incremental backup directories are to be
> provided
> as input in the serial order the backups were taken. This is bit confusing
> unless clarified in pg_combinebackup help menu or documentation. I think we
> should clarify it at both the places.
>

Added in doc.


>
> I think scan_directory() should be rather renamed as do_combinebackup().
>

I am not sure about this renaming. scan_directory() is called recursively
to scan each sub-directory too. If we renamed it, it would suggest that we
recursively do a combinebackup, whereas combining a backup is a single whole
process.

-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: msys2 vs pg_upgrade/test.sh

2019-09-09 Thread Andrew Dunstan


On 9/9/19 4:48 AM, Peter Eisentraut wrote:
> On 2019-09-09 00:06, Andrew Dunstan wrote:
>> Diagnosing this took quite a lot of time and detective work. For some
>> reason I don't quite understand, when calling the Windows command
>> processor in a modern msys2/WindowsServer2019 installation, you need to
>> double the slash, thus:
>>
>>     cmd //c foo.bat
>>
>> Some Internet postings at least seem to suggest this is by design. (FSVO
>> "design")
>>
>> I tested this on older versions and the change appears to work, so I
>> propose to apply the attached patch.
> If we're worried about messing things up for non-msys2 environments, we
> could also set MSYS2_ARG_CONV_EXCL instead; see
> .
> According to that page, that would seem to be the more proper way to do it.
>


Nice find, thanks, I'll do it that way.


cheers


andrew



-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: refactoring - share str2*int64 functions

2019-09-09 Thread Andres Freund
Hi,

On 2019-09-09 20:57:46 +0900, Michael Paquier wrote:
> > I don't really buy that saving a few copies of a strings is worth that
> > much. But if you really want to do that, the right approach imo would be
> > to move the error reporting into a separate function. I.e. something
> > 
> > void pg_attribute_noreturn()
> > pg_strtoint_error(pg_strtoint_status status, const char *input, const char 
> > *type)
> > 
> > that you'd call in small wrappers. Something like
> 
> I am not completely sure if we should do that either anyway.  Another
> approach would be to try to make the callers of the routines generate
> their own error messages.  The errors we have now are really linked to
> the data types we have in core for signed integers (smallint, int,
> bigint).  In most cases do they really make sense (varlena.c, pqmq.c,
> etc.)?

I think there's plenty places that ought to use the checked functions,
even if they currently don't. And typically calling in the caller will
actually be slightly less efficient, than an out-of-line function like I
was proposing above, because it'll be in-line code for infrequent code.

But ISTM all of them ought to just use the C types, rather than the SQL
types however. Since in the above proposal the caller determines the
type names, if you want a different type - like the SQL input routines -
can just invoke pg_strtoint_error() themselves (or just have it open
coded).
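
For concreteness, a rough sketch of that shape (the status values and
function names follow the patch under discussion, but the exact signatures
and message wording here are my assumptions):

void
pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input,
				  const char *type)
{
	if (status == PG_STRTOINT_RANGE_ERROR)
		ereport(ERROR,
				(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
				 errmsg("value \"%s\" is out of range for type %s",
						input, type)));
	else
		ereport(ERROR,
				(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
				 errmsg("invalid input syntax for type %s: \"%s\"",
						type, input)));
}

/* thin caller-side wrapper, naming the C type rather than the SQL type */
static inline int32
pg_strtoint32_checked(const char *s)
{
	int32		result;
	pg_strtoint_status status = pg_strtoint32(s, &result);

	if (unlikely(status != PG_STRTOINT_OK))
		pg_strtoint_error(status, s, "int32");
	return result;
}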


> And for errors which should never happen we could just use
> elog().  For the input functions of int2/4/8 we still need the
> existing errors of course.

Right, there it makes sense to continue to refer the SQL level types.


> >> So, it seems to me that if we want to have a consolidation of
> >> everything, we basically need to have the generic error messages for
> >> the backend directly into common/string.c where the routines are
> >> refactored, and I think that we should do the following based on the
> >> patch attached:
> > 
> >> - Just remove pg_strtoint(), and move the error generation logic in
> >> common/string.c.
> > 
> > I'm not quite sure what you mean by moving the "error generation logic"?
> 
> I was referring to the error messages we have on HEAD in scanint8()
> and pg_strtoint16() for bad inputs and overflows.

Not the right direction imo.



> >>  static void pcb_error_callback(void *arg);
> >> @@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int 
> >> location)
> >>
> >>case T_Float:
> >>/* could be an oversize integer as well as a float ... 
> >> */
> >> -  if (scanint8(strVal(value), true, ))
> >> +  if (pg_strtoint64(strVal(value), ) == 
> >> PG_STRTOINT_OK)
> >>{
> >>/*
> >> * It might actually fit in int32. Probably 
> >> only INT_MIN can
> > 
> > Not for this change, but we really ought to move away from this crappy
> > logic. It's really bonkers to have T_Float represent large integers and
> > floats.
> 
> I am not sure but what are you suggesting here?

I'm saying that we shouldn't need the whole logic of trying to parse the
string as an int, and then fail to float if it's not that. But that it's
not this patchset's task to fix this.


> >> +  /* skip leading spaces */
> >> +  while (likely(*ptr) && isspace((unsigned char) *ptr))
> >> +  ptr++;
> >> +
> >> +  /* handle sign */
> >> +  if (*ptr == '-')
> >> +  {
> >> +  ptr++;
> >> +  neg = true;
> >> +  }
> >> +  else if (*ptr == '+')
> >> +  ptr++;
> > 
> >> +  /* require at least one digit */
> >> +  if (unlikely(!isdigit((unsigned char) *ptr)))
> >> +  return PG_STRTOINT_SYNTAX_ERROR;
> > 
> > Wonder if there's an argument for moving this behaviour to someplace
> > else - in most cases we really don't expect whitespace, and checking for
> > it is unnecessary overhead.
> 
> Not sure about that.  I would keep the scope of the patch simple as of
> now, where we make sure that we have the right interface for
> everything.  There are a couple of extra improvements which could be
> done afterwards, and if we move everything in the same place that
> should be easier to move on with more improvements.  Hopefully.

The only reason for thinking about it now is that we'd then avoid
changing the API twice.

Greetings,

Andres Freund




Re: Compiler warnings with MinGW

2019-09-09 Thread Magnus Hagander
On Sat, Sep 7, 2019 at 4:58 AM Michael Paquier  wrote:

> On Sat, Sep 07, 2019 at 12:11:25AM +0200, Peter Eisentraut wrote:
> > I'm not sure exactly what the upstream of mingw is these days, but I
> > think the original issue that led to 811be893 has long been fixed [0],
> > and the other stuff in mingwcompat.c is also no longer relevant [1].  I
> > think mingwcompat.c could be removed altogether.  I'm not sure to what
> > extent we need to support 5+ year old mingw versions.
>
> On HEAD I would not be against removing that as this leads to a
> cleanup of our code.  For MSVC, we only support VS 2013~ on HEAD, so
> saying that we don't support MinGW older than what was proposed 5
> years ago sounds sensible.
>

+1, definitely.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Set of header files for Ryu floating-point stuff in src/common/

2019-09-09 Thread Michael Paquier
Hi all,
(Andrew G. in CC)

We have the following set of header files in src/common/:
digit_table.h
d2s_full_table.h
d2s_intrinsics.h
ryu_common.h

Shouldn't all these files be in src/include/common/ instead?  HEAD is
not really consistent with the common practice here.

Thanks,
--
Michael


signature.asc
Description: PGP signature


Re: refactoring - share str2*int64 functions

2019-09-09 Thread Michael Paquier
On Mon, Sep 09, 2019 at 03:17:38AM -0700, Andres Freund wrote:
> On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
> I *VEHEMENTLY* oppose the introduction of any future pseudo-generic
> routines taking the type width as a parameter. They're weakening the
> type system and they're unnecessarily inefficient.

I saw that, using the previous wrapper increases the time of one call
from roughly 0.9ms to 1.1ns.

> I don't really buy that saving a few copies of a strings is worth that
> much. But if you really want to do that, the right approach imo would be
> to move the error reporting into a separate function. I.e. something
> 
> void pg_attribute_noreturn()
> pg_strtoint_error(pg_strtoint_status status, const char *input, const char 
> *type)
> 
> that you'd call in small wrappers. Something like

I am not completely sure if we should do that either anyway.  Another
approach would be to try to make the callers of the routines generate
their own error messages.  The errors we have now are really linked to
the data types we have in core for signed integers (smallint, int,
bigint).  In most cases do they really make sense (varlena.c, pqmq.c,
etc.)?  And for errors which should never happen we could just use
elog().  For the input functions of int2/4/8 we still need the
existing errors of course.

>> So, it seems to me that if we want to have a consolidation of
>> everything, we basically need to have the generic error messages for
>> the backend directly into common/string.c where the routines are
>> refactored, and I think that we should do the following based on the
>> patch attached:
> 
>> - Just remove pg_strtoint(), and move the error generation logic in
>> common/string.c.
> 
> I'm not quite sure what you mean by moving the "error generation logic"?

I was referring to the error messages we have on HEAD in scanint8()
and pg_strtoint16() for bad inputs and overflows.

> Seems like these actually could just ought to use the error-checked
> variants. And I think it ought to change all of
> READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one

Right.

>>  static void pcb_error_callback(void *arg);
>> @@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int 
>> location)
>>
>>  case T_Float:
>>  /* could be an oversize integer as well as a float ... 
>> */
>> -if (scanint8(strVal(value), true, ))
>> +if (pg_strtoint64(strVal(value), ) == 
>> PG_STRTOINT_OK)
>>  {
>>  /*
>>   * It might actually fit in int32. Probably 
>> only INT_MIN can
> 
> Not for this change, but we really ought to move away from this crappy
> logic. It's really bonkers to have T_Float represent large integers and
> floats.

I am not sure but what are you suggesting here?

>> +/* skip leading spaces */
>> +while (likely(*ptr) && isspace((unsigned char) *ptr))
>> +ptr++;
>> +
>> +/* handle sign */
>> +if (*ptr == '-')
>> +{
>> +ptr++;
>> +neg = true;
>> +}
>> +else if (*ptr == '+')
>> +ptr++;
> 
>> +/* require at least one digit */
>> +if (unlikely(!isdigit((unsigned char) *ptr)))
>> +return PG_STRTOINT_SYNTAX_ERROR;
> 
> Wonder if there's an argument for moving this behaviour to someplace
> else - in most cases we really don't expect whitespace, and checking for
> it is unnecessary overhead.

Not sure about that.  I would keep the scope of the patch simple as of
now, where we make sure that we have the right interface for
everything.  There are a couple of extra improvements which could be
done afterwards, and if we move everything in the same place that
should be easier to move on with more improvements.  Hopefully.
--
Michael


signature.asc
Description: PGP signature


Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
On Tue, Sep 3, 2019 at 12:11 PM Dilip Kumar  wrote:

> On Fri, Aug 16, 2019 at 3:54 PM Jeevan Chalke
>  wrote:
> >
> 0003:
> +/*
> + * When to send the whole file, % blocks modified (90%)
> + */
> +#define WHOLE_FILE_THRESHOLD 0.9
>
> How this threshold is selected.  Is it by some test?
>

Currently, it is set arbitrarily. If required, we will make it a GUC.


>
> - magic number, currently 0 (4 bytes)
> I think in the patch we are using  (#define INCREMENTAL_BACKUP_MAGIC
> 0x494E4352) as a magic number, not 0
>

Yes. Robert too reported this. Updated the commit message.


>
> Can we breakdown this function in 2-3 functions.  At least creating a
> file map can directly go to a separate function.
>

Separated out filemap changes to separate function. Rest kept as is to have
an easy followup.


>
> I have read 0003 and 0004 patch and there are few cosmetic comments.
>

Can you please post those too?

Other comments are fixed.


>
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>

Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
On Wed, Sep 4, 2019 at 5:21 PM Dilip Kumar  wrote:

>
>  I have not yet completed the review for 0004, but I have few more
> comments.  Tomorrow I will try to complete the review and some testing
> as well.
>
> 1. It seems that the output full backup generated with
> pg_combinebackup also contains the "INCREMENTAL BACKUP REFERENCE WAL
> LOCATION".  It seems confusing
> because now this is a full backup, not the incremental backup.
>

Yes, that was remaining and was in my TODO.
Done in the new patchset. Also, taking --label as an input like
pg_basebackup.


>
> 2.
> + memset(outblocks, 0, sizeof(FileOffset) * RELSEG_SIZE);
>
> I don't think you need to memset this explicitly as you can initialize
> the array itself no?
> FileOffset outblocks[RELSEG_SIZE] = {{0}}
>

I didn't see any issue with memset either but changed this per your
suggestion.


>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
On Tue, Aug 27, 2019 at 11:59 PM Robert Haas  wrote:

> On Fri, Aug 16, 2019 at 6:23 AM Jeevan Chalke
>  wrote:
> > [ patches ]
>
> Reviewing 0002 and 0003:
>
> - Commit message for 0003 claims magic number and checksum are 0, but
> that (fortunately) doesn't seem to be the case.
>

Oops, updated commit message.


>
> - looks_like_rel_name actually checks whether it looks like a
> *non-temporary* relation name; suggest adjusting the function name.
>
> - The names do_full_backup and do_incremental_backup are quite
> confusing because you're really talking about what to do with one
> file.  I suggest sendCompleteFile() and sendPartialFile().
>

Changed function names.


>
> - Is there any good reason to have 'refptr' as a global variable, or
> could we just pass the LSN around via function arguments?  I know it's
> just mimicking startptr, but storing startptr in a global variable
> doesn't seem like a great idea either, so if it's not too annoying,
> let's pass it down via function arguments instead.  Also, refptr is a
> crappy name (even worse than startptr); whether we end up with a
> global variable or a bunch of local variables, let's make the name(s)
> clear and unambiguous, like incremental_reference_lsn.  Yeah, I know
> that's long, but I still think it's better than being unclear.
>

Renamed variable.
However, I have kept it as a global, since otherwise many functions would need
their signatures changed, e.g. sendFile(), sendDir(), sendTablespace(), etc.


> - do_incremental_backup looks like it can never report an error from
> fread(), which is bad.  But I see that this is just copied from the
> existing code which has the same problem, so I started a separate
> thread about that.
>
> - I think that passing cnt and blkindex to verify_page_checksum()
> doesn't look very good from an abstraction point of view.  Granted,
> the existing code isn't great either, but I think this makes the
> problem worse.  I suggest passing "int backup_distance" to this
> function, computed as cnt - BLCKSZ * blkindex.  Then, you can
> fseek(-backup_distance), fread(BLCKSZ), and then fseek(backup_distance
> - BLCKSZ).
>

Yep. Done these changes in the refactoring patch.
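
For reference, a minimal sketch of the suggested seek-back pattern (function
and variable names here are illustrative, not taken from the patch):

#include "postgres.h"

#include <stdio.h>

/*
 * Step back "backup_distance" bytes from the current position, re-read one
 * BLCKSZ page for checksum verification, then return to where we were so
 * streaming of the file can continue.
 */
static bool
reread_page_for_checksum(FILE *fp, char *page, int backup_distance)
{
	if (fseek(fp, (long) -backup_distance, SEEK_CUR) != 0)
		return false;

	if (fread(page, 1, BLCKSZ, fp) != BLCKSZ)
		return false;

	if (fseek(fp, (long) (backup_distance - BLCKSZ), SEEK_CUR) != 0)
		return false;

	return true;
}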


>
> - While I generally support the use of while and for loops rather than
> goto for flow control, a while (1) loop that ends with a break is
> functionally a goto anyway.  I think there are several ways this could
> be revised.  The most obvious one is probably to use goto, but I vote
> for inverting the sense of the test: if (PageIsNew(page) ||
> PageGetLSN(page) >= startptr) break; This approach also saves a level
> of indentation for more than half of the function.
>

I have used this new inverted condition, but we still need a while(1) loop.


> - I am not sure that it's a good idea for sendwholefile = true to
> result in dumping the entire file onto the wire in a single CopyData
> message.  I don't know of a concrete problem in typical
> configurations, but someone who increases RELSEG_SIZE might be able to
> overflow CopyData's length word.  At 2GB the length word would be
> negative, which might break, and at 4GB it would wrap around, which
> would certainly break.  See CopyData in
> https://www.postgresql.org/docs/12/protocol-message-formats.html  To
> avoid this issue, and maybe some others, I suggest defining a
> reasonably large chunk size, say 1MB as a constant in this file
> someplace, and sending the data as a series of chunks of that size.
>

OK. Done as per the suggestions.
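
For reference, the chunking idea amounts to roughly this (a sketch assuming
pq_putmessage() is used to emit CopyData messages as elsewhere in
basebackup.c; the function name and chunk size are illustrative):

#include "postgres.h"
#include "libpq/libpq.h"

/* keep every CopyData message well below the int32 length-word limit */
#define SEND_CHUNK_SIZE ((size_t) (1024 * 1024))

static void
send_buffer_in_chunks(const char *buf, size_t len)
{
	size_t		offset = 0;

	while (offset < len)
	{
		size_t		thislen = Min(SEND_CHUNK_SIZE, len - offset);

		/* 'd' is the CopyData message type on the wire */
		if (pq_putmessage('d', buf + offset, thislen) != 0)
			ereport(ERROR,
					(errmsg("base backup could not send data, aborting backup")));

		offset += thislen;
	}
}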


>
> - I don't think that the way concurrent truncation is handled is
> correct for partial files.  Right now it just falls through to code
> which appends blocks of zeroes in either the complete-file or
> partial-file case.  I think that logic should be moved into the
> function that handles the complete-file case.  In the partial-file
> case, the blocks that we actually send need to match the list of block
> numbers we promised to send.  We can't just send the promised blocks
> and then tack a bunch of zero-filled blocks onto the end that the file
> header doesn't know about.
>

Well, in the partial-file case we won't end up inside that block, so we
never send zeroes at the end for a partial file.


> - For reviewer convenience, please use the -v option to git
> format-patch when posting and reposting a patch series.  Using -v2,
> -v3, etc. on successive versions really helps.
>

Sure. Thanks for letting me know about this option.


>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
On Tue, Aug 27, 2019 at 4:46 PM vignesh C  wrote:

> Few comments:
> Comment:
> + buf = (char *) malloc(statbuf->st_size);
> + if (buf == NULL)
> + ereport(ERROR,
> + (errcode(ERRCODE_OUT_OF_MEMORY),
> + errmsg("out of memory")));
> +
> + if ((cnt = fread(buf, 1, statbuf->st_size, fp)) > 0)
> + {
> + Bitmapset  *mod_blocks = NULL;
> + int nmodblocks = 0;
> +
> + if (cnt % BLCKSZ != 0)
> + {
>
> We can use same size as full page size.
> After pg start backup full page write will be enabled.
> We can use the same file size to maintain data consistency.
>

Can you please explain which size?
The aim here is to read the entire file into memory, and thus
statbuf->st_size is used.

Comment:
> Should we check if it is same timeline as the system's timeline.
>

At the time of taking the incremental backup, we can't check that.
However, while combining, I made sure that the timeline is the same for all
backups.


>
> Comment:
>
> Should we support compression formats supported by pg_basebackup.
> This can be an enhancement after the functionality is completed.
>

For the incremental backup, it just works out of the box.
For combining backup, as discussed up-thread, the user has to
uncompress first, combine them, compress if required.


> Comment:
> We should provide some mechanism to validate the backup. To identify
> if some backup is corrupt or some file is missing(deleted) in a
> backup.
>

Maybe, but not for the first version.


> Comment:
> +/*
> + * When to send the whole file, % blocks modified (90%)
> + */
> +#define WHOLE_FILE_THRESHOLD 0.9
> +
> This can be user configured value.
> This can be an enhancement after the functionality is completed.
>

Yes.


> Comment:
> We can add a readme file with all the details regarding incremental
> backup and combine backup.
>

Will have a look.


>
> Regards,
> Vignesh
> EnterpriseDB: http://www.enterprisedb.com
>

Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: block-level incremental backup

2019-09-09 Thread Jeevan Chalke
Hi,

Attached new set of patches adding support for the tablespace handling.

This patchset also fixes the issues reported by Vignesh, Robert, Jeevan
Ladhe,
and Dilip Kumar.

Please have a look and let me know if I missed addressing any comments.

Thanks
-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company
From 7cc2fb70188676406ca883055467aff602438fcd Mon Sep 17 00:00:00 2001
From: Jeevan Chalke 
Date: Mon, 9 Sep 2019 10:38:01 +0530
Subject: [PATCH v2 1/4] Add support for command line option to pass LSN.

This adds [ LSN 'lsn' ] to BASE_BACKUP command and --lsn=LSN to
the pg_basebackup binary.

Also, add small tests.
---
 src/backend/replication/basebackup.c | 20 
 src/backend/replication/repl_gram.y  |  6 ++
 src/backend/replication/repl_scanner.l   |  1 +
 src/bin/pg_basebackup/pg_basebackup.c| 15 +--
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 12 +++-
 5 files changed, 51 insertions(+), 3 deletions(-)

diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 6aab8d7..e72bf8e 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -38,6 +38,7 @@
 #include "storage/ipc.h"
 #include "storage/reinit.h"
 #include "utils/builtins.h"
+#include "utils/pg_lsn.h"
 #include "utils/ps_status.h"
 #include "utils/relcache.h"
 #include "utils/timestamp.h"
@@ -52,6 +53,7 @@ typedef struct
 	bool		includewal;
 	uint32		maxrate;
 	bool		sendtblspcmapfile;
+	XLogRecPtr 	lsn;
 } basebackup_options;
 
 
@@ -652,6 +654,7 @@ parse_basebackup_options(List *options, basebackup_options *opt)
 	bool		o_maxrate = false;
 	bool		o_tablespace_map = false;
 	bool		o_noverify_checksums = false;
+	bool		o_lsn = false;
 
 	MemSet(opt, 0, sizeof(*opt));
 	foreach(lopt, options)
@@ -740,6 +743,23 @@ parse_basebackup_options(List *options, basebackup_options *opt)
 			noverify_checksums = true;
 			o_noverify_checksums = true;
 		}
+		else if (strcmp(defel->defname, "lsn") == 0)
+		{
+			bool		have_error = false;
+
+			if (o_lsn)
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+		 errmsg("duplicate option \"%s\"", defel->defname)));
+			o_lsn = true;
+
+			/* Validate given LSN and convert it into XLogRecPtr. */
+			opt->lsn = pg_lsn_in_internal(strVal(defel->arg), _error);
+			if (XLogRecPtrIsInvalid(opt->lsn))
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
+		 errmsg("invalid value for LSN")));
+		}
 		else
 			elog(ERROR, "option \"%s\" not recognized",
  defel->defname);
diff --git a/src/backend/replication/repl_gram.y b/src/backend/replication/repl_gram.y
index c4e11cc..c24d319 100644
--- a/src/backend/replication/repl_gram.y
+++ b/src/backend/replication/repl_gram.y
@@ -87,6 +87,7 @@ static SQLCmd *make_sqlcmd(void);
 %token K_EXPORT_SNAPSHOT
 %token K_NOEXPORT_SNAPSHOT
 %token K_USE_SNAPSHOT
+%token K_LSN
 
 %type 	command
 %type 	base_backup start_replication start_logical_replication
@@ -214,6 +215,11 @@ base_backup_opt:
   $$ = makeDefElem("noverify_checksums",
    (Node *)makeInteger(true), -1);
 }
+			| K_LSN SCONST
+{
+  $$ = makeDefElem("lsn",
+   (Node *)makeString($2), -1);
+}
 			;
 
 create_replication_slot:
diff --git a/src/backend/replication/repl_scanner.l b/src/backend/replication/repl_scanner.l
index 380faeb..77b5af4 100644
--- a/src/backend/replication/repl_scanner.l
+++ b/src/backend/replication/repl_scanner.l
@@ -107,6 +107,7 @@ EXPORT_SNAPSHOT		{ return K_EXPORT_SNAPSHOT; }
 NOEXPORT_SNAPSHOT	{ return K_NOEXPORT_SNAPSHOT; }
 USE_SNAPSHOT		{ return K_USE_SNAPSHOT; }
 WAIT{ return K_WAIT; }
+LSN	{ return K_LSN; }
 
 ","{ return ','; }
 ";"{ return ';'; }
diff --git a/src/bin/pg_basebackup/pg_basebackup.c b/src/bin/pg_basebackup/pg_basebackup.c
index 7986872..1791853 100644
--- a/src/bin/pg_basebackup/pg_basebackup.c
+++ b/src/bin/pg_basebackup/pg_basebackup.c
@@ -105,6 +105,7 @@ static bool temp_replication_slot = true;
 static bool create_slot = false;
 static bool no_slot = false;
 static bool verify_checksums = true;
+static char *lsn = NULL;
 
 static bool success = false;
 static bool made_new_pgdata = false;
@@ -341,6 +342,7 @@ usage(void)
 			 " include required WAL files with specified method\n"));
 	printf(_("  -z, --gzip compress tar output\n"));
 	printf(_("  -Z, --compress=0-9 compress tar output with given compression level\n"));
+	printf(_("  --lsn=LSN  incremental backup, using LSN as threshold\n"));
 	printf(_("\nGeneral options:\n"));
 	printf(_("  -c, --checkpoint=fast|spread\n"
 			 " set fast or spread checkpointing\n"));
@@ -1805,6 +1807,7 @@ BaseBackup(void)
 maxServerMajor;
 	int			serverVersion,
 serverMajor;
+	char	   *lsn_clause = NULL;
 
 	Assert(conn != NULL);
 
@@ -1871,8 +1874,11 @@ 

Re: Fetching timeline during recovery

2019-09-09 Thread Fujii Masao
On Sat, Sep 7, 2019 at 12:06 AM Jehan-Guillaume de Rorthais
 wrote:
>
> On Wed, 4 Sep 2019 00:32:03 +0900
> Fujii Masao  wrote:
>
> > On Mon, Jul 29, 2019 at 7:26 PM Jehan-Guillaume de Rorthais
> >  wrote:
> > >
> > > On Fri, 26 Jul 2019 18:22:25 +0200
> > > Jehan-Guillaume de Rorthais  wrote:
> > >
> > > > On Fri, 26 Jul 2019 10:02:58 +0200
> > > > Jehan-Guillaume de Rorthais  wrote:
> [...]
> > > > Please, find in attachment a new version of the patch. It now creates 
> > > > two
> > > > new fonctions:
> > > >
> > > >   pg_current_wal_tl()
> > > >   pg_last_wal_received_tl()
> > >
> > > I just found I forgot to use PG_RETURN_INT32 in pg_last_wal_received_tl().
> > > Please find the corrected patch in attachment:
> > > 0001-v3-Add-functions-to-get-timeline.patch
> >
> > Thanks for the patch! Here are some comments from me.
>
> Thank you for your review!
>
> Please, find in attachment the v4 of the patch:
> 0001-v4-Add-functions-to-get-timeline.patch

Thanks for updating the patch!

Should we add regression tests for these functions? For example,
what about using these functions to check the timeline switch case,
in src/test/recovery/t/004_timeline_switch.pl?

>
> Answers bellow.
>
> > You need to write the documentation explaining the functions
> > that you're thinking to add.
>
> Done.

Thanks!

+   Get current write-ahead log timeline

I'm not sure if "current write-ahead log timeline" is proper word.
"timeline ID of current write-ahead log" is more appropriate?

+   int
+   Get last write-ahead log timeline received and sync to disk by
+streaming replication.

Same as above. I think that "timeline ID of last write-ahead log received
and sync to disk ..." is better here.

Like pg_last_wal_receive_lsn(), something like "If recovery has
completed this will remain static at the value of the last WAL
record received and synced to disk during recovery.
If streaming replication is disabled, or if it has not yet started,
the function returns NULL." should be in this description?

>
> > +/*
> > + * Returns the current timeline on a production cluster
> > + */
> > +Datum
> > +pg_current_wal_tl(PG_FUNCTION_ARGS)

I think that "tl" in the function name should be "tli". "tli" is used
used for other functions and views related to timeline, e.g.,
pg_stat_wal_receiver.received_tli. Thought?

> >
> > The timeline ID that this function returns seems almost
> > the same as pg_control_checkpoint().timeline_id,
> > when the server is in production. So I'm not sure
> > if it's worth adding that new function.
>
> pg_control_checkpoint().timeline_id is read from the controldata file on disk
> which is asynchronously updated with the real status of the local cluster.
> Right after a promotion, fetching the TL from pg_control_checkpoint() is wrong
> and can cause race conditions on client side.

Understood.

> > The timeline ID that this function returns is the same as
> > pg_stat_wal_receiver.received_tli while walreceiver is running.
> > But when walreceiver is not running, pg_stat_wal_receiver returns
> > no record, and pg_last_wal_received_tl() would be useful to
> > get the timeline only in this case. Is this my understanding right?
>
> Exactly.

I'm just imagining that some users want to use pg_last_wal_receive_lsn() and
pg_last_wal_receive_tli() together to, e.g., get the name of the WAL file
received last. But there can be a corner case where the return values of
pg_last_wal_receive_lsn() and of pg_last_wal_receive_tli() are inconsistent.
This can happen because those values are NOT obtained within a single lock.
That is, each function takes its own lock to get its value.

So, to avoid that corner case and get a consistent WAL file name,
we might want to have a function that gets both the LSN and the
timeline ID of the last received WAL record within a single lock
(i.e., just uses GetWalRcvWriteRecPtr()) and returns them.
Thought?
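
Something along these lines, perhaps (a sketch only; the function name and
the exact GetWalRcvWriteRecPtr() signature used here are assumptions on my
side):

#include "postgres.h"
#include "access/htup_details.h"
#include "access/xlog.h"
#include "fmgr.h"
#include "funcapi.h"
#include "replication/walreceiver.h"
#include "utils/pg_lsn.h"

/*
 * Return the last-received LSN and its timeline ID together, so that both
 * values come from a single GetWalRcvWriteRecPtr() call (single lock) and
 * cannot be mutually inconsistent.
 */
Datum
pg_last_wal_receive_lsn_tli(PG_FUNCTION_ARGS)
{
	TupleDesc	tupdesc;
	Datum		values[2];
	bool		nulls[2] = {false, false};
	TimeLineID	tli = 0;
	XLogRecPtr	lsn;

	lsn = GetWalRcvWriteRecPtr(NULL, &tli);

	if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
		elog(ERROR, "return type must be a row type");
	tupdesc = BlessTupleDesc(tupdesc);

	if (XLogRecPtrIsInvalid(lsn))
		nulls[0] = nulls[1] = true;
	else
	{
		values[0] = LSNGetDatum(lsn);
		values[1] = Int32GetDatum((int32) tli);
	}

	PG_RETURN_DATUM(HeapTupleGetDatum(heap_form_tuple(tupdesc, values, nulls)));
}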

Regards,

-- 
Fujii Masao




Re: Minimal logical decoding on standbys

2019-09-09 Thread Amit Khandekar
On Tue, 3 Sep 2019 at 23:10, Alvaro Herrera  wrote:
>
> On 2019-Jul-19, Amit Khandekar wrote:
>
> > Attached are the split patches. Included is an additional patch that
> > has doc changes. Here is what I have added in the docs. Pasting it
> > here so that all can easily spot how it is supposed to behave, and to
> > confirm that we are all on the same page :
>
> ... Apparently, this patch was not added to the commitfest for some
> reason; and another patch that *is* in the commitfest has been said to
> depend on this one (Petr's https://commitfest.postgresql.org/24/1961/
> which hasn't been updated in quite a while and has received no feedback
> at all, not even from the listed reviewer Shaun Thomas).  To make
> matters worse, Amit's patchset no longer applies.
>
> What I would like to do is add a link to this thread to CF's 1961 entry
> and then update all these patches, in order to get things moving.

Hi Alvaro,

Thanks for notifying about this. Will work this week on rebasing this
patchset and putting it into the 2019-11 commit fest.




Re: Batch insert in CTAS/MatView code

2019-09-09 Thread Paul Guo
On Fri, Aug 2, 2019 at 2:55 AM Heikki Linnakangas  wrote:

> On 17/06/2019 15:53, Paul Guo wrote:
> > I noticed that to do batch insert, we might need additional memory copy
> > sometimes comparing with "single insert"
> > (that should be the reason that we previously saw a bit regressions) so a
> > good solution seems to fall back
> > to "single insert" if the tuple length is larger than a threshold. I set
> > this as 2000 after quick testing.
>
> Where does the additional memory copy come from? Can we avoid doing it
> in the multi-insert case?


Hi Heikki,

Sorry for the late reply. I took some time to look at and debug the
TupleTableSlotOps code of the various TupleTableSlot types carefully,
especially the BufferHeapTupleTableSlot case, on which we seemed to see a
regression if no threshold is set, and also debugged and tested the CTAS
case some more. I found that my previous phrase "additional memory copy"
(mainly tuple content copy compared with single insert) is wrong based on
the latest code (and probably was wrong with the previous code as well).
So in theory we should not need to worry about additional tuple copy
overhead now, and I therefore tried the patch without setting a
multi-insert threshold, as attached.

To make the test results more stable, this time I ran a simple 'select
count(*) from tbl' before each CTAS to warm up the shared buffers, ran a
checkpoint before each CTAS, disabled autovacuum by setting
'autovacuum = off', and set larger shared buffers (but < 25% of total
memory, as recommended by the PG docs) so that CTAS reads hit shared
buffers whenever warm buffers exist (double-checked via
explain(analyze, buffers)). These seem reasonable settings for performance
testing. Each kind of CTAS test is run three times (note that before each
run we do the warm-up and checkpoint as mentioned).

I mainly tested the t12 (normal table with tuple size ~ 2k) case since for
others our patch either
performs better or similarly.

Patch:                 1st_run    2nd_run    3rd_run
t12_BufferHeapTuple    7883.400   7549.966   8090.080
t12_Virtual            8041.637   8191.317   8182.404

Baseline:              1st_run    2nd_run    3rd_run
t12_BufferHeapTuple    8264.290   7508.410   7681.702
t12_Virtual            8167.792   7970.537   8106.874

I actually roughly tested other tables we mentioned also (t11 and t14) -
the test results and conclusions are same.
t12_BufferHeapTuple means: create table tt as select * from t12;
t12_Virtual means: create table tt as select *partial columns* from t12;

So it looks like for t12 the results between our code and the baseline are
similar, so not setting a threshold seems to be fine, though the
t12_BufferHeapTuple test results vary quite a bit (by up to 0.5 seconds)
for both our patch and the baseline, whereas the Virtual case is quite
stable.

This actually confused me a bit, given we've cached the source table in
shared buffers. I suspected checkpoint effects, so I disabled checkpointing
by setting max_wal_size = 3000 during CTAS; the BufferHeapTuple case (see
below) still varies somewhat. I'm not sure what the reason is, but this
does not seem to be a blocker for the patch.
Patch:                 1st_run    2nd_run    3rd_run
t12_BufferHeapTuple    7717.304   7413.259   7452.773
t12_Virtual            7445.742   7483.148   7593.583

Baseline:              1st_run    2nd_run    3rd_run
t12_BufferHeapTuple    8186.302   7736.541   7759.056
t12_Virtual            8004.880   8096.712   7961.483


v4-0001-Multi-insert-in-Create-Table-As.patch
Description: Binary data


Re: Attempt to consolidate reading of XLOG page

2019-09-09 Thread Antonin Houska
Alvaro Herrera  wrote:

> Hi Antonin, could you please rebase again?

Attached.

-- 
Antonin Houska
Web: https://www.cybertec-postgresql.com

>From 7b15d7bbdae13c743ddeae10b8ff79e9b02d8243 Mon Sep 17 00:00:00 2001
From: Antonin Houska 
Date: Mon, 9 Sep 2019 11:53:54 +0200
Subject: [PATCH 1/4] Remove TLI from some argument lists.

The timeline information is available to caller via XLogReaderState. Now that
XLogRead() is gonna be (sometimes) responsible for determining the TLI, it
would have to be added the (TimeLineID *) argument too, just to be consistent
with the current coding style. Since XLogRead() updates also other
position-specific fields of XLogReaderState, it seems simpler if we remove the
output argument from XLogPageReadCB and always report the TLI via
XLogReaderState.
---
 src/backend/access/transam/xlog.c  |  7 +++
 src/backend/access/transam/xlogreader.c|  6 +++---
 src/backend/access/transam/xlogutils.c | 11 ++-
 src/backend/replication/logical/logicalfuncs.c |  4 ++--
 src/backend/replication/walsender.c|  2 +-
 src/bin/pg_rewind/parsexlog.c  |  9 -
 src/bin/pg_waldump/pg_waldump.c|  2 +-
 src/include/access/xlogreader.h|  8 +++-
 src/include/access/xlogutils.h |  5 ++---
 src/include/replication/logicalfuncs.h |  2 +-
 10 files changed, 26 insertions(+), 30 deletions(-)

diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 6876537b62..cd948dbefc 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -885,8 +885,7 @@ static int	XLogFileRead(XLogSegNo segno, int emode, TimeLineID tli,
 		 int source, bool notfoundOk);
 static int	XLogFileReadAnyTLI(XLogSegNo segno, int emode, int source);
 static int	XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr,
-		 int reqLen, XLogRecPtr targetRecPtr, char *readBuf,
-		 TimeLineID *readTLI);
+		 int reqLen, XLogRecPtr targetRecPtr, char *readBuf);
 static bool WaitForWALToBecomeAvailable(XLogRecPtr RecPtr, bool randAccess,
 		bool fetching_ckpt, XLogRecPtr tliRecPtr);
 static int	emode_for_corrupt_record(int emode, XLogRecPtr RecPtr);
@@ -11523,7 +11522,7 @@ CancelBackup(void)
  */
 static int
 XLogPageRead(XLogReaderState *xlogreader, XLogRecPtr targetPagePtr, int reqLen,
-			 XLogRecPtr targetRecPtr, char *readBuf, TimeLineID *readTLI)
+			 XLogRecPtr targetRecPtr, char *readBuf)
 {
 	XLogPageReadPrivate *private =
 	(XLogPageReadPrivate *) xlogreader->private_data;
@@ -11640,7 +11639,7 @@ retry:
 	Assert(targetPageOff == readOff);
 	Assert(reqLen <= readLen);
 
-	*readTLI = curFileTLI;
+	xlogreader->readPageTLI = curFileTLI;
 
 	/*
 	 * Check the page header immediately, so that we can retry immediately if
diff --git a/src/backend/access/transam/xlogreader.c b/src/backend/access/transam/xlogreader.c
index a66e3324b1..2184f4291d 100644
--- a/src/backend/access/transam/xlogreader.c
+++ b/src/backend/access/transam/xlogreader.c
@@ -559,7 +559,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 
 		readLen = state->read_page(state, targetSegmentPtr, XLOG_BLCKSZ,
    state->currRecPtr,
-   state->readBuf, &state->readPageTLI);
+   state->readBuf);
 		if (readLen < 0)
 			goto err;
 
@@ -577,7 +577,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 	 */
 	readLen = state->read_page(state, pageptr, Max(reqLen, SizeOfXLogShortPHD),
 			   state->currRecPtr,
-			   state->readBuf, &state->readPageTLI);
+			   state->readBuf);
 	if (readLen < 0)
 		goto err;
 
@@ -596,7 +596,7 @@ ReadPageInternal(XLogReaderState *state, XLogRecPtr pageptr, int reqLen)
 	{
 		readLen = state->read_page(state, pageptr, XLogPageHeaderSize(hdr),
    state->currRecPtr,
-   state->readBuf, &state->readPageTLI);
+   state->readBuf);
 		if (readLen < 0)
 			goto err;
 	}
diff --git a/src/backend/access/transam/xlogutils.c b/src/backend/access/transam/xlogutils.c
index 1fc39333f1..680bed8278 100644
--- a/src/backend/access/transam/xlogutils.c
+++ b/src/backend/access/transam/xlogutils.c
@@ -909,12 +909,12 @@ XLogReadDetermineTimeline(XLogReaderState *state, XLogRecPtr wantPage, uint32 wa
  */
 int
 read_local_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr,
-	 int reqLen, XLogRecPtr targetRecPtr, char *cur_page,
-	 TimeLineID *pageTLI)
+	 int reqLen, XLogRecPtr targetRecPtr, char *cur_page)
 {
 	XLogRecPtr	read_upto,
 loc;
 	int			count;
+	TimeLineID	pageTLI;
 
 	loc = targetPagePtr + reqLen;
 
@@ -934,7 +934,7 @@ read_local_xlog_page(XLogReaderState *state, XLogRecPtr targetPagePtr,
 		else
 			read_upto = GetXLogReplayRecPtr();
 
-		*pageTLI = ThisTimeLineID;
+		pageTLI = ThisTimeLineID;
 
 		/*
 		 * Check which timeline to get the record from.
@@ -991,7 +991,7 @@ read_local_xlog_page(XLogReaderState *state, 

Re: refactoring - share str2*int64 functions

2019-09-09 Thread Andres Freund
Hi,

On 2019-09-09 14:28:14 +0900, Michael Paquier wrote:
> On Thu, Sep 05, 2019 at 03:52:48PM +0900, Michael Paquier wrote:
> > Right, there was this part.  This brings also the point of having one
> > interface for the backend as all the error messages for the backend
> > are actually the same, with the most simple name being that:
> > pg_strtoint(value, size, error_ok).

I *VEHEMENTLY* oppose the introduction of any future pseudo-generic
routines taking the type width as a parameter. They're weakening the
type system and they're unnecessarily inefficient.
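To spell out the type-safety point with a contrived sketch (not code from
this thread; the width-taking form is the one proposed above, and the
variable names are made up):

static void
width_param_vs_typed(const char *value)
{
	int16		cursorpos;
	int32		val32;

	/*
	 * With a run-time width parameter nothing ties the requested width to
	 * the destination variable, so a mismatch compiles silently:
	 */
	cursorpos = pg_strtoint(value, 4);	/* 32-bit parse silently narrowed to int16 */

	/*
	 * With the typed interface the destination's type is checked by the
	 * compiler; passing &cursorpos instead of &val32 here would be rejected:
	 */
	(void) pg_strtoint32(value, &val32);

	(void) cursorpos;					/* silence unused-variable warnings */
}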


I don't really buy that saving a few copies of a string is worth that
much. But if you really want to do that, the right approach imo would be
to move the error reporting into a separate function, i.e. something like

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char *type)

that you'd call in small wrappers. Something like

static inline int32
pg_strtoint32_check(const char *s)
{
	int32 result;
	pg_strtoint_status status = pg_strtoint32(s, &result);

	if (unlikely(status != PG_STRTOINT_OK))
		pg_strtoint_error(status, s, "int32");
	return result;
}
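For completeness, the noreturn helper itself might look roughly like this on
the backend side (a sketch only; the status values, error codes and messages
are assumptions, not taken from a posted patch):

void pg_attribute_noreturn()
pg_strtoint_error(pg_strtoint_status status, const char *input, const char *type)
{
	switch (status)
	{
		case PG_STRTOINT_SYNTAX_ERROR:		/* assumed enum member */
			ereport(ERROR,
					(errcode(ERRCODE_INVALID_TEXT_REPRESENTATION),
					 errmsg("invalid input syntax for type %s: \"%s\"",
							type, input)));
			break;
		case PG_STRTOINT_RANGE_ERROR:		/* assumed enum member */
			ereport(ERROR,
					(errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),
					 errmsg("value \"%s\" is out of range for type %s",
							input, type)));
			break;
		default:
			elog(ERROR, "unexpected pg_strtoint_status: %d", (int) status);
	}
}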


> So, it seems to me that if we want to have a consolidation of
> everything, we basically need to have the generic error messages for
> the backend directly into common/string.c where the routines are
> refactored, and I think that we should do the following based on the
> patch attached:

> - Just remove pg_strtoint(), and move the error generation logic in
> common/string.c.

I'm not quite sure what you mean by moving the "error generation logic"?


> - Add an error_ok flag for all the pg_strto[u]int{16|32|64} routines,
> which generates errors only when FRONTEND is not defined.

I think this is a bad idea.


> - Use pg_log_error() for the frontend errors.
>
> If those errors are added everywhere, we would basically have no code
> paths in the frontend or the backend (for the unsigned part only)
> which use them yet.  Another possibility would be to ignore the
> error_ok flag in those cases, but that's inconsistent.

Yea, ignoring it would be a terrible idea.

> diff --git a/src/backend/libpq/pqmq.c b/src/backend/libpq/pqmq.c
> index a9bd47d937..593a5ef54e 100644
> --- a/src/backend/libpq/pqmq.c
> +++ b/src/backend/libpq/pqmq.c
> @@ -286,10 +286,10 @@ pq_parse_errornotice(StringInfo msg, ErrorData *edata)
>   edata->hint = pstrdup(value);
>   break;
>   case PG_DIAG_STATEMENT_POSITION:
> - edata->cursorpos = pg_strtoint32(value);
> + edata->cursorpos = pg_strtoint(value, 4);
>   break;

I'd be really upset if this type of change went in.



>  #include "fmgr.h"
>  #include "miscadmin.h"
> +#include "common/string.h"
>  #include "nodes/extensible.h"
>  #include "nodes/parsenodes.h"
>  #include "nodes/plannodes.h"
> @@ -80,7 +81,7 @@
>  #define READ_UINT64_FIELD(fldname) \
>   token = pg_strtok(&length); /* skip :fldname */ \
>   token = pg_strtok(&length); /* get field value */ \
> - local_node->fldname = pg_strtouint64(token, NULL, 10)
> + (void) pg_strtouint64(token, &local_node->fldname)

Seems like these actually just ought to use the error-checked
variants. And I think it ought to change all of
READ_{INT,UINT,LONG,UINT64,OID}_FIELD, rather than just redirecting one
of them to the new routines.
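Concretely, the readfuncs.c macros could then go through such wrappers, e.g.
(a sketch; pg_strtouint64_check() is an assumed wrapper in the style of
pg_strtoint32_check() above, not an existing function):

#define READ_UINT64_FIELD(fldname) \
	token = pg_strtok(&length);		/* skip :fldname */ \
	token = pg_strtok(&length);		/* get field value */ \
	local_node->fldname = pg_strtouint64_check(token)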


>  static void pcb_error_callback(void *arg);
> @@ -496,7 +496,7 @@ make_const(ParseState *pstate, Value *value, int location)
>
>   case T_Float:
>   /* could be an oversize integer as well as a float ... */
> - if (scanint8(strVal(value), true, &val))
> + if (pg_strtoint64(strVal(value), &val) == PG_STRTOINT_OK)
>   {
>   /*
>* It might actually fit in int32. Probably 
> only INT_MIN can

Not for this change, but we really ought to move away from this crappy
logic. It's really bonkers to have T_Float represent large integers and
floats.



> +/*
> + * pg_strtoint16
> + *
> + * Convert input string to a signed 16-bit integer.  Allows any number of
> + * leading or trailing whitespace characters.
> + *
> + * NB: Accumulate input as a negative number, to deal with two's complement
> + * representation of the most negative number, which can't be represented as a
> + * positive number.
> + *
> + * The function returns immediately if the conversion failed with a status
> + * value to let the caller handle the error.  On success, the result is
> + * stored in "*result".
> + */
> +pg_strtoint_status
> +pg_strtoint16(const char *s, int16 *result)
> +{
> + const char *ptr = s;
> + int16   tmp = 0;
> + boolneg = false;
> +
> + /* skip leading spaces */
> + while (likely(*ptr) && 

patch: psql - enforce constant width of last column

2019-09-09 Thread Pavel Stehule
Hi

While playing with vertical cursor support I got a badly displayed last
column when the border was not 2. Only when border is 2 does psql display
the last column with the same width for each row.

I think we can force column width alignment for any border style today (for
the aligned and wrapped formats), or at minimum this behaviour can be made
optional.

I wrote a patch with a pset option "final_spaces", but I don't see a reason
why we trim the trailing spaces today.

Regards

Pavel
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 7789fc6177..4849086c0f 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2641,6 +2641,15 @@ lo_import 152801
   
   
 
+  
+  final_spaces
+  
+  
+  With value always last column will have same width.
+  
+  
+  
+
   
   footer
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index b981ae81ff..a91bb6ef6e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1980,7 +1980,7 @@ exec_command_pset(PsqlScanState scan_state, bool active_branch)
 			int			i;
 			static const char *const my_list[] = {
 "border", "columns", "csv_fieldsep", "expanded", "fieldsep",
-"fieldsep_zero", "footer", "format", "linestyle", "null",
+"fieldsep_zero", "final_spaces", "footer", "format", "linestyle", "null",
 "numericlocale", "pager", "pager_min_lines",
 "recordsep", "recordsep_zero",
 "tableattr", "title", "tuples_only",
@@ -3746,6 +3746,19 @@ _unicode_linestyle2string(int linestyle)
 	return "unknown";
 }
 
+static const char *
+_final_spaces_style2string(int finalspaces)
+{
+	switch (finalspaces)
+	{
+		case FINAL_SPACES_AUTO:
+			return "auto";
+		case FINAL_SPACES_ALWAYS:
+			return "always";
+	}
+	return "unknown";
+}
+
 /*
  * do_pset
  *
@@ -3820,6 +3833,21 @@ do_pset(const char *param, const char *value, printQueryOpt *popt, bool quiet)
 		}
 	}
 
+	else if (strcmp(param, "final_spaces") == 0)
+	{
+		if (!value)
+			;
+		if (pg_strncasecmp("auto", value, vallen) == 0)
+			popt->topt.final_spaces = FINAL_SPACES_AUTO;
+		else if (pg_strncasecmp("always", value, vallen) == 0)
+			popt->topt.final_spaces = FINAL_SPACES_ALWAYS;
+		else
+		{
+			pg_log_error("\\pset: allowed final space styles are auto or always");
+			return false;
+		}
+	}
+
 	/* set table line style */
 	else if (strcmp(param, "linestyle") == 0)
 	{
@@ -4247,6 +4275,12 @@ printPsetInfo(const char *param, struct printQueryOpt *popt)
 			   _unicode_linestyle2string(popt->topt.unicode_header_linestyle));
 	}
 
+	else if (strcmp(param, "final_spaces") == 0)
+	{
+		printf(_("final spaces style is \"%s\".\n"),
+			   _final_spaces_style2string(popt->topt.final_spaces));
+	}
+
 	else
 	{
 		pg_log_error("\\pset: unknown option: %s", param);
@@ -4357,6 +4391,8 @@ pset_value_string(const char *param, struct printQueryOpt *popt)
 		return pstrdup(_unicode_linestyle2string(popt->topt.unicode_column_linestyle));
 	else if (strcmp(param, "unicode_header_linestyle") == 0)
 		return pstrdup(_unicode_linestyle2string(popt->topt.unicode_header_linestyle));
+	else if (strcmp(param, "final_spaces") == 0)
+		return pstrdup(_final_spaces_style2string(popt->topt.final_spaces));
 	else
 		return pstrdup("ERROR");
 }
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index bcc7404c55..3162f77d68 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -3707,7 +3707,7 @@ psql_completion(const char *text, int start, int end)
 		COMPLETE_WITH_QUERY(Query_for_list_of_roles);
 	else if (TailMatchesCS("\\pset"))
 		COMPLETE_WITH_CS("border", "columns", "csv_fieldsep", "expanded",
-		 "fieldsep", "fieldsep_zero", "footer", "format",
+		 "fieldsep", "final_spaces", "fieldsep_zero", "footer", "format",
 		 "linestyle", "null", "numericlocale",
 		 "pager", "pager_min_lines",
 		 "recordsep", "recordsep_zero",
@@ -3717,6 +3717,8 @@ psql_completion(const char *text, int start, int end)
 		 "unicode_header_linestyle");
 	else if (TailMatchesCS("\\pset", MatchAny))
 	{
+		if (TailMatchesCS("final_spaces"))
+			COMPLETE_WITH_CS("auto", "always");
 		if (TailMatchesCS("format"))
 			COMPLETE_WITH_CS("aligned", "asciidoc", "csv", "html", "latex",
 			 "latex-longtable", "troff-ms", "unaligned",
diff --git a/src/fe_utils/print.c b/src/fe_utils/print.c
index e41f42ea98..77b9da357d 100644
--- a/src/fe_utils/print.c
+++ b/src/fe_utils/print.c
@@ -586,6 +586,7 @@ print_aligned_text(const printTableContent *cont, FILE *fout, bool is_pager)
 	unsigned short opt_border = cont->opt->border;
 	const printTextFormat *format = get_line_style(cont->opt);
 	const printTextLineFormat *dformat = &format->lrule[PRINT_RULE_DATA];
+	final_spaces_style opt_final_spaces = cont->opt->final_spaces;
 
 	unsigned int col_count = 0,
 cell_count = 0;
@@ -1005,8 +1006,14 @@ print_aligned_text(const printTableContent *cont, FILE *fout, 

Re: BUG #15977: Inconsistent behavior in chained transactions

2019-09-09 Thread fn ln
It looks good now. Thank you again.

On Mon, 9 Sep 2019 at 17:43, Peter Eisentraut wrote:

> On 2019-09-09 05:58, fn ln wrote:
> > Confirmed. Thank you all for your help.
> >
> > The only concern is that this test:
> >
> >SET TRANSACTION READ WRITE\; COMMIT AND CHAIN;  -- error
> >SHOW transaction_read_only;
> >
> >SET TRANSACTION READ WRITE\; ROLLBACK AND CHAIN;  -- error
> >SHOW transaction_read_only;
> >
> > makes more sense with READ ONLY because default_transaction_read_only is
> > off at this point.
>
> Oh you're right.  Fixed.
>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: msys2 vs pg_upgrade/test.sh

2019-09-09 Thread Peter Eisentraut
On 2019-09-09 00:06, Andrew Dunstan wrote:
> Diagnosing this took quite a lot of time and detective work. For some
> reason I don't quite understand, when calling the Windows command
> processor in a modern msys2/WindowsServer2019 installation, you need to
> double the slash, thus:
> 
>     cmd //c foo.bat
> 
> Some Internet postings at least seem to suggest this is by design. (FSVO
> "design")
> 
> I tested this on older versions and the change appears to work, so I
> propose to apply the attached patch.

If we're worried about messing things up for non-msys2 environments, we
could also set MSYS2_ARG_CONV_EXCL instead; see
.
According to that page, that would seem to be the more proper way to do it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: BUG #15977: Inconsistent behavior in chained transactions

2019-09-09 Thread Peter Eisentraut
On 2019-09-09 05:58, fn ln wrote:
> Confirmed. Thank you all for your help.
> 
> The only concern is that this test:
> 
>    SET TRANSACTION READ WRITE\; COMMIT AND CHAIN;  -- error
>    SHOW transaction_read_only;
> 
>    SET TRANSACTION READ WRITE\; ROLLBACK AND CHAIN;  -- error
>    SHOW transaction_read_only;
> 
> makes more sense with READ ONLY because default_transaction_read_only is
> off at this point.

Oh you're right.  Fixed.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] CLUSTER command progress monitor

2019-09-09 Thread Michael Paquier
On Fri, Sep 06, 2019 at 10:27:02AM -0400, Alvaro Herrera from 2ndQuadrant wrote:
> That said, I did spend some time on this type of issue when doing CREATE
> INDEX support; you can tell because I defined the columns for block
> numbers in a scan separately from CREATE INDEX specific fields,
> precisely to avoid multiple commands running concurrently from
> clobbering unrelated columns:
> 
> /* Block numbers in a generic relation scan */
> #define PROGRESS_SCAN_BLOCKS_TOTAL15
> #define PROGRESS_SCAN_BLOCKS_DONE 16

Hm.  It is not really clear what the intention is by looking at the
contents of progress.h.

> I would say that it's fairly useful to have CLUSTER report progress on
> indexes being created underneath, but I understand that it might be too
> late to be designing the CLUSTER report to take advantage of the CREATE
> INDEX metrics.

The same can be said about the reporting done in reindex_relation for
PROGRESS_CLUSTER_INDEX_REBUILD_COUNT.  I think that it should be
removed for now.

> I think a workable, not terribly invasive approach is to have REINDEX
> process its commands conditionally: have the caller indicate whether
> progress is to be reported, and skip the calls if not.  That would
> (should) prevent it from clobbering the state set up by CLUSTER.

So, you would basically add an extra flag in the options of
reindex_index() to decide if a progress report should be started or
not?  I am not a fan of that, because it does not address the root
issue, which is that the progress reporting is started at too low a
level.  I think that it would actually be less error-prone to move
the start of the progress reporting for REINDEX out of reindex_index()
and start it at a higher level.  Looking again at the code, I would
recommend that we should start the progress in ReindexIndex() before
calling reindex_index(), ReindexMultipleTables() before calling
reindex_relation() and ReindexRelationConcurrently(), and
ReindexTable() before calling reindex_relation().  That will keep each
command from clobbering the others' in-progress reports.
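In other words, something along these lines (a rough sketch of the suggested
placement only, not an actual patch; locking details and the concurrent path
are omitted):

void
ReindexIndex(RangeVar *indexRelation, int options, bool concurrent)
{
	Oid			indOid;

	indOid = RangeVarGetRelidExtended(indexRelation, AccessExclusiveLock,
									  0, NULL, NULL);

	/* start the report at the command level, not inside reindex_index() */
	pgstat_progress_start_command(PROGRESS_COMMAND_CREATE_INDEX,
								  IndexGetRelation(indOid, false));
	reindex_index(indOid, false, get_rel_persistence(indOid), options);
	pgstat_progress_end_command();
}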

It would also be very good to document clearly how overlaps between the
progress parameter values are avoided.  For example, for CREATE INDEX,
we don't know why 1, 2 and 7 are not used.

Note that there is an ID collision between PROGRESS_CREATEIDX_INDEX_OID,
updated in reindex_index(), and the CLUSTER entry
PROGRESS_CLUSTER_HEAP_BLKS_SCANNED.  There could be an argument for using
a completely different range of IDs for each command phase to avoid any
overlap, but it is scary to consider that we may not have found all the
issues with one command clobbering another one's state.
--
Michael




Re: [HACKERS] PATCH: Batch/pipelining support for libpq

2019-09-09 Thread Amit Kapila
On Fri, Aug 30, 2019 at 7:06 PM Nikhil Sontakke  wrote:
>
> Hi,
>
> > > This patch has been around for some time now, the last version fails to
> > > apply cleanly and in-depth reviews have happened.  I am moving that to
> > > the next CF, waiting on its author.
> >
> > Unfortunately, nothing was changed since then, so there is already some 
> > amount
> > of unaddressed review feedback. I'll move this patch to "Returned with
> > feedback".
> >
>
> Craig Ringer mentioned about this thread to me recently.
>
> This effort has seen decent reviews from Craig, Andres and Michael
> already. So, I thought of refreshing it to work against latest master
> HEAD.
>

Thanks for picking this up.  However, I noticed that Horiguchi-san
previously gave some comments on this patch [1] which don't seem to be
addressed, or at least not all of them.  It is possible that you have
already addressed those, but in that case it would be good if you
responded to his email as well.  If they are not addressed, then it would
be good to address them.


[1] - 
https://www.postgresql.org/message-id/20180322.211148.187821341.horiguchi.kyotaro%40lab.ntt.co.jp

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




RE: [PATCH] Speedup truncates of relation forks

2019-09-09 Thread Jamison, Kirk
On Friday, September 6, 2019 11:51 PM (GMT+9), Alvaro Herrera wrote:

Hi Alvaro,
Thank you very much for the review!

> On 2019-Sep-05, Jamison, Kirk wrote:
> 
> > I also mentioned it from my first post if we can just remove this dead code.
> > If not, it would require to modify the function because it would also
> > need nforks as input argument when calling DropRelFileNodeBuffers. I
> > kept my changes in the latest patch.
> > So should I remove the function now or keep my changes?
> 
> Please add a preliminary patch that removes the function.  Dead code is good,
> as long as it is gone.  We can get it pushed ahead of the rest of this.

Alright. I've attached a separate patch removing the smgrdounlinkfork.


> What does it mean to "mark" a dirty page in FSM?  We don't have the concept
> of marking pages as far as I know (and I don't see that the patch does any
> sort of mark).  Do you mean to find where it is and return its block number?

Yes. "Mark" is probably not a proper way to describe it, so I temporarily
changed it to "locate" and renamed the function to FreeSpaceMapLocateBlock().
If anyone could suggest a more appropriate name, that'd be appreciated.


> If so, I wonder how this handles concurrent table extension: are we keeping
> some sort of lock that prevents it?
> (... or would we lose any newly added pages that receive tuples while this
> truncation is ongoing?)

I moved the description about acquiring AccessExclusiveLock from
FreeSpaceMapLocateBlock() and visibilitymap_truncate_prepare() to the
smgrtruncate() description instead.
IIUC, in lazy_truncate_heap() we still acquire AccessExclusiveLock on the
relation before calling RelationTruncate(), which then calls smgrtruncate().
While holding the exclusive lock, the following are also called to check
that the relation has not been extended and to verify that the end pages
contain no tuples added while we were vacuuming (with a non-exclusive
lock):
  new_rel_pages = RelationGetNumberOfBlocks(onerel);
  new_rel_pages = count_nondeletable_pages(onerel, vacrelstats);
I assume that the above would update the correct number of pages.
We then release the exclusive lock as soon as we have truncated the pages.
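For reference, the sequence in lazy_truncate_heap() is roughly the following
(a condensed sketch, not a verbatim excerpt; retry and error handling are
omitted):

	if (!ConditionalLockRelation(onerel, AccessExclusiveLock))
		return;					/* give up rather than block other backends */

	/* re-check: did the relation grow while we were waiting for the lock? */
	new_rel_pages = RelationGetNumberOfBlocks(onerel);
	if (new_rel_pages != vacrelstats->old_rel_pages)
	{
		UnlockRelation(onerel, AccessExclusiveLock);
		return;
	}

	/* verify the tail pages are still empty, then truncate and release */
	new_rel_pages = count_nondeletable_pages(onerel, vacrelstats);
	RelationTruncate(onerel, new_rel_pages);	/* ends up in smgrtruncate() */
	UnlockRelation(onerel, AccessExclusiveLock);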


> I think the new API of smgrtruncate() is fairly confusing.  Would it be better
> to define a new struct { ForkNum forknum; BlockNumber blkno; } and pass an
> array of those?

This is for readability, right?  However, I think there's no need to define
a new structure for it, so I kept my changes:
  smgrtruncate(SMgrRelation reln, ForkNumber *forknum, int nforks, BlockNumber *nblocks)
I also declared *forkNum and nforks next to each other, as suggested by
Sawada-san.
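For what it's worth, a caller of that signature might look roughly like this
(a sketch only; the nblocks_* variables are assumed to have been computed
beforehand, e.g. via FreeSpaceMapLocateBlock() and
visibilitymap_truncate_prepare()):

	ForkNumber	forks[MAX_FORKNUM];
	BlockNumber	blocks[MAX_FORKNUM];
	int			nforks = 0;

	/* the main fork is always truncated */
	forks[nforks] = MAIN_FORKNUM;
	blocks[nforks++] = nblocks_main;

	if (smgrexists(reln, FSM_FORKNUM))
	{
		forks[nforks] = FSM_FORKNUM;
		blocks[nforks++] = nblocks_fsm;
	}

	if (smgrexists(reln, VISIBILITYMAP_FORKNUM))
	{
		forks[nforks] = VISIBILITYMAP_FORKNUM;
		blocks[nforks++] = nblocks_vm;
	}

	/* one call truncates all the forks and invalidates buffers in one pass */
	smgrtruncate(reln, forks, nforks, blocks);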


What do you think about these changes?


Regards,
Kirk Jamison


v1-0001-Remove-deadcode-smgrdounlinkfork.patch
Description: v1-0001-Remove-deadcode-smgrdounlinkfork.patch


v7-0001-Speedup-truncates-of-relation-forks.patch
Description: v7-0001-Speedup-truncates-of-relation-forks.patch