Re: Minimal logical decoding on standbys

2019-06-09 Thread Amit Khandekar
On Tue, 4 Jun 2019 at 21:28, Andres Freund  wrote:
>
> Hi,
>
> On 2019-06-04 15:51:01 +0530, Amit Khandekar wrote:
> > After giving more thought on this, I think it might make sense to
> > arrange for the xl_running_xact record to be sent from master to the
> > standby, when a logical slot is to be created on standby. How about
> > standby sending a new message type to the master, requesting for
> > xl_running_xact record ? Then on master, ProcessStandbyMessage() will
> > process this new message type and call LogStandbySnapshot().
>
> I think that should be a secondary feature. You don't necessarily know
> the upstream master, as the setup could be cascading one.
Oh yeah, cascading setup makes it more complicated.

> I think for
> now just having to wait, perhaps with a comment to manually start a
> checkpoint, ought to suffice?

Ok.

Since this requires the test to handle the
fire-create-slot-and-then-fire-checkpoint-from-master actions, I was
modifying the test file to do this (the master-side part of the change is
sketched below). After doing that, I found that the slave gets an
assertion failure in XLogReadRecord()=>XRecOffIsValid(). This happens only
when the restart_lsn is set to ReplayRecPtr. Somehow, this does not happen
when I manually create the logical slot; it happens only while running the
testcase. Working on it ...
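
For reference, the master-side part of that change is roughly this (a
sketch only; $node_master is just the test's variable for the master node):

    # A manual checkpoint on the master logs a fresh xl_running_xacts
    # record, which the standby needs before the logical slot creation
    # can finish building its initial snapshot.
    $node_master->safe_psql('postgres', 'CHECKPOINT;');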



-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company




Fix testing on msys when builddir is under /c mount point

2019-06-09 Thread Noah Misch
Several TAP test suites have a need to translate from an msys path to a
Windows path.  They currently use two ways to do that:

1. TestLib::real_dir, new in v11, is sound but works for directories only.
2. The $vfs_path approach is semi-private to PostgresNode.pm and 017_shm.pl,
   and it does not work if the file falls in a mount point other than "/".
   For example, it has been doing the wrong thing when builddir is
   /c/nm/postgresql (falling in mount point /c).

I'd like to fix the mount point problem and consolidate these two methods.  I
plan to call it TestLib::perl2host, since it translates a path in Perl's
notion of the filesystem to a path in the @host@ notion of the filesystem.
Attached.
diff --git a/src/bin/pg_checksums/t/002_actions.pl b/src/bin/pg_checksums/t/002_actions.pl
index ce24d06..59228b9 100644
--- a/src/bin/pg_checksums/t/002_actions.pl
+++ b/src/bin/pg_checksums/t/002_actions.pl
@@ -183,7 +183,7 @@ check_relation_corruption($node, 'corrupt1', 'pg_default');
 my $basedir= $node->basedir;
 my $tablespace_dir = "$basedir/ts_corrupt_dir";
 mkdir($tablespace_dir);
-$tablespace_dir = TestLib::real_dir($tablespace_dir);
+$tablespace_dir = TestLib::perl2host($tablespace_dir);
 $node->safe_psql('postgres',
"CREATE TABLESPACE ts_corrupt LOCATION '$tablespace_dir';");
 check_relation_corruption($node, 'corrupt2', 'ts_corrupt');
diff --git a/src/test/perl/PostgresNode.pm b/src/test/perl/PostgresNode.pm
index 8d5ad6b..6019f37 100644
--- a/src/test/perl/PostgresNode.pm
+++ b/src/test/perl/PostgresNode.pm
@@ -107,15 +107,6 @@ our @EXPORT = qw(
 our ($use_tcp, $test_localhost, $test_pghost, $last_host_assigned,
$last_port_assigned, @all_nodes, $died);
 
-# Windows path to virtual file system root
-
-our $vfs_path = '';
-if ($Config{osname} eq 'msys')
-{
-   $vfs_path = `cd / && pwd -W`;
-   chomp $vfs_path;
-}
-
 INIT
 {
 
@@ -945,7 +936,7 @@ primary_conninfo='$root_connstr'
 sub enable_restoring
 {
my ($self, $root_node) = @_;
-   my $path = $vfs_path . $root_node->archive_dir;
+   my $path = TestLib::perl2host($root_node->archive_dir);
my $name = $self->name;
 
print "### Enabling WAL restore for node \"$name\"\n";
@@ -990,7 +981,7 @@ sub set_standby_mode
 sub enable_archiving
 {
my ($self) = @_;
-   my $path   = $vfs_path . $self->archive_dir;
+   my $path   = TestLib::perl2host($self->archive_dir);
my $name   = $self->name;
 
print "### Enabling WAL archiving for node \"$name\"\n";
diff --git a/src/test/perl/TestLib.pm b/src/test/perl/TestLib.pm
index f8eb705..d2a9828 100644
--- a/src/test/perl/TestLib.pm
+++ b/src/test/perl/TestLib.pm
@@ -166,22 +166,31 @@ sub tempdir_short
return File::Temp::tempdir(CLEANUP => 1);
 }
 
-# Return the real directory for a virtual path directory under msys.
-# The directory  must exist. If it's not an existing directory or we're
-# not under msys, return the input argument unchanged.
-sub real_dir
+# Translate a Perl file name to a host file name.  Currently, this is a no-op
+# except for the case of Perl=msys and host=mingw32.  The subject need not
+# exist, but its parent directory must exist.
+sub perl2host
 {
-   my $dir = "$_[0]";
-   return $dir unless -d $dir;
-   return $dir unless $Config{osname} eq 'msys';
+   my ($subject) = @_;
+   return $subject unless $Config{osname} eq 'msys';
my $here = cwd;
-   chdir $dir;
+   my $leaf;
+   if (chdir $subject)
+   {
+   $leaf = '';
+   }
+   else
+   {
+   $leaf = '/' . basename $subject;
+   my $parent = dirname $subject;
+   chdir $parent or die "could not chdir \"$parent\": $!";
+   }
 
# this odd way of calling 'pwd -W' is the only way that seems to work.
-   $dir = qx{sh -c "pwd -W"};
+   my $dir = qx{sh -c "pwd -W"};
chomp $dir;
chdir $here;
-   return $dir;
+   return $dir . $leaf;
 }
 
 sub system_log
diff --git a/src/test/recovery/t/014_unlogged_reinit.pl b/src/test/recovery/t/014_unlogged_reinit.pl
index 103c0a2..ee05e1a 100644
--- a/src/test/recovery/t/014_unlogged_reinit.pl
+++ b/src/test/recovery/t/014_unlogged_reinit.pl
@@ -30,7 +30,7 @@ ok(-f "$pgdata/$baseUnloggedPath",'main fork in base exists');
 
 my $tablespaceDir = TestLib::tempdir;
 
-my $realTSDir = TestLib::real_dir($tablespaceDir);
+my $realTSDir = TestLib::perl2host($tablespaceDir);
 
 $node->safe_psql('postgres', "CREATE TABLESPACE ts1 LOCATION '$realTSDir'");
 $node->safe_psql('postgres',
diff --git a/src/test/recovery/t/017_shm.pl b/src/test/recovery/t/017_shm.pl
index f16821d..7f10ff5 100644
--- a/src/test/recovery/t/017_shm.pl
+++ b/src/test/recovery/t/017_shm.pl
@@ -12,14 +12,6 @@ use Time::HiRes qw(usleep);
 
 plan tests => 5;
 
-# See PostgresNode
-my $vfs_path = '';
-if ($Config{osname} eq 'msys')
-{
-   $vfs_path = `cd / && pwd -W`;
-   chomp $vfs_path;
-}
-

pg_upgrade: prep_status doesn't translate messages

2019-06-09 Thread Kyotaro Horiguchi
Hello.

In pg_upgrade, the prep status is shown in English even if LANG is
set to another language.

$ LANG=ja_JP.UTF8 pg_upgrade ...
<"Performing Consistency Checks on Old Live Server" in Japanese>
--
Checking cluster versions   ok
Checking database user is the install user  ok
Checking database connection settings   ok
Checking for prepared transactions  ok
...
<"*Clusters are compatible*" in Japanese>


prep_status is marked in GETTEXT_TRIGGERS but its output actually doesn't
get translated. I suppose the reason is that we don't have a general and
portable means of aligning message strings that contain non-ascii
characters.

I'd like to propose appending " ... " instead of aligning the messages.

Checking cluster versions ... ok
Checking database user is the install user ... ok
Checking database connection settings ... ok
Checking for prepared transactions ... ok
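
The change itself is small; roughly like this (quoting pg_upgrade's util.c
from memory here, so details may differ; the attached patch is the
authoritative version):

-       pg_log(PG_REPORT, "%-*s", MESSAGE_WIDTH, message);
+       pg_log(PG_REPORT, "%s ... ", message);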

If we don't do that, the translation lines in the po files are useless;
prep_status should be removed from GETTEXT_TRIGGERS, and a comment added
that explains the reason for not translating.

Any opinions?

regards.


-- 
Kyotaro Horiguchi
NTT Open Source Software Center


0001-Change-format-of-prep-status-of-pg_upgrade.patch
Description: Binary data


Re: [PATCH] Fix potential memoryleak in guc.c

2019-06-09 Thread Tom Lane
"Zhang, Jie"  writes:
> In src\backend\utils\misc\guc.c, I found a potential memory leak.
> make_absolute_path() returns a malloc'd copy; we should free that memory
> before the function returns false.

If SelectConfigFiles were executed more than once per postmaster
launch, this might be worth adding code for ... but as-is, I'm
dubious.  There are a few tens of KB of other one-time leaks
that we don't worry about removing.

Even more to the point, the particular code path you're complaining
about is a failure exit that will lead to immediate process
termination, so there really is no point in adding code there.

regards, tom lane




doc: clarify "pg_signal_backend" default role

2019-06-09 Thread Ian Barwick

Hi

Currently the documentation for the default role "pg_signal_backend" states,
somewhat ambiguously, "Send signals to other backends (eg: cancel query, 
terminate)",
giving the impression other signals (e.g. SIGHUP) can be sent too, which is
currently not the case.

Attached patch clarifies this, adds a descriptive paragraph (similar to what
the other default roles have) and a link to the "Server Signaling Functions"
section.
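
For illustration, the behaviour being described amounts to this (role name
and pid are of course made up):

    GRANT pg_signal_backend TO app_admin;
    -- app_admin can now cancel or terminate other non-superuser backends:
    SELECT pg_cancel_backend(12345);
    SELECT pg_terminate_backend(12345);
    -- ...but cannot send any other signal (e.g. SIGHUP).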

Patch applies cleanly to HEAD and REL_11_STABLE.


Regards

Ian Barwick


--
 Ian Barwick   https://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/doc/src/sgml/user-manag.sgml b/doc/src/sgml/user-manag.sgml
new file mode 100644
index 6106244..76b6ddb
*** a/doc/src/sgml/user-manag.sgml
--- b/doc/src/sgml/user-manag.sgml
*** DROP ROLE doomed_role;
*** 532,538 


 pg_signal_backend
!Send signals to other backends (eg: cancel query, terminate).


 pg_read_server_files
--- 532,538 


 pg_signal_backend
!Signal another backend to cancel a query or terminate the backend.


 pg_read_server_files
*** DROP ROLE doomed_role;
*** 561,566 
--- 561,573 
 
  

+   The pg_signal_backend role is intended to allow administrators to enable
+   trusted, but non-superuser, roles to send signals to other backends. Currently this role
+   enables sending of signals for canceling a query on another backend or terminating the
+   backend. A user granted this role cannot however send signals to a backend owned by a superuser.
+   See also .
+   
+   
The pg_read_server_files, pg_write_server_files and
pg_execute_server_program roles are intended to allow administrators to have
trusted, but non-superuser, roles which are able to access files and run programs on the


[PATCH] Fix potential memoryleak in guc.c

2019-06-09 Thread Zhang, Jie
Hi all

In src\backend\utils\misc\guc.c, I found a potential memory leak.

make_absolute_path() returns a malloc'd copy; we should free that memory
before the function returns false.

SelectConfigFiles(const char *userDoption, const char *progname)
{
..
/* configdir is -D option, or $PGDATA if no -D */
if (userDoption)
configdir = make_absolute_path(userDoption);  ★
else
configdir = make_absolute_path(getenv("PGDATA")); ★

if (configdir && stat(configdir, &stat_buf) != 0)
{
write_stderr("%s: could not access directory \"%s\": %s\n",
 progname,
 configdir,
 strerror(errno));
if (errno == ENOENT)
write_stderr("Run initdb or pg_basebackup to initialize 
a PostgreSQL data directory.\n");
★// Need to free memory of configdir
return false;
}
..
---
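
The fix is simply to release that copy on the error path, something like
this (the attached guc.patch is the authoritative version):

        if (errno == ENOENT)
            write_stderr("Run initdb or pg_basebackup to initialize "
                         "a PostgreSQL data directory.\n");
        free(configdir);    /* release the malloc'd copy from make_absolute_path() */
        return false;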

Refer to the following files for the implementation of make_absolute_path().

file:  src\port\path.c
/*
 * make_absolute_path
 *
 * If the given pathname isn't already absolute, make it so, interpreting
 * it relative to the current working directory.
 *
 * Also canonicalizes the path.  The result is always a malloc'd copy.







guc.patch
Description: guc.patch


[PATCH] memory leak in ecpglib

2019-06-09 Thread Zhang, Jie
Hi all

A memory leak occurs when ecpg_update_declare_statement() is called a
second time.

FILE:postgresql\src\interfaces\ecpg\ecpglib\prepare.c
void
ecpg_update_declare_statement(const char *declared_name, const char 
*cursor_name, const int lineno)
{
struct declared_statement *p = NULL;

if (!declared_name || !cursor_name)
return;

/* Find the declared node by declared name */
p = ecpg_find_declared_statement(declared_name);
if (p)
p->cursor_name = ecpg_strdup(cursor_name, lineno);  ★
}
ecpg_strdup() returns a pointer to a null-terminated byte string which is a
duplicate of the string pointed to by its argument. The memory is obtained
dynamically using malloc(), and hence can be freed using free().

When ecpg_update_declare_statement() is called for the second time, the
memory previously allocated for p->cursor_name is not freed.

For example:

EXEC SQL BEGIN DECLARE SECTION;
char *selectString = "SELECT * FROM foo;";
int FooBar;
char DooDad[17];
EXEC SQL END DECLARE SECTION;

EXEC SQL CONNECT TO postgres@localhost:5432 AS con1 USER postgres;

EXEC SQL AT con1 DECLARE stmt_1 STATEMENT;
EXEC SQL AT con1 PREPARE stmt_1 FROM :selectString;

EXEC SQL AT con1 DECLARE cur_1 CURSOR FOR stmt_1; //★1 ECPGopen() --> 
ecpg_update_declare_statement()
EXEC SQL AT con1 OPEN cur_1;   

EXEC SQL AT con1 DECLARE cur_2 CURSOR FOR stmt_1; //★2 ECPGopen() --> 
ecpg_update_declare_statement()
EXEC SQL AT con1 OPEN cur_2;
  <-- Memory leak occurs here

EXEC SQL FETCH cur_2 INTO:FooBar, :DooDad;
EXEC SQL COMMIT;
EXEC SQL DISCONNECT ALL;


We should free p->cursor_name before p->cursor_name = ecpg_strdup(cursor_name, 
lineno).
#
if(p->cursor_name)
ecpg_free(p->cursor_name);
p->cursor_name = ecpg_strdup(cursor_name,lineno);
###
Here is a patch.

Best Regards!





ecpglib.patch
Description: ecpglib.patch


Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-06-09 Thread Tom Lane
Andres Freund  writes:
> On June 9, 2019 8:36:37 AM PDT, Tom Lane  wrote:
>> I think you are mistaken that doing transactional updates in pg_index
>> is OK.  If memory serves, we rely on xmin of the pg_index row for
>> purposes such as detecting whether a concurrently-created index is safe
>> to use yet.

> We could replace that with storing a 64-bit xid in a normal column nowadays.

Perhaps, but that's a nontrivial change that'd be prerequisite to
doing what's suggested in this thread.

regards, tom lane




Re: Custom table AMs need to include heapam.h because of BulkInsertState

2019-06-09 Thread David Rowley
On Sat, 8 Jun 2019 at 04:51, Andres Freund  wrote:
>
> On 2019-06-07 09:48:29 -0400, Robert Haas wrote:
> > However, it looks to me as though copy.c can create a bunch of
> > BulkInsertStates but only call finish_bulk_insert() once, so unless
> > that's a bug in need of fixing I don't quite see how to make that
> > approach work.
>
> That is a bug. Not a currently "active" one with in-core AMs (no
> dangerous bulk insert flags ever get set for partitioned tables), but we
> obviously need to fix it nevertheless.
>
> Robert, seems we'll have to - regardless of where we come down on fixing
> this bug - have to make copy use multiple BulkInsertState's, even in the
> CIM_SINGLE (with proute) case. Or do you have a better idea?
>
> David, any opinions on how to best fix this? It's not extremely obvious
> how to do so best in the current setup of the partition actually being
> hidden somewhere a few calls away (i.e. the table_open happening in
> ExecInitPartitionInfo()).

That's been overlooked. I agree it's not a bug with heap, since
heapam_finish_bulk_insert() only does anything there when we're
skipping WAL, which we don't do in copy.c for partitioned tables.
However, who knows what other AMs will need, so we'd better fix that.

My proposed patch is attached.

I ended up moving the call to CopyMultiInsertInfoCleanup() down to
after we call table_finish_bulk_insert for the main table. This might
not be required, but I lack the imagination right now to know what AMs
might put in the finish_bulk_insert call, so doing this seems safer.
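
Roughly, the resulting order of operations is (a sketch from memory of
copy.c; the attached patch is the authoritative version):

    /* at the end of CopyFrom() */
    table_finish_bulk_insert(cstate->rel, ti_options);   /* the main table */

    /* the buffer cleanup now happens after the call above */
    CopyMultiInsertInfoCleanup(&multiInsertInfo);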

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


fix_copy_table_finish_build_insert.patch
Description: Binary data


Re: Temp table handling after anti-wraparound shutdown (Was: BUG #15840)

2019-06-09 Thread Andres Freund
Hi,

(on postgres lists, please do not top-quote).

On 2019-06-08 04:06:39 -0500, Thierry Husson wrote:
> In fact, I suppose all temporary tables and their content could be
> completly ignored by MVCC principles as they are not subject to
> concurrency being unmodifiable/unreadable by other connections.

That'd cause corruption, because vacuum would then remove resources that
the temp table might rely on (commit log, multixacts, ...).


> The separate situation, as noted by Michael, could be done at connection
> time, when PG gives a temporay schema to it. When it create a pg_temp_XXX
> schema, it could make sure it's completely empty and otherwise remove
> everything in it.

That already happens, but unfortunately too late. IIRC we only do
so once the first temp table in a session is created.


> I already had a DB corruption because system tables weren't in sync
> about these tables/schemas after a badly closed connection, so it was
> impossible to make a drop table on them. So it could be even safer to
> clear everything directly from system tables instead of calling drop
> table for each leftover temp table.

Hm, I'd like to know more about that corruption. Did you report it when
it occurred?

Greetings,

Andres Freund




GiST limits on contrib/cube with dimension > 100?

2019-06-09 Thread Siarhei Siniak
I've been using the cube extension recompiled with #define MAX_DIM 256.
But with version 11.3 I'm getting the following error:
  failed to add item to index page in ...
There's a regression unit test in contrib/cube/expected/cube.out:
CREATE TABLE test_cube (c cube);
\copy test_cube from 'data/test_cube.data'
CREATE INDEX test_cube_ix ON test_cube USING gist (c);
SELECT * FROM test_cube WHERE c && '(3000,1000),(0,0)' ORDER BY c;
I've created a gist index in the same way, i.e. create index ... on ...
using gist(...);
If MAX_DIM equals 512, a btree index complains:
  index row size 4112 exceeds maximum 2712 for index ...
  HINT:  Values larger than 1/3 of a buffer page cannot be indexed.
  Consider a function index of an MD5 hash of the value, or use full text
  indexing.

That's why 256 has been set.
But gist doesn't provide an explanation for its error.
These are the places where the message might have been generated:

src/backend/access/gist/gist.c:418:
    elog(ERROR, "failed to add item to index page in \"%s\"",
         RelationGetRelationName(rel));
src/backend/access/gist/gist.c:540:
    elog(ERROR, "failed to add item to index page in \"%s\"",

The question is: what restrains one from setting MAX_DIM bigger than 100 in
a custom recompiled cube extension? In practice the error messages are too
cryptic.
contrib/cube/cube.c has the following methods regarding GiST:

/*
** GiST support methods
*/

PG_FUNCTION_INFO_V1(g_cube_consistent);
PG_FUNCTION_INFO_V1(g_cube_compress);
PG_FUNCTION_INFO_V1(g_cube_decompress);
PG_FUNCTION_INFO_V1(g_cube_penalty);
PG_FUNCTION_INFO_V1(g_cube_picksplit);
PG_FUNCTION_INFO_V1(g_cube_union);
PG_FUNCTION_INFO_V1(g_cube_same);
PG_FUNCTION_INFO_V1(g_cube_distance);

g_cube_compress has the following body:

    PG_RETURN_DATUM(PG_GETARG_DATUM(0));

Does it just return a void pointer to the underlying x array?
cube data structure:
typedef struct NDBOX
{
    /* varlena header (do not touch directly!) */
    int32        vl_len_;

    /*--
     * Header contains info about NDBOX. For binary compatibility with old
     * versions, it is defined as "unsigned int".
     *
     * Following information is stored:
     *
     *    bits 0-7  : number of cube dimensions;
     *    bits 8-30 : unused, initialize to zero;
     *    bit  31   : point flag. If set, the upper right coordinates are not
     *                stored, and are implicitly the same as the lower left
     *                coordinates.
     *--
     */
    unsigned int header;

    /*
     * The lower left coordinates for each dimension come first, followed by
     * upper right coordinates unless the point flag is set.
     */
    double        x[FLEXIBLE_ARRAY_MEMBER];
} NDBOX;

Can it be a problem of not fitting into some limit when building or
updating a gist index for cubes with MAX_DIM > 100?
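
For what it's worth, some back-of-envelope arithmetic of my own from the
struct above (so take it with a grain of salt): a non-point cube of
dimension D takes 8 bytes of header (vl_len_ + header) plus 2*D*8 bytes of
coordinates, and a point cube 8 + D*8 bytes. That is roughly 1.6 kB at
D=100, 4.1 kB at D=256, and 8.2 kB for a non-point D=512 cube. The 4112
bytes in the btree error above matches a 4104-byte cube value plus the
8-byte index tuple header. My guess is that even when the leaf values fit,
the union cubes GiST stores on internal pages are non-point and therefore
up to twice as large, so beyond a few hundred dimensions a single index
tuple may no longer fit on an 8 kB page at all, which would explain
"failed to add item to index page". But that is exactly the kind of thing
a clearer error message would confirm.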


Re: Bloom Indexes - bit array length and the total number of bits (or hash functions ?? ) !

2019-06-09 Thread Avinash Kumar
Thanks Fabien,

But the 2 direct questions i have are :

1. What is the structure of the Bloom index? Can you please let me know
what the fields of a Bloom index are? Is it just the Item Pointer
and BloomSignatureWord?
When I describe my bloom index it looks like the following.

postgres=# \d+ foo.idx_bloom_bar
   Index "foo.idx_bloom_bar"
 Column  |  Type   | Key? | Definition | Storage | Stats target
-+-+--++-+--
 id  | integer | yes  | id | plain   |
 dept| integer | yes  | dept   | plain   |
 id2 | integer | yes  | id2| plain   |
 id3 | integer | yes  | id3| plain   |
 id4 | integer | yes  | id4| plain   |
 id5 | integer | yes  | id5| plain   |
 id6 | integer | yes  | id6| plain   |
 zipcode | integer | yes  | zipcode| plain   |
bloom, for table "foo.bar"
Options: length=64, col1=4, col2=4, col3=4, col4=4, col5=4, col6=4, col7=4,
col8=4

2. If we are storing just one signature word per row, how, at a high level,
are all the column values of that row aggregated into one signature?
For example, if length = 64, does it mean that a bit array of 64 bits is
generated for each row?
If col1=4, does it mean the value of col1 is passed to 4 hash functions
that generate 4 positions that can be set to 1 in the bit array?
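
Put differently, my mental model is something like this toy sketch (purely
illustrative C, not the actual contrib/bloom code):

    #include <stdint.h>
    #include <stdio.h>

    /* toy hash: derive the j-th bit position for a column value */
    static unsigned bit_pos(uint32_t value, int j, int sig_bits)
    {
        uint64_t h = value * 2654435761u + (uint64_t) j * 0x9E3779B97F4A7C15ull;
        h ^= h >> 33;
        return (unsigned) (h % (unsigned) sig_bits);
    }

    int main(void)
    {
        uint32_t row[8] = {1, 10, 100, 1000, 2, 20, 200, 2000}; /* 8 int columns */
        int      sig_bits = 64;      /* "length=64" -> 64-bit signature per row    */
        int      bits_per_col = 4;   /* "colN=4"    -> 4 bits set per column value */
        uint64_t signature = 0;
        int      col, j;

        for (col = 0; col < 8; col++)
            for (j = 0; j < bits_per_col; j++)
                signature |= UINT64_C(1) << bit_pos(row[col], j, sig_bits);

        /* the index entry would then be roughly (heap item pointer, signature) */
        printf("signature = %016llx\n", (unsigned long long) signature);
        return 0;
    }

Is that roughly how contrib/bloom builds each row's signature, or am I
misreading it?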

On Sat, Jun 8, 2019 at 3:11 AM Fabien COELHO  wrote:

>
> Hello Avinash,
>
> > I was testing bloom indexes today. I understand bloom indexes uses bloom
> > filters. [...]
> >
> > So the question here is -
> > I assume - number of bits = k. Where k is the total number of hash
> > functions used on top of the data that needs to validated. Is that
> correct
> > ? If yes, why do we see the Index 1 performing better than Index 2 ?
> > Because, the data has to go through more hash functions (4 vs 2) in
> Index 1
> > than Index 2. So, with Index 1 it should take more time.
> > Also, both the indexes have ZERO false positives.
> > Please let me know if there is anything simple that i am missing here.
>
> You may have a look at the blog entry about these parameters I wrote a
> few years ago:
>
>http://blog.coelho.net/database/2016/12/11/postgresql-bloom-index.html
>
> --
> Fabien.
>
>
>

-- 
9000799060


Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-06-09 Thread Andres Freund
Hi,

On June 9, 2019 8:36:37 AM PDT, Tom Lane  wrote:
>"Goel, Dhruv"  writes:
>I think you are mistaken that doing transactional updates in pg_index
>is OK.  If memory serves, we rely on xmin of the pg_index row for
>purposes
>such as detecting whether a concurrently-created index is safe to use
>yet.

We could replace that with storing a 64-bit xid in a normal column nowadays.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.




Re: Avoiding deadlock errors in CREATE INDEX CONCURRENTLY

2019-06-09 Thread Tom Lane
"Goel, Dhruv"  writes:
> Yes, you are correct. The test case here was that if a tuple is inserted 
> after the reference snapshot is taken in Phase 2 and before the index is 
> marked ready. If this tuple is deleted before the reference snapshot of Phase 
> 3, it will never make it to the index. I have fixed this problem by making 
> pg_index tuple updates transactional (I believe there is no reason why it has 
> to be in place now) so that the xmin of the pg_index tuple is same the xmin 
> of the snapshot in Phase 3.

I think you are mistaken that doing transactional updates in pg_index
is OK.  If memory serves, we rely on xmin of the pg_index row for purposes
such as detecting whether a concurrently-created index is safe to use yet.
So a transactional update would restart that clock and result in temporary
denial of service.

regards, tom lane




Re: Use of reloptions by EXTENSIONs

2019-06-09 Thread Tom Lane
Dent John  writes:
> I guess my question is, am I correctly understanding that reloptions are
> basically off-limits to EXTENSIONS?

IIRC that's basically true.  There's a lot of dissatisfaction with the
current implementation of reloptions, although I think that it's been
mainly focused on the fact that adding new ones is hard/error-prone
even within the core code.  If you want to help move this along, you
could review the existing patch in the area:

https://www.postgresql.org/message-id/flat/2083183.Rn7qOxG4Ov@x200m

and/or propose additional changes.

regards, tom lane




Re: Why does pg_checksums -r not have a long option?

2019-06-09 Thread Tomas Vondra

On Thu, Jun 06, 2019 at 06:01:21PM +0900, Michael Paquier wrote:

On Wed, Jun 05, 2019 at 10:31:54PM +0200, Peter Eisentraut wrote:

I think -r/--relfilenode was actually a good suggestion.  Because it
doesn't actually check a *file* but potentially several files (forks,
segments).  The -f naming makes it sound like it operates on a specific
file.


Hmm.  I still tend to prefer the -f/--filenode interface as that's
more consistent with what we have in the documentation, where
relfilenode gets only used when referring to the pg_class attribute.
You have a point about the fork types and extra segments, but I am not
sure that --relfilenode defines that in a better way than --filenode.
--


I agree. The "rel" prefix is there mostly because the other pg_class
attributes have it too (reltablespace, reltuples, ...) and we use
"filenode" elsewhere. For example we have pg_relation_filenode() function,
operating with exactly this piece of information.

So +1 to keep the "-f/--filenode" options.
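
For illustration (assuming that spelling), usage would look something like:

    pg_checksums --check --filenode=16384 -D /path/to/data

which, as mentioned above, checks every file (forks, segments) belonging to
that filenode rather than a single file.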


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services





Re: Binary support for pgoutput plugin

2019-06-09 Thread Tomas Vondra

On Sat, Jun 08, 2019 at 08:40:43PM -0400, Dave Cramer wrote:

On Sat, 8 Jun 2019 at 20:09, Andres Freund  wrote:


Hi,

On 2019-06-08 19:41:34 -0400, Dave Cramer wrote:
> So the reason we are discussing using pgoutput plugin is because it is
part
> of core and guaranteed to be in cloud providers solutions.

IMO people needing this should then band together and write one that's
suitable, rather than trying to coerce test_decoding and now pgoutput
into something they're not made for.



At the moment it would look a lot like pgoutput. For now I'm fine with no
changes to pgoutput other than binary. Once we have some real use cases we
can look at writing a new one. I would imagine we would actually start with
pgoutput and just add to it.



I understand the desire to make this work for managed cloud environments,
we support quite a few customers who would benefit from it. But pgoutput
is meant specifically for built-in replication, and adding complexity that
is useless for that use case does not seem like a good tradeoff.


From this POV the binary mode is fine, because it'd benefit pgoutput, but
the various other stuff mentioned here (e.g. nullability) is not.

And if we implement a new plugin for use by out-of-core stuff, I guess
we'd probably do it as an extension. But even having it in contrib would
not make it automatically installed on managed systems, because AFAIK the
various providers only allow whitelisted extensions. At which point
there's little difference compared to external extensions.

I think the best party to implement such extension is whoever implements
such replication system (say Debezium), because they are the ones who know
which format / behavior would work for them. And they can also show the
benefit to their users, who can then push the cloud providers to install
the extension. Of course, that'll take a long time (but it's unclear how
long), and until then they'll have to provide some fallback.

This is a bit of a chicken-egg problem, with three parties - our project,
projects building on logical replication and cloud providers. And no
matter how you slice it, the party implementing it has only limited (if
any) control over what the other parties allow.

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services