Re: [HACKERS] Add support for tuple routing to foreign partitions

2018-02-22 Thread Amit Langote
Fujita-san,

On Thu, Feb 22, 2018 at 8:49 PM, Etsuro Fujita
 wrote:
> (2018/02/22 11:52), Amit Langote wrote:
>> I wonder why partition_index needs to be made part of this API?
>
> The reason for that is because I think the FDW might want to look at the
> partition info stored in mtstate->mt_partition_tuple_routing for some reason
> or other, such as parent_child_tupconv_maps and child_parent_tupconv_maps,
> which are accessed with the given partition index.

I see.  I guess that makes sense.

>> Perhaps an independent concern, but one thing I noticed is that it does
>> not seem to play well with the direct modification (update push-down)
>> feature.  Now because updates (at least local, let's say) support
>> re-routing, I thought we'd be able to move rows across servers via the local
>> server, but with direct modification we'd never get the chance.  However,
>> since update tuple routing is triggered by partition constraint failure,
>> which we don't enforce for foreign tables/partitions anyway, I'm not sure
>> if we need to do anything about that, and even if we did, whether it
>> concerns this proposal.
>
> Good point!  Actually, the update tuple routing we have in HEAD doesn't allow
> re-routing tuples from foreign partitions even without direct modification.
> Here is an example using postgres_fdw:
>
> postgres=# create table pt (a int, b text) partition by list (a);
> CREATE TABLE
> postgres=# create table loct (a int check (a in (1)), b text);
> CREATE TABLE
> postgres=# create foreign table remp partition of pt for values in (1)
> server loopback options (table_name 'loct');
> CREATE FOREIGN TABLE
> postgres=# create table locp partition of pt for values in (2);
> CREATE TABLE
> postgres=# insert into remp values (1, 'foo');
> INSERT 0 1
> postgres=# insert into locp values (2, 'bar');
> INSERT 0 1
> postgres=# select tableoid::regclass, * from pt;
>  tableoid | a |  b
> ----------+---+-----
>  remp     | 1 | foo
>  locp     | 2 | bar
> (2 rows)
>
> postgres=# create function postgres_fdw_abs(int) returns int as $$begin
> return abs($1); end$$ language plpgsql immutable;
> CREATE FUNCTION
> postgres=# explain verbose update pt set a = postgres_fdw_abs(a) + 1 where b
> = 'foo';
>                                    QUERY PLAN
> ----------------------------------------------------------------------------
>  Update on public.pt  (cost=100.00..154.54 rows=12 width=42)
>Foreign Update on public.remp
>  Remote SQL: UPDATE public.loct SET a = $2 WHERE ctid = $1
>Update on public.locp
>->  Foreign Scan on public.remp  (cost=100.00..127.15 rows=6 width=42)
>  Output: (postgres_fdw_abs(remp.a) + 1), remp.b, remp.ctid
>  Remote SQL: SELECT a, b, ctid FROM public.loct WHERE ((b =
> 'foo'::text)) FOR UPDATE
>->  Seq Scan on public.locp  (cost=0.00..27.39 rows=6 width=42)
>  Output: (postgres_fdw_abs(locp.a) + 1), locp.b, locp.ctid
>  Filter: (locp.b = 'foo'::text)
> (10 rows)
>
> postgres=# update pt set a = postgres_fdw_abs(a) + 1 where b = 'foo';
> ERROR:  new row for relation "loct" violates check constraint "loct_a_check"
> DETAIL:  Failing row contains (2, foo).
> CONTEXT:  Remote SQL command: UPDATE public.loct SET a = $2 WHERE ctid = $1
>
> To be able to move tuples across foreign servers, I think we would first
> have to do something to allow re-routing tuples from foreign partitions.
> The patches I proposed haven't done anything about that, so the patched
> version would behave the same way as HEAD with/without direct modification.

Yes.

As I said, update re-routing is triggered by partition constraint
failure for the new row, and we don't check (any) constraints for
foreign tables, so re-routing won't occur for foreign partitions.

>> That said, I saw in the changes to ExecSetupPartitionTupleRouting() that
>> BeginForeignRouting() is called for a foreign partition even if direct
>> modification might already have been set up.  If direct modification is
>> set up, then ExecForeignRouting() will never get called, because we'd
>> never call ExecUpdate() or ExecInsert().
>
> Yeah, but I am thinking of leaving the support for re-routing tuples across
> foreign servers to another patch.
>
> One difference between HEAD and the patched version would be: we can
> re-route tuples from a plain partition to a foreign partition if the foreign
> partition supports this tuple-routing API.  Here is an example using the
> above data:
>
> postgres=# select tableoid::regclass, * from pt;
>  tableoid | a |  b
> ----------+---+-----
>  remp     | 1 | foo
>  locp     | 2 | bar
> (2 rows)
>
> postgres=# update pt set a = 1 where b = 'bar';
> UPDATE 1
> postgres=# select tableoid::regclass, * from pt;
>  tableoid | a |  b
> ----------+---+-----
>  remp     | 1 | foo
>  remp     | 1 | bar
> (2 rows)
>
> This would introduce an asymmetry (we can move tuples from plain partitions
> to foreign partitions, but the 

RTLD_GLOBAL (& JIT inlining)

2018-02-22 Thread Andres Freund
Hi Peter, All,


First question:

Why do we currently use RTLD_GLOBAL when loading extension libraries,
but take pains ([1]) to make things work without RTLD_GLOBAL?  It seems
like it'd be both safer to use RTLD_LOCAL on platforms supporting it (or
equivalent), and less error-prone, because we'd be less likely to depend
on it like we did e.g. during transforms development.


It seems error-prone because on the systems I'm aware of (IOW linux ;))
conflicting symbols will not cause errors: neither when there's a
conflict between a .so and the main binary, nor between different
shlibs.  My reading of glibc's ld is that the search order for
unresolved references in a .so is the main binary first, and then the
other RTLD_GLOBAL shared libraries in the order they were loaded.

Right now it appears that e.g. hstore_plperl has some issue: building
with RTLD_LOCAL makes its test fail, which means it'll probably not work
on some platforms.


I think using RTLD_LOCAL on most machines would be a much better
idea. I've not found a proper explanation of why GLOBAL is used. We
started using it ages ago, with [2], but that commit contains no
explanation, and a quick search didn't turn up anything either. Peter?


Secondly:

For JITing, internal & C operators, and other functions referenced in
JITed code, can be inlined if the definition is available in a suitable
form. The details of how that works don't matter much here.


For my JITing work I've so far treated the symbols of the main binary
and all referenced extensions as being part of a single namespace.
Currently the order in which symbols with multiple definitions are
looked up is essentially filesystem order.

While I'm not aware of any observable problems, reviewing that decision
I think that's a terrible idea. First and foremost it seems obviously
wrong not to load symbols from the main postgres binary first. But
leaving that aside, I'm uncomfortable doing inlining between shlibs for
JITing, because that can lead to symbols being resolved differently
between non-JITed code (where all external libraries loaded in the
current process are searched in load/usage order) and JITed code (where
the symbols would be resolved according to some static order).

I'm about to rewrite it so that functions are inlined just from the main
binary, or the basename of the shared library explicitly referenced in
the function definition.  That's still different, but at least easy to
understand.  But that still means there's inconsistent order between
different methods of execution, which isn't great.

Comments?

Greetings,

Andres Freund

[1] http://archives.postgresql.org/message-id/2652.1475512158%40sss.pgh.pa.us
[2] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=c87bc779d4e9f109e92f8b8f1dfad5d6739f8e97



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Thomas Munro
On Fri, Feb 23, 2018 at 7:56 PM, Amit Kapila  wrote:
> On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro
>  wrote:
>> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
>>> By the way, in which case leader can exit early?  As of now, we do
>>> wait for workers to end both before the query is finished or in error
>>> cases.
>>
>> create table foo as select generate_series(1, 10)::int a;
>> alter table foo set (parallel_workers = 2);
>> set parallel_setup_cost = 0;
>> set parallel_tuple_cost = 0;
>> select count(a / 0) from foo;
>>
>> That reliably gives me:
>> ERROR:  division by zero [from leader]
>> ERROR:  could not map dynamic shared memory segment [from workers]
>>
>> I thought this was coming from resource manager cleanup, but you're
>> right: that happens after we wait for all workers to finish.  Perhaps
>> this is a race within DestroyParallelContext() itself: when it is
>> called by AtEOXact_Parallel() during an abort, it asks the postmaster
>> to SIGTERM the workers, then it immediately detaches from the DSM
>> segment, and then it waits for the worker to start up.
>>
>
> I guess you mean to say worker waits to shutdown/exit.  Why would it
> wait for startup at that stage?

Right, I meant to say shutdown/exit.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Amit Kapila
On Fri, Feb 23, 2018 at 8:48 AM, Thomas Munro
 wrote:
> On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
>> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
>>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
 PS  I noticed that for BecomeLockGroupMember() we say "If we can't
 join the lock group, the leader has gone away, so just exit quietly"
 but for various other similar things we spew errors (most commonly
 seen one being "ERROR:  could not map dynamic shared memory segment").
 Intentional?
>>>
>>> I suppose I thought that if we failed to map the dynamic shared memory
>>> segment, it might be down to any one of several causes; whereas if we
>>> fail to join the lock group, it must be because the leader has already
>>> exited.  There might be a flaw in that thinking, though.
>>>
>>
>> By the way, in which case leader can exit early?  As of now, we do
>> wait for workers to end both before the query is finished or in error
>> cases.
>
> create table foo as select generate_series(1, 10)::int a;
> alter table foo set (parallel_workers = 2);
> set parallel_setup_cost = 0;
> set parallel_tuple_cost = 0;
> select count(a / 0) from foo;
>
> That reliably gives me:
> ERROR:  division by zero [from leader]
> ERROR:  could not map dynamic shared memory segment [from workers]
>
> I thought this was coming from resource manager cleanup, but you're
> right: that happens after we wait for all workers to finish.  Perhaps
> this is a race within DestroyParallelContext() itself: when it is
> called by AtEOXact_Parallel() during an abort, it asks the postmaster
> to SIGTERM the workers, then it immediately detaches from the DSM
> segment, and then it waits for the worker to start up.
>

I guess you mean to say worker waits to shutdown/exit.  Why would it
wait for startup at that stage?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: committing inside cursor loop

2018-02-22 Thread Simon Riggs
On 20 February 2018 at 14:45, Tom Lane  wrote:
> Peter Eisentraut  writes:
>> Here is a patch that allows COMMIT inside cursor loops in PL/pgSQL.  As
>> alluded to in earlier threads, this is done by converting such cursors
>> to holdable automatically.  A special flag "auto-held" marks such
>> cursors, so we know to clean them up on exceptions.
>
> I haven't really read this patch, but this bit jumped out at me:
>
> +Inside a cursor loop, ROLLBACK is not allowed.  (That
> +would have to roll back the cursor query, thus invalidating the loop.)

Hmm, why?

Rollback would only invalidate the cursor if it hadn't yet hit a
Commit. But if you did a commit, then the cursor would become holdable
and you could happily continue reading through the loop even after the
rollback.

So if Commit changes a pinned portal into a holdable cursor, we just
make Rollback do that also. Obviously only for pinned portals, i.e.
the query/ies whose results we are currently looping through.
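
For reference, a minimal sketch of the kind of loop being discussed (the
work_queue table and the DO block are made up for illustration; per
Peter's description upthread, the COMMIT converts the cursor driving the
loop into an auto-held one rather than erroring out):

CREATE TABLE work_queue (id int PRIMARY KEY, done boolean DEFAULT false);
INSERT INTO work_queue SELECT g FROM generate_series(1, 1000) g;

DO $$
DECLARE
    r record;
BEGIN
    FOR r IN SELECT id FROM work_queue WHERE NOT done LOOP
        UPDATE work_queue SET done = true WHERE id = r.id;
        -- With the patch, this converts the pinned cursor into a
        -- holdable ("auto-held") cursor and commits the work so far.
        COMMIT;
    END LOOP;
END;
$$;

The suggestion above is to let ROLLBACK perform the same conversion for
pinned portals, instead of forbidding it inside the loop.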

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: check error messages in SSL tests

2018-02-22 Thread Michael Paquier
On Thu, Feb 22, 2018 at 08:34:30AM -0500, Peter Eisentraut wrote:
> I noticed that a couple of test cases in the SSL tests fail to connect
> not for the reason that the tests think they should.  Here is a patch to
> augment the test setup so that a test for connection rejection also
> checks that we get the expected error message.

+1 for the idea.  And good catches.

One of the tests is failing:
t/001_ssltests.pl .. 1/62
#   Failed test 'certificate authorization fails with revoked client cert: matches'
#   at /home//git/postgres/src/test/ssl/../../../src/test/perl/TestLib.pm line 354.
#   'psql: private key file "ssl/client-revoked.key" has group or world access; permissions should be u=rw (0600) or less
# '
# doesn't match '(?^:SSL error)'
# Looks like you failed 1 test of 62.
t/001_ssltests.pl .. Dubious, test returned 1 (wstat 256, 0x100)
Failed 1/62 subtests

This comes from libpq itself, and the tree uses 0644 on this file.  You
just need to update this test so that ssl/client-revoked_tmp.key is used
instead of ssl/client-revoked.key, and then the tests pass.
--
Michael




Re: Translations contributions urgently needed

2018-02-22 Thread Tom Lane
Alvaro Herrera  writes:
> Please join pgsql-translat...@postgresql.org.

What surprises me about this thread is that apparently the sad state
of the v10 translations wasn't already discussed on that list?

I have no objection to calling for more translation volunteers on
this list --- in fact, probably it'd be a good idea to call for
more help on pgsql-general, too.  But it doesn't seem like it's
quite on-topic for -hackers otherwise.

regards, tom lane



Re: FOR EACH ROW triggers on partitioned tables

2018-02-22 Thread Amit Langote
On Fri, Feb 23, 2018 at 11:32 AM, Alvaro Herrera
 wrote:
> Amit Langote wrote:
>> On 2018/02/23 8:52, Alvaro Herrera wrote:
>> > We could mitigate the performance loss to some extent by adding more to
>> > RelationData.  For example, a "is_partition" boolean would help: skip
>> > searching pg_inherits for a relation that is not a partition.
>>
>> Unless I'm missing something, doesn't rd_rel->relispartition help?
>
> Uh, wow, how have I missed that all this time!  Yes, it probably does.
> I'll rework this tomorrow ... and the already committed index patch too,
> I think.

BTW, not sure if you'd noticed but I had emailed about setting
relispartition on index partitions after you committed the first
indexes patch.

https://www.postgresql.org/message-id/12085bc4-0bc6-0f3a-4c43-57fe06817...@lab.ntt.co.jp

Thanks,
Amit



Re: Translations contributions urgently needed

2018-02-22 Thread Craig Ringer
On 23 February 2018 at 09:50, Tatsuo Ishii  wrote:

> > I was wondering a bit about how translations are maintained. Experience
> > with another highly internationalised project years ago showed that many
> > people who are willing to volunteer as translators are NOT willing to
> > interact with git (or at the time, svn) and the other tools many of us
> take
> > for granted.
> >
> > Can more be gained with user-friendly, probably web-based, translation
> > tools to make translation updating more accessible?
>
> Do you have any recommendation for such tools?
>
>
I was last involved a long time ago, so I don't have direct experience. But
I've heard people speak of how much it can help.

A look around finds a few things of interest like

http://pootle.translatehouse.org/index.html

https://translate.zanata.org/

https://localise.biz/about/gettext



-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
> On Fri, Feb 23, 2018 at 09:51:02AM +0900, Tatsuo Ishii wrote:
>> I'm not working on message translations.  Hotta-san is actively
>> working on the translations here:
>> 
>> https://github.com/hotta/pg-nls-ja
> 
> If you need help here, feel free to ping me.  I have done translation to
> Japanese for other things in the past in technical products, so anything
> related to Postgres in this language is not much an issue from here.

Thank you for the offer. Hotta-san and the people working on the
translation are communicating at:

https://ml.postgresql.jp/mailman/listinfo/jpug-doc/

So you might want to subscribe to the list and then talk to Hotta-san,
or write to him directly.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
> Michael Paquier wrote:
>> On Fri, Feb 23, 2018 at 09:51:02AM +0900, Tatsuo Ishii wrote:
>> > I'm not working on message translations.  Hotta-san is actively
>> > working on the translations here:
>> > 
>> > https://github.com/hotta/pg-nls-ja
>> 
>> If you need help here, feel free to ping me.  I have done translation to
>> Japanese for other things in the past in technical products, so anything
>> related to Postgres in this language is not much an issue from here.
> 
> How many repositories do we need, though?  There is already one in
> git.postgresql.org -- see https://babel.postgresql.org/
> 
> Surely we could add a few committers that can take care of Japanese.
> 
> Please join pgsql-translat...@postgresql.org.

In my understanding Hotta-san is using the github repository to get
some reviews from other people who are working on PostgreSQL manual
translation project before submitting the po files to the patch
tracker:

https://redmine.postgresql.org/projects/pgtranslation

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Thomas Munro
On Fri, Feb 23, 2018 at 3:29 PM, Amit Kapila  wrote:
> On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
>> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>>> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
>>> join the lock group, the leader has gone away, so just exit quietly"
>>> but for various other similar things we spew errors (most commonly
>>> seen one being "ERROR:  could not map dynamic shared memory segment").
>>> Intentional?
>>
>> I suppose I thought that if we failed to map the dynamic shared memory
>> segment, it might be down to any one of several causes; whereas if we
>> fail to join the lock group, it must be because the leader has already
>> exited.  There might be a flaw in that thinking, though.
>>
>
> By the way, in which case leader can exit early?  As of now, we do
> wait for workers to end both before the query is finished or in error
> cases.

create table foo as select generate_series(1, 10)::int a;
alter table foo set (parallel_workers = 2);
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
select count(a / 0) from foo;

That reliably gives me:
ERROR:  division by zero [from leader]
ERROR:  could not map dynamic shared memory segment [from workers]

I thought this was coming from resource manager cleanup, but you're
right: that happens after we wait for all workers to finish.  Perhaps
this is a race within DestroyParallelContext() itself: when it is
called by AtEOXact_Parallel() during an abort, it asks the postmaster
to SIGTERM the workers, then it immediately detaches from the DSM
segment, and then it waits for the worker to start up.  The workers
unblock signals before they try to attach to the DSM segment, but
they don't CHECK_FOR_INTERRUPTS before they try to attach (and even if
they did it wouldn't solve anything).

I don't like the error much, though at least the root cause error is
logged first.

I don't immediately see how BecomeLockGroupMember() could have the
same kind of problem though, for the reason you said: the leader waits
for the workers to finish, so I'm not sure in which circumstances it
would cease to be the lock group leader while the workers are still
running.

-- 
Thomas Munro
http://www.enterprisedb.com



SSL passphrase prompt external command

2018-02-22 Thread Peter Eisentraut
Here is a patch that adds a way to specify an external command for
obtaining SSL passphrases.  There is a new GUC setting
ssl_passphrase_command.

Right now, we rely on the OpenSSL built-in prompting mechanism, which
doesn't work in some situations, including under systemd.  This patch
allows a configuration to make that work, e.g., with systemd-ask-password.
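
For illustration, a hypothetical configuration along these lines (the
command path and its arguments are assumptions, not part of the patch;
the GUC only exists once the patch is applied, and the same value could
equally be put directly into postgresql.conf):

-- Reloading the configuration makes the new setting take effect.
ALTER SYSTEM SET ssl_passphrase_command = '/usr/bin/systemd-ask-password "%p"';
SELECT pg_reload_conf();

-- Per the documentation added by the patch, a leading "-" makes the
-- command be ignored during a configuration reload:
ALTER SYSTEM SET ssl_passphrase_command = '-/usr/bin/systemd-ask-password "%p"';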

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 85dfab5aab956044f89f63edeb0c3cb852c296e5 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 22 Feb 2018 08:42:18 -0500
Subject: [PATCH] Add ssl_passphrase_command setting

This allows specifying an external command for prompting for or
otherwise obtaining passphrases for SSL key files.  This is useful
because in many cases there is no TTY easily available during service
startup.
---
 doc/src/sgml/config.sgml  |  45 ++
 src/backend/libpq/Makefile|   2 +-
 src/backend/libpq/be-secure-common.c  | 114 ++
 src/backend/libpq/be-secure-openssl.c |  58 +
 src/backend/libpq/be-secure.c |   1 +
 src/backend/utils/misc/guc.c  |  10 +++
 src/backend/utils/misc/postgresql.conf.sample |   1 +
 src/include/libpq/libpq.h |   7 ++
 src/test/ssl/Makefile |   5 ++
 src/test/ssl/README   |   3 +
 src/test/ssl/ServerSetup.pm   |   2 +-
 src/test/ssl/ssl/server-password.key  |  18 
 src/test/ssl/t/001_ssltests.pl|  35 ++--
 13 files changed, 280 insertions(+), 21 deletions(-)
 create mode 100644 src/backend/libpq/be-secure-common.c
 create mode 100644 src/test/ssl/ssl/server-password.key

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 4c998fe51f..4b38cb6207 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -1312,6 +1312,51 @@ SSL

   
  
+
+ 
+  ssl_passphrase_command (string)
+  
+   ssl_passphrase_command configuration 
parameter
+  
+  
+  
+   
+Sets an external command to be invoked when a passphrase for
+decrypting an SSL file such as a private key needs to be obtained.  By
+default, this parameter is empty, which means the built-in prompting
+mechanism is used.
+   
+   
+The command must print the passphrase to the standard output and exit
+with code 0.  In the parameter value, %p is
+replaced by a prompt string.  (Write %% for a
+literal %.)  Note that the prompt string will
+probably contain whitespace, so be sure to quote adequately.  A single
+newline is stripped from the end of the output if present.
+   
+   
+The command does not actually have to prompt the user for a
+passphrase.  It can read it from a file, obtain it from a keychain
+facility, or similar.  It is up to the user to make sure the chosen
+mechanism is adequately secure.
+   
+   
+The passphrase command will also be called during a configuration
+reload if a key file needs a passphrase.  If the setting starts with
+-, then it will be ignored during a reload and the
+SSL configuration will not be reloaded if a passphrase is needed.
+(The - will not be part of the actual command
+called.)  The latter setting is appropriate for a command that
+requires a TTY for prompting, which might not be available when the
+server is running.  The default behavior with an empty setting is also
+not to prompt during a reload.
+   
+   
+This parameter can only be set in the 
postgresql.conf
+file or on the server command line.
+   
+  
+ 
 
 

diff --git a/src/backend/libpq/Makefile b/src/backend/libpq/Makefile
index 7fa2b02743..3dbec23e30 100644
--- a/src/backend/libpq/Makefile
+++ b/src/backend/libpq/Makefile
@@ -14,7 +14,7 @@ include $(top_builddir)/src/Makefile.global
 
 # be-fsstubs is here for historical reasons, probably belongs elsewhere
 
-OBJS = be-fsstubs.o be-secure.o auth.o crypt.o hba.o ifaddr.o pqcomm.o \
+OBJS = be-fsstubs.o be-secure.o be-secure-common.o auth.o crypt.o hba.o 
ifaddr.o pqcomm.o \
pqformat.o pqmq.o pqsignal.o auth-scram.o
 
 ifeq ($(with_openssl),yes)
diff --git a/src/backend/libpq/be-secure-common.c 
b/src/backend/libpq/be-secure-common.c
new file mode 100644
index 00..a23ba27abc
--- /dev/null
+++ b/src/backend/libpq/be-secure-common.c
@@ -0,0 +1,114 @@
+/*-
+ *
+ * be-secure-common.c
+ *
+ * common implementation-independent SSL support code
+ *
+ * While be-secure.c contains the interfaces that the rest of the
+ * communications code calls, 

Re: Translations contributions urgently needed

2018-02-22 Thread Michael Paquier
On Thu, Feb 22, 2018 at 11:45:11PM -0300, Alvaro Herrera wrote:
> How many repositories do we need, though?  There is already one in
> git.postgresql.org -- see https://babel.postgresql.org/

Centralizing things into postgresql.org is nice for the long term.

> Please join pgsql-translat...@postgresql.org.

Oki, done.
--
Michael




Re: Online enabling of checksums

2018-02-22 Thread Andres Freund
How does:

On 2018-02-23 11:48:16 +0900, Michael Paquier wrote:
> On Thu, Feb 22, 2018 at 11:24:37AM -0800, Andres Freund wrote:
> > I suspect I'm going to get some grief for this, but I think the time has
> > come to bite the bullet and support changing databases in the same
> > process...
> 
> I'd like to see that.  The last time this was discussed, Robert
> complained to me immediately when I suggested it: it is not worth it
> given the many complications around syscache handling and resource
> cleanup.

relate to:

> It is in the long term more stable to use a model
> where a parent process handles a set of children and decides for which
> database each child should be spawned, which is what autovacuum does.

?



Re: Online enabling of checksums

2018-02-22 Thread Michael Paquier
On Thu, Feb 22, 2018 at 11:24:37AM -0800, Andres Freund wrote:
> I suspect I'm going to get some grief for this, but I think the time has
> come to bite the bullet and support changing databases in the same
> process...

I'd like to see that.  The last time this was discussed, Robert
complained to me immediately when I suggested it: it is not worth it
given the many complications around syscache handling and resource
cleanup.  It is in the long term more stable to use a model where a
parent process handles a set of children and decides for which database
each child should be spawned, which is what autovacuum does.
--
Michael




Re: Translations contributions urgently needed

2018-02-22 Thread Alvaro Herrera
Michael Paquier wrote:
> On Fri, Feb 23, 2018 at 09:51:02AM +0900, Tatsuo Ishii wrote:
> > I'm not working on message translations.  Hotta-san is actively
> > working on the translations here:
> > 
> > https://github.com/hotta/pg-nls-ja
> 
> If you need help here, feel free to ping me.  I have done translation to
> Japanese for other things in the past in technical products, so anything
> related to Postgres in this language is not much an issue from here.

How many repositories do we need, though?  There is already one in
git.postgresql.org -- see https://babel.postgresql.org/

Surely we could add a few committers that can take care of Japanese.

Please join pgsql-translat...@postgresql.org.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Translations contributions urgently needed

2018-02-22 Thread Michael Paquier
On Fri, Feb 23, 2018 at 09:51:02AM +0900, Tatsuo Ishii wrote:
> I'm not working on message translations.  Hotta-san is actively
> working on the translations here:
> 
> https://github.com/hotta/pg-nls-ja

If you need help here, feel free to ping me.  I have done translation to
Japanese for other things in the past in technical products, so anything
related to Postgres in this language is not much an issue from here.
--
Michael




Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Amit Kapila
On Thu, Feb 22, 2018 at 10:35 PM, Robert Haas  wrote:
> On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
>> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
>> join the lock group, the leader has gone away, so just exit quietly"
>> but for various other similar things we spew errors (most commonly
>> seen one being "ERROR:  could not map dynamic shared memory segment").
>> Intentional?
>
> I suppose I thought that if we failed to map the dynamic shared memory
> segment, it might be down to any one of several causes; whereas if we
> fail to join the lock group, it must be because the leader has already
> exited.  There might be a flaw in that thinking, though.
>

By the way, in which case leader can exit early?  As of now, we do
wait for workers to end both before the query is finished or in error
cases.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [bug fix] Cascaded standby cannot start after a clean shutdown

2018-02-22 Thread Michael Paquier
On Thu, Feb 22, 2018 at 04:55:38PM +0900, Michael Paquier wrote:
> I am definitely ready to buy that it is possible to have garbage being
> read into the length field, which can cause allocate_recordbuf to fail,
> as that's the only code path in xlogreader.c which does such an
> allocation.  Still, it seems to me that we should first try to see if
> there are strange allocation patterns that happen and see if it is
> possible to have a reproducible test case or a pattern which gives us
> confidence that we are on the right track.  One idea I have is to
> monitor those allocations like the following:
> --- a/src/backend/access/transam/xlogreader.c
> +++ b/src/backend/access/transam/xlogreader.c
> @@ -162,6 +162,10 @@ allocate_recordbuf(XLogReaderState *state, uint32 
> reclength)
> newSize += XLOG_BLCKSZ - (newSize % XLOG_BLCKSZ);
> newSize = Max(newSize, 5 * Max(BLCKSZ, XLOG_BLCKSZ));
> 
> +#ifndef FRONTEND
> +   elog(LOG, "Allocation for xlogreader increased to %u", newSize);
> +#endif

So, I have been playing a bit more with that and defined the following
strategy to see if it is possible to create inconsistencies:
- Use a primary and a standby.
- Set up max_wal_size and min_wal_size to a minimum of 80MB so that
segment recycling takes effect more quickly.
- Create a single table with a UUID column to increase the likelihood of
random data in INSERT records and FPWs, and insert enough data to
trigger a full WAL recycling (see the SQL sketch below).
- Every 5 seconds, insert a set of tuples into the table; 110 to 120
tuples generate enough data for a bit more than a full WAL page.  Then
restart the primary.  This normally causes the standby to catch up with
a streamed page which is not completely initialized, as it fetches the
page in the middle.
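
A rough SQL sketch of the table and the insert batches described above
(object names, the pgcrypto dependency for gen_random_uuid(), and the
exact row counts are illustrative only):

CREATE EXTENSION IF NOT EXISTS pgcrypto;   -- provides gen_random_uuid()
CREATE TABLE wal_filler (id uuid DEFAULT gen_random_uuid(), extra text);

-- Initial load: enough data to force a full cycle of WAL segment
-- recycling with small max_wal_size / min_wal_size settings.
INSERT INTO wal_filler (extra)
SELECT md5(random()::text) FROM generate_series(1, 1000000);

-- The periodic batch (run every ~5 seconds): 110 to 120 tuples give a
-- bit more than one full WAL page of insert records.
INSERT INTO wal_filler (extra)
SELECT md5(random()::text) FROM generate_series(1, 115);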

With the monitoring mentioned in the quoted block above, I have let the
whole thing run for a couple of hours, but I have not been able to catch
any problems, except the usual "invalid record length at 0/XXX: wanted
24, got 0".  The allocation for recordbuf did not get higher than 40960
bytes either, which matches 5 WAL pages.

Another, more evil, idea that I have on top of all this is to directly
hexedit the WAL segment on the standby just at the point where it would
receive a record from the primary, and insert garbage data into it that
would make the validation tests for the record allocation blow up in
xlogreader.c.
--
Michael




Re: Incorrect grammar

2018-02-22 Thread Etsuro Fujita

(2018/02/23 0:09), Robert Haas wrote:
> On Thu, Feb 22, 2018 at 6:54 AM, Etsuro Fujita
>  wrote:
>> Here is a tiny patch to fix $SUBJECT in a comment in execPartition.c.
>
> Committed.


Thanks!

Best regards,
Etsuro Fujita



Re: [HACKERS] Runtime Partition Pruning

2018-02-22 Thread Amit Langote
Hi David.

On 2018/02/23 0:11, David Rowley wrote:
> On 23 February 2018 at 01:15, David Rowley  
> wrote:
>> One problem that I'm facing now is down to the way I'm gathering the
>> ParamIds that match the partkeys. As you'll see from the patch I've
>> added a 'paramids' field to PartitionPruneContext and I'm populating
>> this when the clauses are being pre-processed in
>> extract_partition_clauses(). The problem is that the or_clauses are
>> not pre-processed at all, so the current patch will not properly
>> perform run-time pruning when the Params are "hidden" in OR branches.
>>
>> One way I thought to fix this was to change the clause processing to
>> create an array of PartitionClauseInfos, one for each OR branch. This
>> would also improve the performance of the run-time pruning, meaning
>> that all of the or clauses would be already matched to the partition
>> keys once, rather than having to redo that again each time a Param
>> changes its value.
>>
>> If I go and write a patch to do that, would you want it in your patch,
>> or would you rather I kept it over here?  Or perhaps you have a better
>> idea on how to solve...?
> 
> I've attached a patch which does this. For now, the patch is only
> intended to assist in the discussion of the above idea.
> 
> The patch is based on a WIP version of run-time pruning that I'm not
> quite ready to post yet, but with a small amount of work you could
> probably take it and base it on your faster partition pruning v31
> patch [1].
> 
> I ended up pulling the PartitionPruneInfo out of the
> PartitionPruneContext. This was required due how I've now made
> extract_partition_clauses() recursively call itself. We don't want to
> overwrite the context's clauseinfo with the one from the recursive
> call. A side effect of this is that the subcontext is no longer
> required when processing the OR clauses. You only did this so that the
> context's clauseinfo was not overwritten. I also think it's better to
> separate out the inputs and outputs. Anything in context was more
> intended to be for input fields.
> 
> Let me know your thoughts about this. If you don't want it for faster
> partition pruning, then I'll probably go and tidy it up and include it
> for run-time pruning.

Thanks for the patch.

I don't have time today to look at the patch closely, but at first blush,
it seems like something I should incorporate into my own patch, because
it's restructuring the partprune.c code.  I will study the issue and your
proposed solution in the form of this restructuring more closely over the
weekend and reply (probably with a new version of my patch) on Monday.

Thanks,
Amit




Re: FOR EACH ROW triggers on partitioned tables

2018-02-22 Thread Amit Langote
On 2018/02/23 8:52, Alvaro Herrera wrote:
> We could mitigate the performance loss to some extent by adding more to
> RelationData.  For example, a "is_partition" boolean would help: skip
> searching pg_inherits for a relation that is not a partition.
Unless I'm missing something, doesn't rd_rel->relispartition help?

Thanks,
Amit




Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
> I was wondering a bit about how translations are maintained. Experience
> with another highly internationalised project years ago showed that many
> people who are willing to volunteer as translators are NOT willing to
> interact with git (or at the time, svn) and the other tools many of us take
> for granted.
> 
> Can more be gained with user-friendly, probably web-based, translation
> tools to make translation updating more accessible?

Do you have any recommendation for such tools?

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: [HACKERS] path toward faster partition pruning

2018-02-22 Thread Amit Langote
On 2018/02/22 20:28, David Rowley wrote:
> On 22 February 2018 at 22:48, Amit Langote
>  wrote:
>>> I'm having to add some NULL handling there for the run-time pruning
>>> patch but wondered if it was also required for your patch.
>>
>> Hmm, not sure why.  Can you explain a bit more?
> 
> hmm, yeah, but perhaps we should be discussing on the other thread...
> 
> With a prepared statement the Param will be unavailable until
> execution, in which case we don't do the const folding.

Ah right.

> A simple case is:
> 
> create table listp (a int) partition by list (a);
> create table listp1 partition of listp for values in(1);
> prepare q1 (int) as  select * from listp where a = $1;
> explain analyze execute q1(1); -- repeat 5 times.
> explain analyze execute q1(null); -- partkey_datum_from_expr() gets a
> NULL param via the call from nodeAppend.c

I wonder if NULLs should somehow be managed at a higher level, resulting
in the same behavior that const-folding in the optimizer produces.  In any
case, I suppose that would be something for the run-time pruning patch to
handle.

Thanks,
Amit




Re: Translations contributions urgently needed

2018-02-22 Thread Craig Ringer
On 23 February 2018 at 08:51, Tatsuo Ishii  wrote:

> > I also noticed the same thing a few days ago, by the fact that the RPM
> > release of PG10 doesn't contain the psql translation.
> >
> > @10 >  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
> > @9.4>  ja   | pg_basebackup,pg_resetxlog
> >
> > I'm not sure how the translations are maintained but I think I can help.
> > Ishii-san, Thom, may I take some (or all?) of the files?
>
> I'm not working on message translations.  Hotta-san is actively
> working on the translations here:
>
> https://github.com/hotta/pg-nls-ja
>
>
I was wondering a bit about how translations are maintained. Experience
with another highly internationalised project years ago showed that many
people who are willing to volunteer as translators are NOT willing to
interact with git (or at the time, svn) and the other tools many of us take
for granted.

Can more be gained with user-friendly, probably web-based, translation
tools to make translation updating more accessible?

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: non-bulk inserts and tuple routing

2018-02-22 Thread Amit Langote
On 2018/02/23 1:10, Robert Haas wrote:
> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>  wrote:
>>> Attached is an updated version for that.
>>
>> Thanks for updating the patch.
> 
> Committed with a few changes.  The big one was that I got rid of the
> local variable is_update in ExecSetupPartitionTupleRouting.  That
> saved a level of indentation on a substantial chunk of code, and it
> turns out that test was redundant anyway.

Thank you!

Regards,
Amit




Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
> I also noticed the same thing a few days ago, by the fact that the RPM
> release of PG10 doesn't contain the psql translation.
> 
> @10 >  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
> @9.4>  ja   | pg_basebackup,pg_resetxlog
> 
> I'm not sure how the translations are maintained but I think I can help.
> Ishii-san, Thom, may I take some (or all?) of the files?

I'm not working on message translations.  Hotta-san is actively
working on the translations here:

https://github.com/hotta/pg-nls-ja

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: non-bulk inserts and tuple routing

2018-02-22 Thread Amit Langote
On 2018/02/23 1:53, Alvaro Herrera wrote:
> Robert Haas wrote:
>> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>>  wrote:
 Attached is an updated version for that.
>>>
>>> Thanks for updating the patch.
>>
>> Committed with a few changes.
> 
> I propose to tweak a few comments to PartitionTupleRouting, as attached.

I'd missed those.  Thank you!

Regards,
Amit




Re: Translations contributions urgently needed

2018-02-22 Thread Kyotaro HORIGUCHI
# my clone repository has been corrupted...sigh.

At Thu, 22 Feb 2018 18:32:15 -0500, Robert Haas  wrote 
in 
> On Thu, Feb 22, 2018 at 5:54 PM, Tatsuo Ishii  wrote:
> >> I have found that Japanese language support for the database server
> >> has been dropped for 10.  This is because it fell below the 80% of
> >> strings translated requirement, so it was shipped without Japanese.
> >
> > Was that mentioned in the 10.0 release note? I didn't know that.
> 
> I don't think we ever mention translations that are added or dropped
> in minor release notes.  But it seems unfortunate that we lost such an
> important one with little comment.

I also noticed the same thing a few days ago, by the fact that the RPM
release of PG10 doesn't contain the psql translation.

@10 >  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
@9.4>  ja   | pg_basebackup,pg_resetxlog

I'm not sure how the translations are maintained but I think I can help.
Ishii-san, Thom, may I take some (or all?) of the files?

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: PostgreSQL crashes with SIGSEGV

2018-02-22 Thread Peter Geoghegan
On Mon, Feb 12, 2018 at 6:15 PM, Peter Geoghegan  wrote:
> * For 9.5 and 9.6, the approach taken in bugfix commit d8589946d
> should be taken even further -- we should always copy. Moreover, we
> should always copy within tuplesort_getdatum(), for the same reasons.
>
> * For 9.5, 9.6, 10, and master, we should make sure that
> tuplesort_getdatum() uses the caller's memory context. The fact that
> it doesn't already do so seems like a simple oversight. We should do
> this to be consistent with tuplesort_gettupleslot(). (This isn't
> critical, but seems like a good idea.)
>
> * For 9.5, 9.6, 10, and master, we should adjust some comments from
> tuplesort_getdatum() callers, so that they no longer say that
> tuplesort datum tuple memory lives in tuplesort context. That won't be
> true anymore.

Attached patches do it that way. I'm happy with what I came up with,
which is a lot simpler than my first approach. The extra copying seems
likely to be well worth it, since it is fairly isolated in practice,
especially on 9.6. There is no extra copying from v10+, since they
don't need the first fix at all.

-- 
Peter Geoghegan
From fcbdb2e4fb76a75a0c56ca258f99df888b174dc3 Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Thu, 14 Dec 2017 21:22:41 -0800
Subject: [PATCH] Fix double free of tuple with grouping sets/tuplesort.c.

Have tuplesort_gettupleslot() copy tuple contents for slot into caller's
memory context in all cases.  This necessitates an extra copy on 9.5 and
9.6, though only for cases where an explicit copy was not already
required (mostly just TSS_SORTEDONTAPE fetches, though all
TSS_FINALMERGE fetches on 9.5).  Do the same for tuplesort_getdatum().

This fixes a crash that can occur on at least 9.5 and 9.6, where some
grouping sets queries may call tuplesort_end() at node shutdown time,
and free tuple memory that the executor/a tuple slot imagines it is
responsible for.

In the case of all other fetch routines, continue to allocate the memory
in the tuplesort-owned context (only change comments).  This is a little
inconsistent with what we now do for tuplesort_gettupleslot() and
tuplesort_getdatum(), but that's preferable to adding new copy overhead
in the backbranches where it's clearly unnecessary.  These other fetch
routines provide the weakest possible guarantees about tuple memory
lifespan from v10 on, anyway, so this actually seems more consistent
overall.

In passing, change the memory context switch point within
tuplesort_getdatum() to match tuplesort_gettupleslot(), so that we
actually allocate memory in the right context there (caller's context),
which seems like an independent bug.  It's not clear that this can cause
a crash, but it's still wrong.

Only the latter, less important fix is required on v10 and master,
because in v10 there is already no question of not having to make a copy
in caller's context anyway (see commits e94568ec and 3856cf96).

Author: Peter Geoghegan
Reported-By: Bernd Helmle
Analyzed-By: Peter Geoghegan, Andreas Seltenreich, Bernd Helmle
Discussion: https://www.postgresql.org/message-id/1512661638.9720.34.ca...@oopsware.de
Backpatch: 9.5-, where the issue is known to cause a crash.
---
 src/backend/utils/adt/orderedsetaggs.c | 21 +-
 src/backend/utils/sort/tuplesort.c | 74 +-
 2 files changed, 63 insertions(+), 32 deletions(-)

diff --git a/src/backend/utils/adt/orderedsetaggs.c b/src/backend/utils/adt/orderedsetaggs.c
index fe44d56..4d1e201 100644
--- a/src/backend/utils/adt/orderedsetaggs.c
+++ b/src/backend/utils/adt/orderedsetaggs.c
@@ -457,10 +457,9 @@ percentile_disc_final(PG_FUNCTION_ARGS)
 		elog(ERROR, "missing row in percentile_disc");
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned is allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	/* We shouldn't have stored any nulls, but do the right thing anyway */
@@ -575,10 +574,9 @@ percentile_cont_final_common(FunctionCallInfo fcinfo,
 	}
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, because the value
-	 * to be returned may be allocated inside its sortcontext.  We could use
-	 * datumCopy to copy it out of there, but it doesn't seem worth the
-	 * trouble, since the cleanup callback will clear the tuplesort later.
+	 * Note: we could clean up the tuplesort object here, but it doesn't seem
+	 * worth the trouble, since the cleanup callback will clear the tuplesort
+	 * later.
 	 */
 
 	PG_RETURN_DATUM(val);
@@ -1097,10 +1095,9 @@ mode_final(PG_FUNCTION_ARGS)
 		pfree(DatumGetPointer(last_val));
 
 	/*
-	 * Note: we *cannot* clean up the tuplesort object here, 

Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
>> Was that mentioned in the 10.0 release note? I didn't know that.
> 
> I don't think we ever mention translations that are added or dropped
> in minor release notes.

I think it would be better for users if that were noted in the release
notes, because the message translations are clearly visible to users.
In my understanding the note can be generated automatically.

> But it seems unfortunate that we lost such an
> important one with little comment.

Yeah. I have spent a lot of time on translating the manuals into Japanese
(and have not even finished the work for 10.0), so I didn't have spare
time to work on the Japanese message translations.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: ERROR: left and right pathkeys do not match in mergejoin

2018-02-22 Thread Tom Lane
Alexander Kuzmenkov  writes:
>> The third possibility is to decide that create_mergejoin_plan is being
>> overly paranoid and it's okay to extract merge details from a "redundant"
>> path key even though it specifies the opposite sort order from what the
>> current merge clause seems to need.

> This check looks necessary to me, because merge join can't work with 
> such a combination of outer pathkeys and merge clauses.

No, I think it's okay, for exactly the same reason that the pathkey got
thrown away as redundant in the first place.  That is, imagine that we
are mergejoining on "o.a = i.x AND o.b = i.x", and we know that the sort
ordering of the outer relation is "a ASC, b DESC".  It would appear that
this means the inner relation needs to be ordered like "x ASC, x DESC",
and certainly if we were to play dumb and sort it that way, you'd expect
everything to work.  The reason that we're having trouble is that the
pathkey machinery knows that "x ASC, x DESC" is stupid, and it throws
away the second column.  It would not matter whether we considered the
sort ordering of the inner relation to be "x ASC, x DESC", or just
"x ASC", or indeed "x ASC, x ASC": all of those describe exactly the same
row ordering, because we only get to comparing the second sort column for
rows having equal values in the first sort column, and then our conclusion
must still be that the rows' values are equal.

This logic still goes through if the sort column contents are not simply
duplicate variables but distinct variables that have gotten put into the
same eclass, that is "SELECT ... WHERE x = y ORDER BY x ASC, y DESC".
Equal is equal.  (We go to some pains to be sure this is true, ie that
the equality operator's semantics agree with the ordering operator's.)

So essentially, we can lie to the executor and say that we sorted the
inner rel as "x ASC, x DESC", even though we really did no such thing,
because there will be no difference in the actual input row order from
what it would be if we had done that.
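
To restate that with a concrete (purely illustrative) query, where t, x
and y are made-up names:

-- With the WHERE clause forcing x = y, these two queries produce the
-- same row ordering: the second sort column is only consulted for rows
-- whose first sort column values are equal, and such rows necessarily
-- have equal values in the second column too.  That is why the pathkey
-- machinery discards the second key as redundant.
SELECT * FROM t WHERE x = y ORDER BY x ASC, y DESC;
SELECT * FROM t WHERE x = y ORDER BY x ASC;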

Attached is a completed patch that fixes this and also deals with the
overall topic that the inner and outer pathkeys aren't so interchangeable
as some parts of the code thought.  Maybe the comments could use further
improvement but I think it's OK otherwise.  I haven't started on
back-porting this, but the bugs are demonstrable back to 8.3, so that
needs to be done.

regards, tom lane

diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 396ee27..a2777ea 100644
*** a/src/backend/optimizer/path/joinpath.c
--- b/src/backend/optimizer/path/joinpath.c
*** sort_inner_and_outer(PlannerInfo *root,
*** 1009,1018 
  			outerkeys = all_pathkeys;	/* no work at first one... */
  
  		/* Sort the mergeclauses into the corresponding ordering */
! 		cur_mergeclauses = find_mergeclauses_for_pathkeys(root,
! 		  outerkeys,
! 		  true,
! 		  extra->mergeclause_list);
  
  		/* Should have used them all... */
  		Assert(list_length(cur_mergeclauses) == list_length(extra->mergeclause_list));
--- 1009,1017 
  			outerkeys = all_pathkeys;	/* no work at first one... */
  
  		/* Sort the mergeclauses into the corresponding ordering */
! 		cur_mergeclauses = find_mergeclauses_for_outer_pathkeys(root,
! outerkeys,
! extra->mergeclause_list);
  
  		/* Should have used them all... */
  		Assert(list_length(cur_mergeclauses) == list_length(extra->mergeclause_list));
*** generate_mergejoin_paths(PlannerInfo *ro
*** 1102, 
  		jointype = JOIN_INNER;
  
  	/* Look for useful mergeclauses (if any) */
! 	mergeclauses = find_mergeclauses_for_pathkeys(root,
!   outerpath->pathkeys,
!   true,
!   extra->mergeclause_list);
  
  	/*
  	 * Done with this outer path if no chance for a mergejoin.
--- 1101,1109 
  		jointype = JOIN_INNER;
  
  	/* Look for useful mergeclauses (if any) */
! 	mergeclauses = find_mergeclauses_for_outer_pathkeys(root,
! 		outerpath->pathkeys,
! 		extra->mergeclause_list);
  
  	/*
  	 * Done with this outer path if no chance for a mergejoin.
*** generate_mergejoin_paths(PlannerInfo *ro
*** 1228,1237 
  			if (sortkeycnt < num_sortkeys)
  			{
  newclauses =
! 	find_mergeclauses_for_pathkeys(root,
!    trialsortkeys,
!    false,
!    mergeclauses);
  Assert(newclauses != NIL);
  			}
  			else
--- 1226,1234 
  			if (sortkeycnt < num_sortkeys)
  			{
  newclauses =
! 	trim_mergeclauses_for_inner_pathkeys(root,
! 		 mergeclauses,
! 		 trialsortkeys);
  Assert(newclauses != NIL);
  			}
  			else
*** generate_mergejoin_paths(PlannerInfo *ro
*** 1272,1281 
  	if (sortkeycnt < num_sortkeys)
  	{
  		newclauses =
! 			

Re: FOR EACH ROW triggers on partitioned tables

2018-02-22 Thread Alvaro Herrera
Alvaro Herrera wrote:

> Another option is to rethink this feature from the ground up: instead of
> cloning catalog rows for each child, maybe we should have the trigger
> lookup code, when running DML on the child relation (the partition),
> obtain trigger entries not only for the child relation itself but also
> for its parents recursively -- so triggers defined in the parent are
> fired for the partitions, too.

I have written this, and it seems to work fine; it's attached.
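
As a quick illustration of the user-visible behavior this aims for (a
sketch only; the table, function and trigger names are made up):

CREATE TABLE parted (a int) PARTITION BY LIST (a);
CREATE TABLE parted_1 PARTITION OF parted FOR VALUES IN (1);

CREATE FUNCTION note_row() RETURNS trigger LANGUAGE plpgsql AS
$$ BEGIN RAISE NOTICE 'row with a = %', NEW.a; RETURN NEW; END $$;

-- A row-level trigger defined once, on the partitioned parent ...
CREATE TRIGGER parted_row_trig AFTER INSERT ON parted
    FOR EACH ROW EXECUTE PROCEDURE note_row();

-- ... fires for rows routed to a partition, without cloning a
-- pg_trigger row onto parted_1 as the earlier patch version did.
INSERT INTO parted VALUES (1);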

Generally speaking, I like this better than my previously proposed
patch: having duplicate pg_trigger rows seems lame, in hindsight.

I haven't measured the performance loss, but we now scan pg_inherits
each time we build a relcache entry with relhastriggers=on.  Can't be
good.  The pg_inherits stuff looks generally unnatural --
manually doing scans upwards (find parents) and downwards (find
partitions).  It's messy and there are no nice abstractions.
Partitioning still looks too much bolted-on.

We could mitigate the performance loss to some extent by adding more to
RelationData.  For example, a "is_partition" boolean would help: skip
searching pg_inherits for a relation that is not a partition.  The
indexing patch already added some "has_superclass()" calls and they look
somewhat out of place.  Also, we could add a syscache to pg_inherits.

Regarding making partitioning feel more natural, we could add some APIs:
"list all ancestors", "list all descendants".  Maybe I should have used
find_inheritance_children.

One quirk: scanning multiple parents looking for potential triggers
means the order of indexscan results no longer guarantees the correct
ordering.  I had to add a qsort() there.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index 71e20f2740..1c92212dd6 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1873,7 +1873,9 @@ SCRAM-SHA-256$iteration 
count:
   
   
True if table has (or once had) triggers; see
-   pg_trigger catalog
+   pg_trigger catalog.
+   If this is a partition, triggers on its partitioned ancestors are also
+   considered
   
  
 
@@ -6991,6 +6993,13 @@ SCRAM-SHA-256$iteration 
count:
  
 
  
+  tginherits
+  bool
+  
+  True if trigger applies to children relations too
+ 
+
+ 
   tgnargs
   int2
   
diff --git a/src/backend/bootstrap/bootparse.y 
b/src/backend/bootstrap/bootparse.y
index ed7a55596f..7ad0126df5 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -230,6 +230,7 @@ Boot_CreateStmt:

   RELPERSISTENCE_PERMANENT,

   shared_relation,

   mapped_relation,
+   
   false,

   true);
elog(DEBUG4, "bootstrap 
relation created");
}
@@ -252,6 +253,7 @@ Boot_CreateStmt:

  mapped_relation,

  true,

  0,
+   
  false,

  ONCOMMIT_NOOP,

  (Datum) 0,

  false,
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index cf36ce4add..815f371ac2 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -257,6 +257,7 @@ heap_create(const char *relname,
char relpersistence,
bool shared_relation,
bool mapped_relation,
+   bool has_triggers,
bool allow_system_table_mods)
 {
boolcreate_storage;
@@ -351,7 +352,8 @@ heap_create(const char *relname,
 
shared_relation,

Re: Translations contributions urgently needed

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 5:54 PM, Tatsuo Ishii  wrote:
>> I have found that Japanese language support for the database server
>> has been dropped for 10.  This is because it fell below the 80% of
>> strings translated requirement, so it was shipped without Japanese.
>
> Was that mentioned in the 10.0 release note? I didn't know that.

I don't think we ever mention translations that are added or dropped
in minor release notes.  But it seems unfortunate that we lost such an
important one with little comment.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Pluggable storage

2018-02-22 Thread Robert Haas
On Fri, Feb 16, 2018 at 5:56 AM, Alexander Korotkov
 wrote:
> BTW, EnterpriseDB announces zheap table access method (heap with undo log)
> [2].  I think this is great, and I'm looking forward for publishing zheap in
> mailing lists.  But I'm concerning about its compatibility with pluggable
> table access methods API.  Does zheap use table AM API from this thread?  Or
> does it just override current heap and needs to be adopted to use table AM
> API?  Or does it implements own API?

Right now it just hacks the code.  The plan is to adapt it to whatever
API we settle on in this thread.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Typo in predicate.c comment

2018-02-22 Thread Robert Haas
On Wed, Feb 21, 2018 at 3:10 PM, Thomas Munro
 wrote:
> Here's a tiny patch to fix a typo.

I have pushed a tiny commit containing it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: ERROR: left and right pathkeys do not match in mergejoin

2018-02-22 Thread Alexander Kuzmenkov

On 22/02/18 at 21:03, Tom Lane wrote:

The third possibility is to decide that create_mergejoin_plan is being
overly paranoid and it's okay to extract merge details from a "redundant"
path key even though it specifies the opposite sort order from what the
current merge clause seems to need.


This check looks necessary to me, because merge join can't work with
such a combination of outer pathkeys and merge clauses. The sort order of
the outer and inner columns must be the same. Here, two outer columns with
different sort orders are compared to the same inner column, so there will
be a mismatch one way or another. Consider the original query with some
sample data:


    outer        inner
 desc   asc      desc
  i2    k2        i1
------------    ------
-> 1     0         1 <-
   1     1         0
   1     2
   0

The merge join starts at tuples marked with arrows. It finds that the 
outer tuple is less than the inner, so it advances the inner pointer, 
and loses the tuple with i1 = 1.


So, if we can't do a merge join on these clauses with these pathkeys
anyway, it is OK to discard some of them in
find_mergeclauses_for_pathkeys. This helps the case when we use the
existing order of the outer relation. Also, we should take care not to
create such combinations in select_outer_pathkeys_for_merge, when it
tries to match root->query_pathkeys.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Translations contributions urgently needed

2018-02-22 Thread Tatsuo Ishii
> I have found that Japanese language support for the database server
> has been dropped for 10.  This is because it fell below the requirement
> that 80% of strings be translated, so it was shipped without Japanese.

Was that mentioned in the 10.0 release note? I didn't know that.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp



Re: file cloning in pg_upgrade and CREATE DATABASE

2018-02-22 Thread Peter Eisentraut
On 2/21/18 18:57, Tomas Vondra wrote:
> Nice improvement, of course. How does that affect performance on the
> cloned database? If I understand this correctly, it essentially enables
> CoW on the files, so what's the overhead on that? It'd be unfortunate to
> speed up CREATE DATABASE only to get degraded performance later.

I ran a little test (on APFS and XFS): Create a large (unlogged) table,
copy the database, then delete everything from the table in the copy.
That should need to CoW all the blocks.  It has about the same
performance with cloning, possibly slightly faster.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Hash Joins vs. Bloom Filters / take 2

2018-02-22 Thread Peter Geoghegan
On Thu, Feb 22, 2018 at 1:14 PM, Tomas Vondra
 wrote:
> OK, thanks for reminding me about SBF and for the discussion.
>
> At this point I'll probably focus on the other parts though -
> determining selectivity of the join, etc. Which I think is crucial, and
> we need to get that right even for accurate estimates. It's good to know
> that we have a solution for that part, though.

+1

There are probably significant opportunities to improve the Bloom
filter. That isn't that interesting right now, though. Figuring out
how scalable Bloom filters might save hash join from being reliant on
the accuracy of the initial estimate of set cardinality seems
premature at best, since we haven't established how sensitive this
use-case is to misestimations. My sense is that it's actually
naturally very insensitive, but there is no need to spend too much
time on it just yet.

It just makes sense, as a matter of procedure, to focus on the hash
join code, and drill down from there. Personally, I'm tired of talking
about the nitty-gritty details of Bloom filters rather than the actual
problem at hand.

-- 
Peter Geoghegan



Re: [HACKERS] Partition-wise aggregation/grouping

2018-02-22 Thread Robert Haas
On Thu, Feb 8, 2018 at 8:05 AM, Jeevan Chalke
 wrote:
> In this attached version, I have rebased my changes over new design of
> partially_grouped_rel. The preparatory changes of adding
> partially_grouped_rel are in 0001.

I spent today hacking on 0001; results attached.  The big change from
your version is that this now uses generate_gather_paths() to add
Gather/Gather Merge nodes (except in the case where we sort by group
pathkeys and then Gather Merge) rather than keeping all of the bespoke
code.  That turned out to be a bit less elegant than I would have liked
-- I had to add an override_rows argument to generate_gather_paths to make
it work.  But overall I think this is still a big improvement, since
it lets us share code instead of duplicating it.  Also, it potentially
lets us add partially-aggregated but non-parallel paths into
partially_grouped_rel->pathlist and that should Just Work; they will
get the Finalize Aggregate step but not the Gather.  With your
arrangement that wouldn't work.

Please review/test.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


partially-grouped-rel-rmh.patch
Description: Binary data


Re: Hash Joins vs. Bloom Filters / take 2

2018-02-22 Thread Tomas Vondra


On 02/22/2018 09:52 PM, Claudio Freire wrote:
> On Thu, Feb 22, 2018 at 5:11 PM, Tomas Vondra
>  wrote:
>> On 02/22/2018 08:33 PM, Claudio Freire wrote:
>>> That's kinda slow to do per-item. I tried to "count" distinct items by
>>> checking the BF before adding (don't add redundantly), but that's less
>>> precise than a HLL in my experience.
>>
>> But you don't need to do that for each item you try to add - you know
>> that with M bits and K hash functions you can't reach 50% before at
>> least M/(2*K) additions. And you don't really need to track the number
>> of bits exactly - if it gets 55% full, it's not going to explode.
>>
>> So just wait until M/(2*K) additions, see how many bits remain until the
>> threshold - rinse and repeat. And use some reasonable minimum distance
>> (say, 5% of the capacity), not to do the check too often.
> 
> Ok, that works too.
> 
> Point is, SBFs help keep the BF size as small as possible, keep it
> below work_mem, and avoid caring about the total number of distinct
> items.
> 
> You may want the planner to try and estimate that to figure out 
> whether it's worth trying BFs or not, but once you decide to try,
> SBFs remove any dependency on the accuracy of that estimate.
> 

OK, thanks for reminding me about SBF and for the discussion.

At this point I'll probably focus on the other parts though -
determining selectivity of the join, etc. Which I think is crucial, and
we need to get that right even for accurate estimates. It's good to know
that we have a solution for that part, though.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add PGDLLIMPORT to enable_hashagg

2018-02-22 Thread Andres Freund
On 2018-02-21 11:41:31 -0800, Brian Cloutier wrote:
> On Wed, Feb 21, 2018 at 10:14 AM, Andres Freund  wrote:
> 
> > Could you take the relevant commit, backport it to the
> > relevant branches, resolve conflicts, make possibly appropriate
> > adaptions, and post?
> >
> 
> The original commit touched some new variables and therefore didn't apply
> cleanly. Attached are equivalent patches for REL_10_STABLE and
> REL9_6_STABLE.

Pushed.

- Andres



Re: Hash Joins vs. Bloom Filters / take 2

2018-02-22 Thread Claudio Freire
On Thu, Feb 22, 2018 at 5:11 PM, Tomas Vondra
 wrote:
> On 02/22/2018 08:33 PM, Claudio Freire wrote:
>> That's kinda slow to do per-item. I tried to "count" distinct items by
>> checking the BF before adding (don't add redundantly), but that's less
>> precise than a HLL in my experience.
>
> But you don't need to do that for each item you try to add - you know
> that with M bits and K hash functions you can't reach 50% before at
> least M/(2*K) additions. And you don't really need to track the number
> of bits exactly - if it gets 55% full, it's not going to explode.
>
> So just wait until M/(2*K) additions, see how many bits remain until the
> threshold - rinse and repeat. And use some reasonable minimum distance
> (say, 5% of the capacity), not to do the check too often.

Ok, that works too.

Point is, SBFs help keep the BF size as small as possible, keep it
below work_mem, and avoid caring about the total number of distinct
items.

You may want the planner to try and estimate that to figure out
whether it's worth trying BFs or not, but once you decide to try, SBFs
remove any dependency on the accuracy of that estimate.
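
To make that fill-ratio bookkeeping concrete, here is a minimal C sketch of
the check-scheduling idea discussed above (illustration only -- the struct
layout and names are assumptions, not code from the patch): with m bits and
k hash functions at most k bits can be set per insertion, so the 50%
threshold cannot be crossed before (m/2 - bits_set)/k further insertions,
and we additionally never re-count more often than every ~5% of capacity.

#include <stdbool.h>
#include <stdint.h>

typedef struct BloomFilter
{
    uint64_t    nbits;          /* m: number of bits in the filter */
    int         nhashes;        /* k: number of hash functions */
    uint64_t    nbits_set;      /* set bits counted at the last check */
    uint64_t    ninserts;       /* insertions since the last check */
    /* ... bit array follows ... */
} BloomFilter;

static bool
bloom_time_to_check_fill(const BloomFilter *bf)
{
    uint64_t    remaining;
    uint64_t    min_inserts;
    uint64_t    floor_inserts;

    if (bf->nbits_set >= bf->nbits / 2)
        return true;            /* already at/past the 50% threshold */

    /* bits still missing before the 50% threshold can be reached */
    remaining = bf->nbits / 2 - bf->nbits_set;
    /* each insertion sets at most k new bits, so this many must happen first */
    min_inserts = remaining / bf->nhashes;
    /* and never check more often than every ~5% of capacity */
    floor_inserts = bf->nbits / (20 * bf->nhashes);

    return bf->ninserts >= (min_inserts > floor_inserts ? min_inserts : floor_inserts);
}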



Re: Allow workers to override datallowconn

2018-02-22 Thread Andres Freund
Hi,

On 2018-02-22 15:24:50 -0500, Tom Lane wrote:
> You could possibly make it work with more aggressive refactoring, but
> I remain of the opinion that this is a fundamentally bad idea anyhow.
> A GUC of this kind is just ripe for abuse, and I don't think it's solving
> any problem we really need solved.

Oh, and this is actually something I have needed a couple of times
before. Databases used as templates are occasionally useful. To fill
their contents one occasionally wants to connect to them, but most of
the time connections ought not to be allowed, so that creating a
database with them as a template actually works.  It's solvable via
other means, but not conveniently.

Greetings,

Andres Freund



Re: Online enabling of checksums

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 9:23 PM, Andres Freund  wrote:

> Hi,
>
> On 2018-02-22 21:16:02 +0100, Magnus Hagander wrote:
> > You could do that, but then you're moving the complexity to managing that
> > list in shared memory instead.
>
> Maybe I'm missing something, but how are you going to get quick parallel
> processing if you don't have a shmem piece? You can't assign one
> database per worker because commonly there's only one database. You
> don't want to start/stop a worker for each relation because that'd be
> extremely slow for databases with a lot of tables. Without shmem you
> can't pass more than an oid to a bgworker. To me the combination of
> these things imply that you need some other synchronization mechanism
> *anyway*.
>

Yes, you probably need something like that if you want to be able to
parallelize on things inside each database. If you are OK parallelizing
things on a per-database level, you don't need it.



> > I'm not  sure that's any easier... And
> > certainly adding a catalog flag for a usecase like this one is not making
> > it easier.
>
> Hm, I imagined you'd need that anyway. Imagine a 10TB database that's
> online converted to checksums. I assume you'd not want to reread 9TB if
> you crash after processing most of the cluster already?
>

I would prefer that, yes. But having to re-read 9TB is still significantly
better than not being able to turn on checksums at all (the state today). And
adding a catalog column for it will carry the cost of the migration
*forever*, both for clusters that never have checksums and those that had
them from the beginning.

Accepting that the process will start over (but only read, not re-write,
the blocks that have already been processed) in case of a crash does
significantly simplify the process and reduce the long-term cost of it in
the form of entries in the catalogs. Since this is a one-time operation (or
for many people, a zero-time operation), paying that cost once is
probably better than paying a much smaller cost constantly.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Allow workers to override datallowconn

2018-02-22 Thread Andres Freund
On 2018-02-22 15:24:50 -0500, Tom Lane wrote:
> Magnus Hagander  writes:
> > I hacked up an attempt to do this. It does seem to work in the very simple
> case, but it does require changing the order in InitPostgres() to load
> > the startup packet before validating those.
> 
> I doubt that's safe.  It requires, to name just one thing, an assumption
> that no processing done in process_startup_options has any need to know
> the database encoding, which is established by CheckMyDatabase.  Thus
> for instance, if any GUC settings carried in the startup packet include
> non-ASCII characters, the wrong things will happen.

I think those are effectively ASCII-only anyway. We process them with
pretty much ASCII-only (or, well, 8-bit-ASCII-compatible) logic
afaict. C.f. pg_split_opts().


> You could possibly make it work with more aggressive refactoring, but
> I remain of the opinion that this is a fundamentally bad idea anyhow.
> A GUC of this kind is just ripe for abuse, and I don't think it's
> solving any problem we really need solved.

How's that any less safe than allowing to load libraries, disabling
system indexes, and reams of other things we allow via GUCs?

Greetings,

Andres Freund



Re: Allow workers to override datallowconn

2018-02-22 Thread Tom Lane
Magnus Hagander  writes:
> I hacked up an attempt to do this. It does seem to work in the very simple
> case, but it does require changing the order in InitPostgres() to load
> the startup packet before validating those.

I doubt that's safe.  It requires, to name just one thing, an assumption
that no processing done in process_startup_options has any need to know
the database encoding, which is established by CheckMyDatabase.  Thus
for instance, if any GUC settings carried in the startup packet include
non-ASCII characters, the wrong things will happen.

You could possibly make it work with more aggressive refactoring, but
I remain of the opinion that this is a fundamentally bad idea anyhow.
A GUC of this kind is just ripe for abuse, and I don't think it's solving
any problem we really need solved.

regards, tom lane



Re: Online enabling of checksums

2018-02-22 Thread Andres Freund
Hi,

On 2018-02-22 21:16:02 +0100, Magnus Hagander wrote:
> You could do that, but then you're moving the complexity to managing that
> list in shared memory instead.

Maybe I'm missing something, but how are you going to get quick parallel
processing if you don't have a shmem piece? You can't assign one
database per worker because commonly there's only one database. You
don't want to start/stop a worker for each relation because that'd be
extremely slow for databases with a lot of tables. Without shmem you
can't pass more than an oid to a bgworker. To me the combination of
these things imply that you need some other synchronization mechanism
*anyway*.
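
For illustration, the shared-memory piece could be as small as the sketch
below (names and layout are made up here, not from any posted patch): the
launcher publishes an array of relation OIDs in shared memory, and each
worker atomically claims the next slot.

#include "postgres.h"
#include "port/atomics.h"

typedef struct ChecksumWorkQueue
{
    pg_atomic_uint32 next_item;                     /* next slot to hand out */
    uint32          num_items;                      /* number of valid entries */
    Oid             relids[FLEXIBLE_ARRAY_MEMBER];  /* filled in by the launcher */
} ChecksumWorkQueue;

/* Called by each worker; returns InvalidOid once the queue is drained. */
static Oid
claim_next_relation(ChecksumWorkQueue *queue)
{
    uint32      idx = pg_atomic_fetch_add_u32(&queue->next_item, 1);

    if (idx >= queue->num_items)
        return InvalidOid;
    return queue->relids[idx];
}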


> I'm not  sure that's any easier... And
> certainly adding a catalog flag for a usecase like this one is not making
> it easier.

Hm, I imagined you'd need that anyway. Imagine a 10TB database that's
online converted to checksums. I assume you'd not want to reread 9TB if
you crash after processing most of the cluster already?

Regards,

Andres Freund



Re: Online enabling of checksums

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 8:52 PM, Andres Freund  wrote:

>
>
> On February 22, 2018 11:44:17 AM PST, Magnus Hagander 
> wrote:
> >On Thu, Feb 22, 2018 at 8:41 PM, Andres Freund 
> >wrote:
>In this particular case that would, at least for phase 1, simplify it because
>we'd only need one process instead of worker/launcher. However, if we'd
>ever want to parallelize it -- or any other process of this style, like
>autovacuum -- you'd still need a launcher+worker combo. So making that
>particular scenario simpler might be worthwhile on its own.
>
> Why is that needed? You can just start two bgworkers and process a list of
> items stored in shared memory. Or even just check, I assume there'd be a
> catalog flag somewhere, whether a database / table / object of granularity
> has already been processed and use locking to prevent concurrent access.
>

You could do that, but then you're moving the complexity to managing that
list in shared memory instead. I'm not sure that's any easier... And
certainly adding a catalog flag for a usecase like this one is not making
it easier.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Allow workers to override datallowconn

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 9:09 PM, Tom Lane  wrote:

> Andres Freund  writes:
> > The more important part I think is that we solve it via a GUC that can
> > be used outside of bgworkers.
>
> Are you proposing an "ignore_datallowconn" GUC?  That's a remarkably
> bad idea.  We don't have infrastructure that would allow it to be set
> at an appropriate scope.  I can't imagine any good use-case for allowing
> it to be on globally; you'd want it to be per-session (or per-bgworker).
> But I don't think there's any way to change the system default setting
> in time for the setting to take effect during connection startup or
> bgworker startup.
>

I hacked up an attempt to do this. It does seem to work in the very simple
case, but it does require changing the order in InitPostgres() to load
the startup packet before validating those.

PFA WIP, which also shows how to do it in the background worker.

This one also lets me do

PGOPTIONS="-c ignore_connection_restriction=on" psql template0

to bypass the restriction, which is what pg_upgrade would basically do.


Magnus' most recent patch in this thread seems like a fine answer for
> bgworkers.  You've moved the goalposts into the next county, and the
> design you're proposing to satisfy that goal is lousy.  It will end
> up being a large amount of additional work with no benefit except
> being able to use an arguably less ugly (but almost certainly no
> shorter) datallowconn override method in pg_upgrade.
>

*If* the moving of the startup packet processing to one step earlier in
InitPostgres is safe, then it is actually less code this way right now.
From a quick look I think it's safe to move it, but I haven't looked in
detail.

I also haven't checked if this actually helps in the pg_upgrade case, but
if bypassing datallowconn is what's needed there then it should.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/backend/postmaster/checksumhelper.c b/src/backend/postmaster/checksumhelper.c
index 44535f9976..997bffa416 100644
--- a/src/backend/postmaster/checksumhelper.c
+++ b/src/backend/postmaster/checksumhelper.c
@@ -586,6 +588,8 @@ ChecksumHelperWorkerMain(Datum arg)
 	ereport(DEBUG1,
 			(errmsg("Checksum worker starting for database oid %d", dboid)));
 
+	SetConfigOption("ignore_connection_restriction", "true", PGC_SU_BACKEND, PGC_S_OVERRIDE);
+
 	BackgroundWorkerInitializeConnectionByOid(dboid, InvalidOid);
 
 	/*
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index 484628987f..7034697e1b 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -326,7 +326,7 @@ CheckMyDatabase(const char *name, bool am_superuser)
 		/*
 		 * Check that the database is currently allowing connections.
 		 */
-		if (!dbform->datallowconn)
+		if (!dbform->datallowconn && !IgnoreDatAllowConn)
 			ereport(FATAL,
 	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
 	 errmsg("database \"%s\" is not currently accepting connections",
@@ -999,14 +999,6 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	/* set up ACL framework (so CheckMyDatabase can check permissions) */
 	initialize_acl();
 
-	/*
-	 * Re-read the pg_database row for our database, check permissions and set
-	 * up database-specific GUC settings.  We can't do this until all the
-	 * database-access infrastructure is up.  (Also, it wants to know if the
-	 * user is a superuser, so the above stuff has to happen first.)
-	 */
-	if (!bootstrap)
-		CheckMyDatabase(dbname, am_superuser);
 
 	/*
 	 * Now process any command-line switches and any additional GUC variable
@@ -1016,6 +1008,15 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 	if (MyProcPort != NULL)
 		process_startup_options(MyProcPort, am_superuser);
 
+	/*
+	 * Re-read the pg_database row for our database, check permissions and set
+	 * up database-specific GUC settings.  We can't do this until all the
+	 * database-access infrastructure is up.  (Also, it wants to know if the
+	 * user is a superuser, so the above stuff has to happen first.)
+	 */
+	if (!bootstrap)
+		CheckMyDatabase(dbname, am_superuser);
+
 	/* Process pg_db_role_setting options */
 	process_settings(MyDatabaseId, GetSessionUserId());
 
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 039b63bb05..8ac6c7eefe 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -461,6 +461,7 @@ bool		row_security;
 bool		check_function_bodies = true;
 bool		default_with_oids = false;
 bool		session_auth_is_superuser;
+bool		IgnoreDatAllowConn;
 
 int			log_min_error_statement = ERROR;
 int			log_min_messages = WARNING;
@@ -1648,6 +1649,18 @@ static struct config_bool ConfigureNamesBool[] =
 	},
 
 	{
+		{"ignore_connection_restriction", PGC_SU_BACKEND, DEVELOPER_OPTIONS,
+			

Re: Online enabling of checksums

2018-02-22 Thread Peter Eisentraut
On 2/22/18 12:38, Magnus Hagander wrote:
> I'm not entirely sure which the other ones are. Auto-Vacuum obviously
> is one, which doesn't use the worker infrastructure. But I'm not sure
> which others you are referring to?

autovacuum, subscription workers, auto prewarm

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Allow workers to override datallowconn

2018-02-22 Thread Tom Lane
Andres Freund  writes:
> The more important part I think is that we solve it via a GUC that can
> be used outside of bgworkers.

Are you proposing an "ignore_datallowconn" GUC?  That's a remarkably
bad idea.  We don't have infrastructure that would allow it to be set
at an appropriate scope.  I can't imagine any good use-case for allowing
it to be on globally; you'd want it to be per-session (or per-bgworker).
But I don't think there's any way to change the system default setting
in time for the setting to take effect during connection startup or
bgworker startup.

Magnus' most recent patch in this thread seems like a fine answer for
bgworkers.  You've moved the goalposts into the next county, and the
design you're proposing to satisfy that goal is lousy.  It will end
up being a large amount of additional work with no benefit except
being able to use an arguably less ugly (but almost certainly no
shorter) datallowconn override method in pg_upgrade.

regards, tom lane



Re: Allow workers to override datallowconn

2018-02-22 Thread Tom Lane
Magnus Hagander  writes:
> On Thu, Feb 22, 2018 at 8:26 PM, Andres Freund  wrote:
>> What's the argument against?

> Complexity for the bgw usecase.

They'd be completely different implementations and code paths, no?

For pg_upgrade to use such a thing it'd need to be a connection parameter
of some sort (implying, eg, infrastructure in libpq), while for a bgworker
there's no such animal as connection parameters because there's no
connection.

Certainly what pg_upgrade has to do is a bit ugly, but you'd be adding
an awful lot of code to get rid of a small amount of code.  Doesn't
seem like a great tradeoff.  Even if it is a good tradeoff, it seems
entirely unrelated to the bgworker's problem.

regards, tom lane



Re: Online enabling of checksums

2018-02-22 Thread Andres Freund


On February 22, 2018 11:44:17 AM PST, Magnus Hagander  
wrote:
>On Thu, Feb 22, 2018 at 8:41 PM, Andres Freund 
>wrote:
>In this particular case that would at least phase 1 simplify it because
>we'd only need one process instead of worker/launcher. However, if we'd
>ever want to parallellize it -- or any other process of the style, like
>autovacuum -- you'd still need a launcher+worker combo. So making that
>particular scenario simpler might be worthwhile on it's own.

Why is that needed? You can just start two bgworkers and process a list of 
items stored in shared memory. Or even just check, I assume there'd be a 
catalog flag somewhere, whether a database / table / object of granularity has 
already been processed and use locking to prevent concurrent access.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: ERROR: left and right pathkeys do not match in mergejoin

2018-02-22 Thread Tom Lane
I wrote:
> The third possibility is to decide that create_mergejoin_plan is being
> overly paranoid and it's okay to extract merge details from a "redundant"
> path key even though it specifies the opposite sort order from what the
> current merge clause seems to need.  This is scary at first glance, but
> it seems like it should work.

BTW, while working through this, I realized that there's an additional
problem in the same area, which can be demonstrated thus in the
regression database:

explain select * from
  (select * from tenk1 a order by a.unique2) a
  right join
  (select * from tenk1 b order by b.thousand, b.twothousand, b.fivethous) b
  on a.unique2 = b.thousand and a.hundred = b.twothousand and a.unique2 = 
b.fivethous;
ERROR:  outer pathkeys do not match mergeclauses

The problem here is that find_mergeclauses_for_pathkeys has an API
designed on the assumption that the outer and inner pathkeys have
identical relationships to the mergeclauses, ie, if you want to find
the mergeclauses to use given a particular available sort ordering,
it works basically the same way for outer or inner pathkeys.  But per
this discussion, tain't so.  What really happens is that initially
we choose a list of mergeclauses that line up with the outer pathkeys,
and then we identify inner pathkeys that make sense given that list
of mergeclauses; but, in the presence of redundancy, those inner pathkeys
are not necessarily one-for-one with the mergeclauses.  All fine so
far.  But then, generate_mergejoin_paths looks at inner paths that have
available sort keys that are truncations of the original inner pathkey
list, and it uses find_mergeclauses_for_pathkeys to decide which
mergeclauses still make sense for that truncated pathkey list.  That
doesn't work.  In my example, we consider the pre-ordered sub-select
output for B as the outer side, and we choose all three mergeclauses
in the order the example has them, and we arrive at the initial inner
pathkey list of ({a.unique2}, {a.hundred}), dropping the redundant
second appearance of {a.unique2}.  We can successfully make a plan
from that.  But then we consider the truncated inner pathkey list
({a.unique2}), which this example is set up to ensure will win since
it avoids an extra sort step.  As the code stands, the mergeclause list
that's extracted to use with that pair of input paths is
a.unique2 = b.thousand and a.unique2 = b.fivethous
(from the find_mergeclauses_for_pathkeys call with outer_keys = false).
That's just wrong, because it doesn't match the already-determined
outer path order, and create_mergejoin_plan is quite right to whine.

So it seems like find_mergeclauses_for_pathkeys should be renamed to
find_mergeclauses_for_outer_pathkeys, and then we need a second
function for the task of truncating the outer mergeclauses --- not
selecting them from scratch --- given a truncated inner pathkey list.
In this example, we could keep a.unique2 = b.thousand, but we'd have
to drop a.hundred = b.twothousand and then everything after it, since
the inner path doesn't have the sort order needed for that mergeclause.
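
To make the idea concrete, a sketch of what such a helper might look like
follows (the function name, the fields used, and the simple eclass test are
assumptions for illustration only; a real implementation would sit next to
find_mergeclauses_for_pathkeys and handle redundancy more carefully):

#include "postgres.h"
#include "nodes/relation.h"     /* RestrictInfo, PathKey, EquivalenceClass */

/*
 * Keep only the leading mergeclauses (already ordered to match the outer
 * path) whose inner-side equivalence class still appears in the truncated
 * inner pathkey list; stop at the first one we can no longer order by.
 */
static List *
truncate_mergeclauses_for_inner_pathkeys(List *mergeclauses,
                                         List *inner_pathkeys)
{
    List       *result = NIL;
    ListCell   *lc;

    foreach(lc, mergeclauses)
    {
        RestrictInfo *rinfo = (RestrictInfo *) lfirst(lc);
        EquivalenceClass *inner_ec;
        bool        found = false;
        ListCell   *lc2;

        /* the inner side is whichever side is not the outer one */
        inner_ec = rinfo->outer_is_left ? rinfo->right_ec : rinfo->left_ec;

        foreach(lc2, inner_pathkeys)
        {
            PathKey    *pathkey = (PathKey *) lfirst(lc2);

            if (pathkey->pk_eclass == inner_ec)
            {
                found = true;
                break;
            }
        }

        if (!found)
            break;              /* this and all later clauses must be dropped */
        result = lappend(result, rinfo);
    }

    return result;
}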

regards, tom lane



Re: Allow workers to override datallowconn

2018-02-22 Thread Andres Freund
On 2018-02-22 20:30:02 +0100, Magnus Hagander wrote:
> Complexity for the bgw usecase. It now has to construct a key/value pair
> with proper escaping (well, for this one flag it would be easy, but if we
> do that wouldn't we also support the other config params? Were you thinking
> we'd have *just* this one?). Basically taking data that you have in a
> "structured format" in the bgw already, turning it into a string and then
> parsing it back into structured data from that string.

The serialization problem seems out of scope, I don't see why this code
would need to deal with it. Receiving the options would be pretty easy,
we pretty much have the relevant code for PGOPTIONS handling.

The more important part I think is that we solve it via a GUC that can
be used outside of bgworkers. Whether we require setting it via
set_config() for bgworkers or have option parsing doesn't matter that
much.


> I think it'd be cleaner to let the bgw initializer pass those as flags. A
> "user connection" parameter could still use the booelan in InitPostgres()
> of course, and not invent a new things there, but the entry point API could
> stay simpler.

I'm pretty strongly against having a special case flag for bgw
initialization for this.

Greetings,

Andres Freund



Re: Online enabling of checksums

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 8:41 PM, Andres Freund  wrote:

>
> On 2018-02-22 20:30:52 +0100, Magnus Hagander wrote:
> > On Thu, Feb 22, 2018 at 8:24 PM, Andres Freund 
> wrote:
> > > I suspect I'm going to get some grief for this, but I think the time
> has
> > > come to bite the bullet and support changing databases in the same
> > > process...
> > >
> >
> > Hey, I can't even see the goalposts anymore :P
>
> Hah. I vote for making this a hard requirement :P
>

Hah! Are you handing out binoculars? :)



> > Are you saying this should be done *in general*, or specifically for
> > background workers? I'm assuming you mean the general case?
>
> I'd say bgworkers first. It's a lot clearer how to exactly do it
> there. Refactoring the mainloop handling in PostgresMain() would be a
> bigger task.
>
>
Yeah, it'd probably be easier. I don't know exactly what it'd involve but
clearly less.

In this particular case that would, at least for phase 1, simplify it because
we'd only need one process instead of worker/launcher. However, if we'd
ever want to parallelize it -- or any other process of this style, like
autovacuum -- you'd still need a launcher+worker combo. So making that
particular scenario simpler might be worthwhile on its own.




> > That would be very useful, but is probably a fairly non-trivial task
> > (TM).
>
> I'm not actually that sure it is. We have nearly all the code, I
> think. Syscache inval, ProcKill(), and then you're nearly ready to do
> the normal connection dance again.
>

I'll take your word for it :) I haven't dug into that part.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Online enabling of checksums

2018-02-22 Thread Andres Freund
Hi,

On 2018-02-22 20:30:52 +0100, Magnus Hagander wrote:
> On Thu, Feb 22, 2018 at 8:24 PM, Andres Freund  wrote:
> > I suspect I'm going to get some grief for this, but I think the time has
> > come to bite the bullet and support changing databases in the same
> > process...
> >
> 
> Hey, I can't even see the goalposts anymore :P

Hah. I vote for making this a hard requirement :P


> Are you saying this should be done *in general*, or specifically for
> background workers? I'm assuming you mean the general case?

I'd say bgworkers first. It's a lot clearer how to exactly do it
there. Refactoring the mainloop handling in PostgresMain() would be a
bigger task.


> That would be very useful, but is probably a fairly non-trivial task
> (TM).

I'm not actually that sure it is. We have nearly all the code, I
think. Syscache inval, ProcKill(), and then you're nearly ready to do
the normal connection dance again.

Greetings,

Andres Freund



Re: Hash Joins vs. Bloom Filters / take 2

2018-02-22 Thread Claudio Freire
On Thu, Feb 22, 2018 at 12:45 PM, Tomas Vondra
 wrote:
>
>
> On 02/22/2018 12:44 PM, Claudio Freire wrote:
>> Let me reiterate, you can avoid both issues with scalable bloom filters[1].
>>
>
> I'm afraid it's not as straight-forward as "Use scalable bloom filters!"
>
> This is not merely a question of unreliable estimates of number of
> entries. That could have been solved by scalable bloom filters, which
> are essentially a sequence of larger and larger bloom filters, added
> when the smaller bloom filter "fills up" (1/2 full).
>
> The problem is twofold:
>
> (a) we need to decide what false positive rate to use (i.e. what is a
> reasonable trade-off between filter size and how much work it saves)
>
> (b) we also need to consider work_mem (which I assume we all agree we
> must respect)
> So for example we can't size the first bloom filter to just perfectly
> fit into work_mem, only to add larger bloom filters later (each 2x the
> size of the previous one). Not only will that increase the memory usage
> to 7x the initial estimate

Aside from a, (b) is exactly the problem SBFs solve.

You decide how much of work_mem you're willing to devote for BFs, and
then keep creating bigger and bigger BFs until you've allocated that
many.

Then you either keep updating the biggest filter, or give up entirely.

You can use a rather conservative initial bloom filter size for that,
and let scalability expand until you hit the limit.
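
A tiny sketch of that growth policy, purely for illustration (nothing below
is from the patch): reserve a share of work_mem for the filters, double the
size each time a new filter is added, and once the budget would be exceeded
stop growing and keep updating the largest filter already built.

#include <stdbool.h>
#include <stddef.h>

typedef struct ScalableBloomState
{
    size_t      budget_bytes;   /* share of work_mem reserved for the BFs */
    size_t      used_bytes;     /* total size of filters allocated so far */
    size_t      next_size;      /* size of the next (larger) filter */
    bool        frozen;         /* true once we refuse to grow any further */
} ScalableBloomState;

/* Returns true if a new, larger filter may still be added within the budget. */
static bool
sbf_try_grow(ScalableBloomState *state)
{
    if (state->frozen ||
        state->used_bytes + state->next_size > state->budget_bytes)
    {
        /* out of budget: keep updating the current largest filter */
        state->frozen = true;
        return false;
    }

    state->used_bytes += state->next_size;
    state->next_size *= 2;      /* each filter twice the size of the last */
    return true;
}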

> but it will also make the bloom filter less
> efficient (having to probe larger and larger filters, likely not fitting
> into CPU cache).

Again, you factor that into account when choosing the limit.

>> An HLL can be used to estimate set size, the paper makes no mention of
>> it, probably assuming only distinct items are added to the set.
>>
>
> The problem with HLL is, it's only an estimate of how many entries you
> saw so far. It only tells you that after observing the items, and it
> only tells you how many items you saw so far. What we need for sizing a
> bloom filter is an estimate of number of distinct values in advance.
>
> In other words, HLL is entirely useless for sizing Bloom Filters.

Normal BFs, yes. But that's exactly what you need for scalable BFs.
You need an estimate of the number of distinct entries you've added to
your current filter, not the total set size.

> Furthermore, we could estimate number of observed distinct values from
> the number of 1s in the bloom filter

That's kinda slow to do per-item. I tried to "count" distinct items by
checking the BF before adding (don't add redundantly), but that's less
precise than a HLL in my experience.
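
For reference, the estimate mentioned above can be computed from the bit
count with the standard formula below (a sketch, not code from the patch);
the per-item cost being discussed is in keeping the bit count up to date,
not in evaluating the formula itself.

#include <math.h>
#include <stdint.h>

/*
 * Estimate the number of distinct values inserted into a Bloom filter with
 * m bits and k hash functions, given that t bits are currently set:
 *     n ~= -(m / k) * ln(1 - t / m)
 */
static double
bloom_estimate_ndistinct(uint64_t m, int k, uint64_t bits_set)
{
    if (bits_set >= m)
        return INFINITY;        /* filter saturated; estimate is unbounded */

    return -((double) m / k) * log(1.0 - (double) bits_set / (double) m);
}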



Re: Online enabling of checksums

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 8:24 PM, Andres Freund  wrote:

> On 2018-02-22 08:22:48 -0500, Peter Eisentraut wrote:
> > On 2/21/18 15:53, Magnus Hagander wrote:
> > > *Two new functions are added, pg_enable_data_checksums() and
> > > pg_disable_data_checksums(). The disable one is easy -- it just changes
> > > to disable. The enable one will change the state to inprogress, and
> then
> > > start a background worker (the “checksumhelper launcher”). This worker
> > > in turn will start one sub-worker (“checksumhelper worker”) in each
> > > database (currently all done sequentially).*
> >
> > This is at least the fourth version of the pattern launcher plus worker
> > background workers.  I wonder whether we can do something to make this
> > easier and less repetitive.  Not in this patch, of course.
>
> I suspect I'm going to get some grief for this, but I think the time has
> come to bite the bullet and support changing databases in the same
> process...
>

Hey, I can't even see the goalposts anymore :P

Are you saying this should be done *in general*, or specifically for
background workers? I'm assuming you mean the general case? That would be
very useful, but is probably a fairly non-trivial task (TM).


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Allow workers to override datallowconn

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 8:26 PM, Andres Freund  wrote:

> Hi,
>
> On 2018-02-22 20:24:03 +0100, Magnus Hagander wrote:
> > In a background worker you can just set the parameter using
> > SetConfigOption(), no? That seems a lot easier than turning things into a
> > kv pair and back...
>
> Sure, but, it doesn't seem bad to offer the option to only allow this
> for code running as superuser.



> > I can see the point for having such a parameter for pg_upgrade, but I'm
> not
> > sure we'd necessarily want to overload them.
>
> What's the argument against?
>

Complexity for the bgw usecase. It now has to construct a key/value pair
with proper escaping (well, for this one flag it would be easy, but if we
do that wouldn't we also support the other config params? Were you thinking
we'd have *just* this one?). Basically taking data that you have in a
"structured format" in the bgw already, turning it into a string and then
parsing it back into structured data from that string.

I think it'd be cleaner to let the bgw initializer pass those as flags. A
"user connection" parameter could still use the booelan in InitPostgres()
of course, and not invent a new things there, but the entry point API could
stay simpler.

I haven't actually looked into what it would look like, so it could be that
I'm overestimating what it'd mean of course.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Allow workers to override datallowconn

2018-02-22 Thread Andres Freund
Hi,

On 2018-02-22 20:24:03 +0100, Magnus Hagander wrote:
> In a background worker you can just set the parameter using
> SetConfigOption(), no? That seems a lot easier than turning things into a
> kv pair and back...

Sure, but, it doesn't seem bad to offer the option to only allow this
for code running as superuser.


> I can see the point for having such a parameter for pg_upgrade, but I'm not
> sure we'd necessarily want to overload them.

What's the argument against?

Greetings,

Andres Freund



Re: Online enabling of checksums

2018-02-22 Thread Andres Freund
On 2018-02-22 08:22:48 -0500, Peter Eisentraut wrote:
> On 2/21/18 15:53, Magnus Hagander wrote:
> > *Two new functions are added, pg_enable_data_checksums() and
> > pg_disable_data_checksums(). The disable one is easy -- it just changes
> > to disable. The enable one will change the state to inprogress, and then
> > start a background worker (the “checksumhelper launcher”). This worker
> > in turn will start one sub-worker (“checksumhelper worker”) in each
> > database (currently all done sequentially).*
> 
> This is at least the fourth version of the pattern launcher plus worker
> background workers.  I wonder whether we can do something to make this
> easier and less repetitive.  Not in this patch, of course.

I suspect I'm going to get some grief for this, but I think the time has
come to bite the bullet and support changing databases in the same
process...

Greetings,

Andres Freund



Re: Allow workers to override datallowconn

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 8:17 PM, Andres Freund  wrote:

> Hi,
>
> On 2018-02-22 19:01:35 +0100, Magnus Hagander wrote:
> > In working on the checksumhelper patch, we came across wanting a
> background
> > worker to be allowed to bypass datallowconn for a database. Right now we
> > didn't take care of that, and just said "you have to ALTER TABLE" first.
>
> I suspect you mean ALTER DATABASE, rather than table? ;)
>

Um. Yes :)



> I wonder if we don't want this to be a slightly more generic
> facility. E.g. pg_upgrade has the same need. Perhaps we should add a
> superuser-only, connection-establishment-only option that allows
> connecting to a database even if datallowconn = false, and then
> additionally allow passing connection arguments to
> BackgroundWorkerInitializeConnection()? It seems fairly reasonable to
> want to establish different GUCs via normal GUC-y mechanisms when
> establishing a bgworker connection...
>

In a background worker you can just set the parameter using
SetConfigOption(), no? That seems a lot easier than turning things into a
kv pair and back...

I can see the point for having such a parameter for pg_upgrade, but I'm not
sure we'd necessarily want to overload them.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Allow workers to override datallowconn

2018-02-22 Thread Simon Riggs
On 22 February 2018 at 18:24, Tom Lane  wrote:

>> Are there any other caveats in doing that this actually makes it dangerous
>> to just allow bypassing it for extensions?
>
> Don't think so; we autovacuum such DBs anyway don't we?

Yeh, there is already precedent that should mean it is easy/default
for background workers to ignore datallowconn.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Allow workers to override datallowconn

2018-02-22 Thread Andres Freund
Hi,

On 2018-02-22 19:01:35 +0100, Magnus Hagander wrote:
> In working on the checksumhelper patch, we came across wanting a background
> worker to be allowed to bypass datallowconn for a database. Right now we
> didn't take care of that, and just said "you have to ALTER TABLE" first.

I suspect you mean ALTER DATABASE, rather than table? ;)

I wonder if we don't want this to be a slightly more generic
facility. E.g. pg_upgrade has the same need. Perhaps we should add a
superuser-only, connection-establishment-only option that allows
connecting to a database even if datallowconn = false, and then
additionally allow passing connection arguments to
BackgroundWorkerInitializeConnection()? It seems fairly reasonable to
want to establish different GUCs via normal GUC-y mechanisms when
establishing a bgworker connection...

Regards,

Andres



Re: Allow workers to override datallowconn

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 7:24 PM, Tom Lane  wrote:

> Magnus Hagander  writes:
> > Attached is a patch that adds new Override versions of the functions to
> > connect to a database from a background worker.
>
> > Another option would be to just add the parameter directly to the regular
> > connection function, and not create separate functions. But that would
> make
> > it an incompatible change. And since background workers are commonly used
> > in extensions, that would break a lot of extensions out there. I figured
> > it's probably not worth doing that, and thus added the new functions.
> What
> > do others think about that?
>
> Meh.  We change exported APIs in new major versions all the time.  As
> long as it's just a question of an added parameter, people can deal
> with it.  You could take the opportunity to future-proof a little by
> making this option be the first bit in a flags parameter, so that at
> least future boolean option additions don't require another API break
> or a whole new set of redundant functions.
>

Fair enough. In that case, something like the attached?


> Are there any other caveats in doing that this actually makes it dangerous
> > to just allow bypassing it for extensions?
>
> Don't think so; we autovacuum such DBs anyway don't we?
>

Yes. We set the freeze ages to 0 but we do run on it.


-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/contrib/pg_prewarm/autoprewarm.c b/contrib/pg_prewarm/autoprewarm.c
index f99f9c07af..bb28e237d1 100644
--- a/contrib/pg_prewarm/autoprewarm.c
+++ b/contrib/pg_prewarm/autoprewarm.c
@@ -445,7 +445,7 @@ autoprewarm_database_main(Datum main_arg)
 		ereport(ERROR,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
  errmsg("could not map dynamic shared memory segment")));
-	BackgroundWorkerInitializeConnectionByOid(apw_state->database, InvalidOid);
+	BackgroundWorkerInitializeConnectionByOid(apw_state->database, InvalidOid, 0);
 	block_info = (BlockInfoRecord *) dsm_segment_address(seg);
 	pos = apw_state->prewarm_start_idx;
 
diff --git a/src/backend/access/transam/parallel.c b/src/backend/access/transam/parallel.c
index 9d4efc0f8f..1d631b7275 100644
--- a/src/backend/access/transam/parallel.c
+++ b/src/backend/access/transam/parallel.c
@@ -1324,7 +1324,8 @@ ParallelWorkerMain(Datum main_arg)
 
 	/* Restore database connection. */
 	BackgroundWorkerInitializeConnectionByOid(fps->database_id,
-			  fps->authenticated_user_id);
+			  fps->authenticated_user_id,
+			  0);
 
 	/*
 	 * Set the client encoding to the database encoding, since that is what
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 28ff2f0979..1430894ad2 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -498,7 +498,7 @@ BootstrapModeMain(void)
 	 */
 	InitProcess();
 
-	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL);
+	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL, false);
 
 	/* Initialize stuff for bootstrap-file processing */
 	for (i = 0; i < MAXATTR; i++)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 702f8d8188..a52178f7d3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -477,7 +477,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	InitProcess();
 #endif
 
-	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL);
+	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL, false);
 
 	SetProcessingMode(NormalProcessing);
 
@@ -1679,7 +1679,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 		 * Note: if we have selected a just-deleted database (due to using
 		 * stale stats info), we'll fail and exit here.
 		 */
-		InitPostgres(NULL, dbid, NULL, InvalidOid, dbname);
+		InitPostgres(NULL, dbid, NULL, InvalidOid, dbname, false);
 		SetProcessingMode(NormalProcessing);
 		set_ps_display(dbname, false);
 		ereport(DEBUG1,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index f3ddf828bb..6f1b621f70 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -5585,7 +5585,7 @@ MaxLivePostmasterChildren(void)
  * Connect background worker to a database.
  */
 void
-BackgroundWorkerInitializeConnection(const char *dbname, const char *username)
+BackgroundWorkerInitializeConnection(const char *dbname, const char *username, uint32 flags)
 {
 	BackgroundWorker *worker = MyBgworkerEntry;
 
@@ -5595,7 +5595,7 @@ BackgroundWorkerInitializeConnection(const char *dbname, const char *username)
 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
  errmsg("database connection requirement not indicated during registration")));
 
-	InitPostgres(dbname, InvalidOid, username, InvalidOid, NULL);
+	InitPostgres(dbname, InvalidOid, username, InvalidOid, NULL, 

Re: [HACKERS] PoC: full merge join on comparison clause

2018-02-22 Thread Alexander Kuzmenkov

Here are some updates on this patch.

I split it into two parts. The preparatory part contains some mechanical 
changes to prepare for the main part. Most importantly, a new field is 
added, `RestrictInfo.is_mj_equality`. It is a marker of mergejoinable 
equality clauses, and `RestrictInfo.mergeopfamilies` is a more general 
marker of clauses that are mergejoinable but not necessarily equality. 
The usages are changed accordingly.


The main part consists of executor and planner changes required to 
support inequality merge joins.


The executor changes are as described in the original post.

The planner part has changed significantly since the last version. It 
used to apply some shady hacks to ensure we have the required sort 
orders of inner and outer paths. Now I think I found a reasonable way to 
generate the pathkeys we need. When we sort outer relation in 
`sort_inner_and_outer()`, the pathkeys are generated by 
`select_outer_pathkeys_for_merge()`. When we use the pathkeys we already 
have for the outer relation in `match_unsorted_outer()`, mergeclauses 
are selected by `find_mergeclauses_for_pathkeys()`. I changed these 
functions to select the right pathkey direction for merge clauses, and 
also ensure that we only have a single inequality merge clause and it is 
the last one. Also, to use the index paths, I changed 
`pathkeys_useful_for_merging()` to keep both pathkey directions for 
inequality merge clauses.


Some basic joins work, but I couldn't properly test all the corner cases 
with different orderings, because they depend on a bug in vanilla merge 
joins [1].


To sum up, the preparatory and executor changes are stable, and the 
planner part is WIP.


1. 
https://www.postgresql.org/message-id/flat/5dad9160-4632-0e47-e120-8e2082000...@postgrespro.ru


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/executor/nodeMergejoin.c b/src/backend/executor/nodeMergejoin.c
index f50205ec8a..861327b928 100644
--- a/src/backend/executor/nodeMergejoin.c
+++ b/src/backend/executor/nodeMergejoin.c
@@ -166,8 +166,8 @@ typedef enum
  * In addition to the expressions themselves, the planner passes the btree
  * opfamily OID, collation OID, btree strategy number (BTLessStrategyNumber or
  * BTGreaterStrategyNumber), and nulls-first flag that identify the intended
- * sort ordering for each merge key.  The mergejoinable operator is an
- * equality operator in the opfamily, and the two inputs are guaranteed to be
+ * sort ordering for each merge key.  The mergejoinable operator is a
+ * comparison operator in the opfamily, and the two inputs are guaranteed to be
  * ordered in either increasing or decreasing (respectively) order according
  * to the opfamily and collation, with nulls at the indicated end of the range.
  * This allows us to obtain the needed comparison function from the opfamily.
@@ -200,6 +200,9 @@ MJExamineQuals(List *mergeclauses,
 		Oid			op_righttype;
 		Oid			sortfunc;
 
+		if (parent->mj_Ineq_Present)
+			elog(ERROR, "inequality mergejoin clause must be the last one");
+
 		if (!IsA(qual, OpExpr))
 			elog(ERROR, "mergejoin clause is not an OpExpr");
 
@@ -225,9 +228,40 @@ MJExamineQuals(List *mergeclauses,
 								   &join_strategy,
 								   &op_lefttype,
 								   &op_righttype);
-		if (join_strategy != BTEqualStrategyNumber)	/* should not happen */
-			elog(ERROR, "cannot merge using non-equality operator %u",
- qual->opno);
+
+		/*
+		 * Determine whether we accept lesser and/or equal tuples of the inner
+		 * relation.
+		 */
+		if (join_strategy != BTEqualStrategyNumber)
+		{
+			parent->mj_Ineq_Present = true;
+			switch (join_strategy)
+			{
+case BTLessEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTLessStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTGreaterStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+case BTGreaterEqualStrategyNumber:
+	parent->mj_Ineq_JoinEqual = true;
+	/* fall through */
+case BTGreaterStrategyNumber:
+	parent->mj_Ineq_JoinLesser = true;
+	if (sort_strategy != BTLessStrategyNumber)
+		elog(ERROR, "join strategy %d is not compatible with sort strategy %d",
+			 join_strategy, sort_strategy);
+	break;
+
+default:
+	elog(ERROR, "unsupported join strategy %d", join_strategy);
+			}
+		}
 
 		/*
 		 * sortsupport routine must know if abbreviation optimization is
@@ -415,6 +449,19 @@ MJCompare(MergeJoinState *mergestate)
 	{
 		MergeJoinClause clause = &mergestate->mj_Clauses[i];
 		int			sort_result;
+		bool join_equal = true;
+		bool join_lesser = false;
+
+		if (mergestate->mj_Ineq_Present && i == mergestate->mj_NumClauses - 1)
+		{
+			/*
+			 * If the last merge clause is an inequality, check whether
+			 * we have to join the inner tuples that are less 

Re: SHA-2 functions

2018-02-22 Thread Peter Eisentraut
On 2/22/18 01:05, Michael Paquier wrote:
> On Wed, Feb 21, 2018 at 03:45:17PM -0500, Peter Eisentraut wrote:
>> On 2/20/18 23:04, Michael Paquier wrote:
>>> I think that crypto_hash.c or hash_crypt.c would be adapted as well.
>>> crypt.c is too much generic, so including both concepts in the name is
>>> the way to go.  The name given by Tom here sounds actually nice.
>>
>> Updated patches
> 
> I have been reviewing both patches, and those look good to me.

Committed, thanks

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Allow workers to override datallowconn

2018-02-22 Thread Tom Lane
Magnus Hagander  writes:
> Attached is a patch that adds new Override versions of the functions to
> connect to a database from a background worker.

> Another option would be to just add the parameter directly to the regular
> connection function, and not create separate functions. But that would make
> it an incompatible change. And since background workers are commonly used
> in extensions, that would break a lot of extensions out there. I figured
> it's probably not worth doing that, and thus added the new functions. What
> do others think about that?

Meh.  We change exported APIs in new major versions all the time.  As
long as it's just a question of an added parameter, people can deal
with it.  You could take the opportunity to future-proof a little by
making this option be the first bit in a flags parameter, so that at
least future boolean option additions don't require another API break
or a whole new set of redundant functions.
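
For instance, something along these lines (a sketch only; the flag name and
bit value are placeholders, not from any committed API, and the types come
from postgres.h):

#include "postgres.h"

/* hypothetical flag bit; the real name/value would be whatever gets committed */
#define BGWORKER_BYPASS_ALLOWCONN   0x0001

extern void BackgroundWorkerInitializeConnection(const char *dbname,
                                                 const char *username,
                                                 uint32 flags);

static void
my_worker_connect(void)
{
    /* connect even if the target database has datallowconn = false */
    BackgroundWorkerInitializeConnection("template0", NULL,
                                         BGWORKER_BYPASS_ALLOWCONN);
}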

> Are there any other caveats in doing that this actually makes it dangerous
> to just allow bypassing it for extensions?

Don't think so; we autovacuum such DBs anyway don't we?

regards, tom lane



Allow workers to override datallowconn

2018-02-22 Thread Magnus Hagander
In working on the checksumhelper patch, we came across wanting a background
worker to be allowed to bypass datallowconn for a database. Right now we
didn't take care of that, and just said "you have to ALTER TABLE" first.

Specifically for this usecase that is OK, but not particularly user
friendly. And I think this is a capability that can also be useful for other
things.

Attached is a patch that adds new Override versions of the functions to
connect to a database from a background worker.

Another option would be to just add the parameter directly to the regular
connection function, and not create separate functions. But that would make
it an incompatible change. And since background workers are commonly used
in extensions, that would break a lot of extensions out there. I figured
it's probably not worth doing that, and thus added the new functions. What
do others think about that?

Are there any other caveats in doing that which actually make it dangerous
to just allow bypassing it for extensions?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 
diff --git a/src/backend/bootstrap/bootstrap.c b/src/backend/bootstrap/bootstrap.c
index 28ff2f0979..1430894ad2 100644
--- a/src/backend/bootstrap/bootstrap.c
+++ b/src/backend/bootstrap/bootstrap.c
@@ -498,7 +498,7 @@ BootstrapModeMain(void)
 	 */
 	InitProcess();
 
-	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL);
+	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL, false);
 
 	/* Initialize stuff for bootstrap-file processing */
 	for (i = 0; i < MAXATTR; i++)
diff --git a/src/backend/postmaster/autovacuum.c b/src/backend/postmaster/autovacuum.c
index 702f8d8188..a52178f7d3 100644
--- a/src/backend/postmaster/autovacuum.c
+++ b/src/backend/postmaster/autovacuum.c
@@ -477,7 +477,7 @@ AutoVacLauncherMain(int argc, char *argv[])
 	InitProcess();
 #endif
 
-	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL);
+	InitPostgres(NULL, InvalidOid, NULL, InvalidOid, NULL, false);
 
 	SetProcessingMode(NormalProcessing);
 
@@ -1679,7 +1679,7 @@ AutoVacWorkerMain(int argc, char *argv[])
 		 * Note: if we have selected a just-deleted database (due to using
 		 * stale stats info), we'll fail and exit here.
 		 */
-		InitPostgres(NULL, dbid, NULL, InvalidOid, dbname);
+		InitPostgres(NULL, dbid, NULL, InvalidOid, dbname, false);
 		SetProcessingMode(NormalProcessing);
 		set_ps_display(dbname, false);
 		ereport(DEBUG1,
diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c
index f3ddf828bb..1ded418960 100644
--- a/src/backend/postmaster/postmaster.c
+++ b/src/backend/postmaster/postmaster.c
@@ -434,6 +434,9 @@ static void StartAutovacuumWorker(void);
 static void MaybeStartWalReceiver(void);
 static void InitPostmasterDeathWatchHandle(void);
 
+static void BackgroundWorkerInitializeConnectionByOidInternal(Oid dboid, Oid useroid, bool override_allow_connections);
+static void BackgroundWorkerInitializeConnectionInternal(const char *dbname, const char *username, bool override_allow_connections);
+
 /*
  * Archiver is allowed to start up at the current postmaster state?
  *
@@ -5587,6 +5590,18 @@ MaxLivePostmasterChildren(void)
 void
 BackgroundWorkerInitializeConnection(const char *dbname, const char *username)
 {
+	BackgroundWorkerInitializeConnectionInternal(dbname, username, false);
+}
+
+void
+BackgroundWorkerInitializeConnectionOverride(const char *dbname, const char *username)
+{
+	BackgroundWorkerInitializeConnectionInternal(dbname, username, true);
+}
+
+static void
+BackgroundWorkerInitializeConnectionInternal(const char *dbname, const char *username, bool override_allow_connections)
+{
 	BackgroundWorker *worker = MyBgworkerEntry;
 
 	/* XXX is this the right errcode? */
@@ -5595,7 +5610,7 @@ BackgroundWorkerInitializeConnection(const char *dbname, const char *username)
 (errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
  errmsg("database connection requirement not indicated during registration")));
 
-	InitPostgres(dbname, InvalidOid, username, InvalidOid, NULL);
+	InitPostgres(dbname, InvalidOid, username, InvalidOid, NULL, override_allow_connections);
 
 	/* it had better not gotten out of "init" mode yet */
 	if (!IsInitProcessingMode())
@@ -5610,6 +5625,18 @@ BackgroundWorkerInitializeConnection(const char *dbname, const char *username)
 void
 BackgroundWorkerInitializeConnectionByOid(Oid dboid, Oid useroid)
 {
+	BackgroundWorkerInitializeConnectionByOidInternal(dboid, useroid, false);
+}
+
+void
+BackgroundWorkerInitializeConnectionByOidOverride(Oid dboid, Oid useroid)
+{
+	BackgroundWorkerInitializeConnectionByOidInternal(dboid, useroid, true);
+}
+
+void
+BackgroundWorkerInitializeConnectionByOidInternal(Oid dboid, Oid useroid, bool override_allow_connections)
+{
 	BackgroundWorker *worker = MyBgworkerEntry;
 
 	/* XXX is this the right errcode? */
@@ -5618,7 +5645,7 @@ BackgroundWorkerInitializeConnectionByOid(Oid 

Re: ERROR: left and right pathkeys do not match in mergejoin

2018-02-22 Thread Tom Lane
Alexander Kuzmenkov  writes:
> explain select * from j1_tbl full join (select * from j2_tbl order by 
> j2_tbl.i desc, j2_tbl.k) j2_tbl on j1_tbl.i = j2_tbl.i and j1_tbl.i = 
> j2_tbl.k;
> ERROR:  left and right pathkeys do not match in mergejoin

Nice example.  There are several places that we could consider trying
to fix this:

The first possibility is to teach find_mergeclauses_for_pathkeys that
it should not select both of the join clauses in a case like this,
perhaps based on noting that the associated pathkeys have the same
eclass but different sort orders.  However, that seems like a loser
for the reason specified in the comment in that function: we need to
select a maximal list of mergeclauses, not a minimal list, because
if it's a full join and we aren't able to include all the clauses as
mergeclauses then we won't have a valid plan.  (At the time this code
was written, that could have resulted in failure to produce a plan
at all.  Now we could fall back to a hash full join ... but that might
be inefficient, or we might not have hashable operators.)

The next possibility is to fix it in make_inner_pathkeys_for_merge,
by having it not suppress lower-order pathkeys unless they match earlier
ones in all details (not just eclass).  In an example like this, that
means emitting a redundant inner pathkey list, equivalent to "ORDER BY
j1_tbl.i DESC, j1_tbl.i ASC".  That's kind of annoying.  It seems to
work in a simple test, but it implies doing useless comparisons during
the sort (since we'd only reach comparing the second sort column if
the first sort column compares equal, whereupon the second must too).
And it means representing the inner sort order by a noncanonical
pathkey list, which is not nice.  For example, even if we have an inner
path available that delivers rows ordered by "j1_tbl.i DESC", we'd
still think we have to physically sort it to add the extra sort key.
(And no, I don't want to change pathkey comparison rules to avoid that.)

The third possibility is to decide that create_mergejoin_plan is being
overly paranoid and it's okay to extract merge details from a "redundant"
path key even though it specifies the opposite sort order from what the
current merge clause seems to need.  This is scary at first glance, but
it seems like it should work.  We'd essentially be lying to the executor
about the sort ordering of the lower-order merge column, and trusting that
it won't take any wrong actions as a result because it should never reach
comparison of that lower-order column except when the higher column(s)
are equal.  I can't see a reason for that to go wrong; it's basically a
restatement of the argument why the lower-order sort key can be considered
redundant in the first place.  But it doesn't leave a warm feeling in the
pit of the stomach, especially for a bug fix that needs to be back-patched
a long way.

On balance, though, this last choice seems like the thing to do.  There
are clear downsides to the first two, in terms of failing to construct a
plan or constructing a needlessly inefficient plan.

regards, tom lane



Re: Translations contributions urgently needed

2018-02-22 Thread Ioseph Kim
Hi, I have been translating into Korean for over 10 years.
I think this issue should be left to the local community.
A translation may be omitted in some versions and added in others,
because it depends on volunteering.
Let It Be!

Regards Ioseph.


On Thu, Feb 22, 2018 at 05:29:11PM +, Thom Brown wrote:
> On 22 February 2018 at 17:24, Magnus Hagander  wrote:
> > On Thu, Feb 22, 2018 at 6:20 PM, Thom Brown  wrote:
> >>
> >> Hi,
> >>
> >> I have found that Japanese language support for the database server
> >> has been dropped for 10.  This is because it fell below the 80% of
> >> strings translated requirement, so it was shipped without Japanese.
> >> This isn't true of all components, but it seems quite alarming that
> >> we've pushed out PostgreSQL 10 with no language support for a country
> >> that has contributed a significant amount to the project, and has a
> >> relatively large number of users.
> >>
> >> The database server also dropped support for Indonesian and Portugese
> >> (Brazil).  In fact, just between 9.6 and 10, the following language
> >> support was dropped for these components:
> >>
> >>  cs   | plpython
> >>  de   | pg_resetxlog
> >>  es   | pg_resetxlog
> >>  fr   | pg_resetxlog
> >>  id   | postgres
> >>  it   | pg_resetxlog
> >>  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
> >>  ko   | pg_resetxlog
> >>  pl   | pg_resetxlog
> >>  pt_BR| pg_basebackup,pg_resetxlog,pltcl,postgres
> >>  ru   | pg_resetxlog
> >>  sv   | pg_resetxlog
> >>  tr   | plperl
> >>  zh_CN| pg_basebackup,pg_resetxlog,pltcl
> >>  zh_TW| plperl
> >
> >
> > Arent all those pg_resetxlog entries because it was renamed to pg_resetwal?
> > Is that because they were actually not updated for the new name, or is it a
> > reporting side effect?
> 
> Oh yes, okay, that's true.  I guess we can ignore those, so it's a
> little less dire for 9.6 to 10 then.  We get this instead, which is
> still not good:
> 
>  cs   | plpython
>  id   | postgres
>  it   | pg_resetxlog
>  ja   | pg_basebackup,plpython,pltcl,postgres,psql
>  pt_BR| pg_basebackup,pltcl,postgres
>  tr   | plperl
>  zh_CN| pg_basebackup,pltcl
>  zh_TW| plperl
> 
> Thom
> 



Re: Online enabling of checksums

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 4:47 PM, Andrey Borodin 
wrote:

> Hello, Magnus, Peter!
>
> I'm excited that this feature emerged, thanks for the patch. Hope it will
> help to fix some mistakes made during initdb long time ago...
>
> On 22 Feb 2018, at 18:22, Peter Eisentraut wrote:
>
> On 2/21/18 15:53, Magnus Hagander wrote:
>
> *Two new functions are added, pg_enable_data_checksums() and
> pg_disable_data_checksums(). The disable one is easy -- it just changes
> to disable. The enable one will change the state to inprogress, and then
> start a background worker (the “checksumhelper launcher”). This worker
> in turn will start one sub-worker (“checksumhelper worker”) in each
> database (currently all done sequentially).*
>
>
> This is at least the fourth version of the pattern launcher plus worker
> background workers.  I wonder whether we can do something to make this
> easier and less repetitive.  Not in this patch, of course.
>
>
> Peter, can I ask for some pointers in searching for previous versions?
> I want to review this patch, and some code comparison could be
> handy
>
> So far I've found only this [0,1] (without code) and threads mentioned by
> Magnus [2,3]
>
> Or do you mean extracting "worker+launcher" for reuse for other purposes?
>
>
I'm pretty sure Peter means the second. Which could be interesting, but as
he says, not the topic for this patch.

I'm not entirely sure which the other ones are. Autovacuum obviously is
one, though it doesn't use the worker infrastructure. But I'm not sure which
ones the others refer to.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Translations contributions urgently needed

2018-02-22 Thread Thom Brown
On 22 February 2018 at 17:24, Magnus Hagander  wrote:
> On Thu, Feb 22, 2018 at 6:20 PM, Thom Brown  wrote:
>>
>> Hi,
>>
>> I have found that Japanese language support for the database server
>> has been dropped for 10.  This is because it fell below the 80% of
>> strings translated requirement, so it was shipped without Japanese.
>> This isn't true of all components, but it seems quite alarming that
>> we've pushed out PostgreSQL 10 with no language support for a country
>> that has contributed a significant amount to the project, and has a
>> relatively large number of users.
>>
>> The database server also dropped support for Indonesian and Portugese
>> (Brazil).  In fact, just between 9.6 and 10, the following language
>> support was dropped for these components:
>>
>>  cs   | plpython
>>  de   | pg_resetxlog
>>  es   | pg_resetxlog
>>  fr   | pg_resetxlog
>>  id   | postgres
>>  it   | pg_resetxlog
>>  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
>>  ko   | pg_resetxlog
>>  pl   | pg_resetxlog
>>  pt_BR| pg_basebackup,pg_resetxlog,pltcl,postgres
>>  ru   | pg_resetxlog
>>  sv   | pg_resetxlog
>>  tr   | plperl
>>  zh_CN| pg_basebackup,pg_resetxlog,pltcl
>>  zh_TW| plperl
>
>
> Arent all those pg_resetxlog entries because it was renamed to pg_resetwal?
> Is that because they were actually not updated for the new name, or is it a
> reporting side effect?

Oh yes, okay, that's true.  I guess we can ignore those, so it's a
little less dire for 9.6 to 10 then.  We get this instead, which is
still not good:

 cs   | plpython
 id   | postgres
 it   | pg_resetxlog
 ja   | pg_basebackup,plpython,pltcl,postgres,psql
 pt_BR| pg_basebackup,pltcl,postgres
 tr   | plperl
 zh_CN| pg_basebackup,pltcl
 zh_TW| plperl

Thom



Re: [HACKERS] [PATCH] Move all am-related reloption code into src/backend/access/[am-name] and get rid of relopt_kind for custom AM

2018-02-22 Thread Nikolay Shaplov
In a message of 3 September 2017 11:45:43, Alvaro Herrera wrote:
> I think we should split this in at least two commits,
I've added a third part of this patch to commitfest:
https://www.postgresql.org/message-id/flat/2083183.Rn7qOxG4Ov@x200m#2083183.Rn7qOxG4Ov@x200m

To finally commit the rest of the patch (the main part of it at least) I need
this and the enum-options patches to be committed. Hope it will be done in March
and I will offer the last part in the next commitfest.

-- 
Do code for fun.



Re: Translations contributions urgently needed

2018-02-22 Thread Magnus Hagander
On Thu, Feb 22, 2018 at 6:20 PM, Thom Brown  wrote:

> Hi,
>
> I have found that Japanese language support for the database server
> has been dropped for 10.  This is because it fell below the 80% of
> strings translated requirement, so it was shipped without Japanese.
> This isn't true of all components, but it seems quite alarming that
> we've pushed out PostgreSQL 10 with no language support for a country
> that has contributed a significant amount to the project, and has a
> relatively large number of users.
>
> The database server also dropped support for Indonesian and Portugese
> (Brazil).  In fact, just between 9.6 and 10, the following language
> support was dropped for these components:
>
>  cs   | plpython
>  de   | pg_resetxlog
>  es   | pg_resetxlog
>  fr   | pg_resetxlog
>  id   | postgres
>  it   | pg_resetxlog
>  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
>  ko   | pg_resetxlog
>  pl   | pg_resetxlog
>  pt_BR| pg_basebackup,pg_resetxlog,pltcl,postgres
>  ru   | pg_resetxlog
>  sv   | pg_resetxlog
>  tr   | plperl
>  zh_CN| pg_basebackup,pg_resetxlog,pltcl
>  zh_TW| plperl
>

Aren't all those pg_resetxlog entries because it was renamed to pg_resetwal?
Is that because they were actually not updated for the new name, or is it a
reporting side effect?

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Translations contributions urgently needed

2018-02-22 Thread Pavel Stehule
2018-02-22 18:20 GMT+01:00 Thom Brown :

> Hi,
>
> I have found that Japanese language support for the database server
> has been dropped for 10.  This is because it fell below the 80% of
> strings translated requirement, so it was shipped without Japanese.
> This isn't true of all components, but it seems quite alarming that
> we've pushed out PostgreSQL 10 with no language support for a country
> that has contributed a significant amount to the project, and has a
> relatively large number of users.
>
> The database server also dropped support for Indonesian and Portugese
> (Brazil).  In fact, just between 9.6 and 10, the following language
> support was dropped for these components:
>
>  cs   | plpython
>  de   | pg_resetxlog
>  es   | pg_resetxlog
>  fr   | pg_resetxlog
>  id   | postgres
>  it   | pg_resetxlog
>  ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
>  ko   | pg_resetxlog
>  pl   | pg_resetxlog
>  pt_BR| pg_basebackup,pg_resetxlog,pltcl,postgres
>  ru   | pg_resetxlog
>  sv   | pg_resetxlog
>  tr   | plperl
>  zh_CN| pg_basebackup,pg_resetxlog,pltcl
>  zh_TW| plperl
>
> This is a huge amount of compared to what happened between 9.5 and 9.6:
>
>  cs   | psql
>
> between 9.4 and 9.5:
>
>  cs   | pltcl
>  ro   | pltcl
>  tr   | libpq,pltcl
>  zh_TW| pltcl,psql
>
> and between 9.3 and 9.4:
>
>  cs   | pg_basebackup,pg_resetxlog
>  ja   | pg_basebackup,pg_resetxlog
>  zh_TW| pg_ctl,postgres
>
>
> There are many translations that are on the verge of falling under
> 80%.  Japanese alone has 6 components between 80-83%, but other
> languages are in a similar position.
>
> I feel this is something that we, as a community, need to address.
> Unfortunately, some of us aren't in a position to contribute to such
> an effort.  We need folk to step forward to help get these
> translations updated, or this situation will likely only get worse.
>

I'll try to fix the cs part. Some components are not too important for us - but
psql and pg_basebackup should be translated. Probably there are zero users
of pltcl and few users of plpython (but these users usually prefer
English).

Pavel


>
> Thanks
>
> Thom
>
>


Translations contributions urgently needed

2018-02-22 Thread Thom Brown
Hi,

I have found that Japanese language support for the database server
has been dropped for 10.  This is because it fell below the 80% of
strings translated requirement, so it was shipped without Japanese.
This isn't true of all components, but it seems quite alarming that
we've pushed out PostgreSQL 10 with no language support for a country
that has contributed a significant amount to the project, and has a
relatively large number of users.

The database server also dropped support for Indonesian and Portuguese
(Brazil).  In fact, just between 9.6 and 10, the following language
support was dropped for these components:

 cs   | plpython
 de   | pg_resetxlog
 es   | pg_resetxlog
 fr   | pg_resetxlog
 id   | postgres
 it   | pg_resetxlog
 ja   | pg_basebackup,pg_resetxlog,plpython,pltcl,postgres,psql
 ko   | pg_resetxlog
 pl   | pg_resetxlog
 pt_BR| pg_basebackup,pg_resetxlog,pltcl,postgres
 ru   | pg_resetxlog
 sv   | pg_resetxlog
 tr   | plperl
 zh_CN| pg_basebackup,pg_resetxlog,pltcl
 zh_TW| plperl

This is a huge amount compared to what happened between 9.5 and 9.6:

 cs   | psql

between 9.4 and 9.5:

 cs   | pltcl
 ro   | pltcl
 tr   | libpq,pltcl
 zh_TW| pltcl,psql

and between 9.3 and 9.4:

 cs   | pg_basebackup,pg_resetxlog
 ja   | pg_basebackup,pg_resetxlog
 zh_TW| pg_ctl,postgres


There are many translations that are on the verge of falling under
80%.  Japanese alone has 6 components between 80-83%, but other
languages are in a similar position.

I feel this is something that we, as a community, need to address.
Unfortunately, some of us aren't in a position to contribute to such
an effort.  We need folk to step forward to help get these
translations updated, or this situation will likely only get worse.

Thanks

Thom



Re: [HACKERS] SERIALIZABLE with parallel query

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 7:54 AM, Thomas Munro
 wrote:
> The best solution I have come up with so far is to add a reference
> count to SERIALIZABLEXACT.  I toyed with putting the refcount into the
> DSM instead, but then I ran into problems making that work when you
> have a query with multiple Gather nodes.  Since the refcount is in
> SERIALIZABLEXACT I also had to add a generation counter so that I
> could detect the case where you try to attach too late (the leader has
> already errored out, the refcount has reached 0 and the
> SERIALIZABLEXACT object has been recycled).

I don't know whether that's safe or not.  It certainly sounds like
it's solving one category of problem, but is that the only issue?  If
some backends haven't noticed that we're safe, they might keep
acquiring SIREAD locks or doing other manipulations of shared state,
which maybe could cause confusion.  I haven't looked into this deeply
enough to understand whether there's actually a possibility of trouble
there, but I can't rule it out off-hand.

One approach is to just disable this optimization for parallel query.
Being able to use SERIALIZABLE with parallel query is better than not
being able to do it, even if some optimizations are not applied in
that case.  Of course making the optimizations work is better, but
we've got to be sure we're doing it right.

> PS  I noticed that for BecomeLockGroupMember() we say "If we can't
> join the lock group, the leader has gone away, so just exit quietly"
> but for various other similar things we spew errors (most commonly
> seen one being "ERROR:  could not map dynamic shared memory segment").
> Intentional?

I suppose I thought that if we failed to map the dynamic shared memory
segment, it might be down to any one of several causes; whereas if we
fail to join the lock group, it must be because the leader has already
exited.  There might be a flaw in that thinking, though.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: non-bulk inserts and tuple routing

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 11:53 AM, Alvaro Herrera
 wrote:
> Robert Haas wrote:
>> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>>  wrote:
>> >> Attached is an updated version for that.
>> >
>> > Thanks for updating the patch.
>>
>> Committed with a few changes.
>
> I propose to tweak a few comments to PartitionTupleRouting, as attached.

Sure, please go ahead.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: non-bulk inserts and tuple routing

2018-02-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
>  wrote:
> >> Attached is an updated version for that.
> >
> > Thanks for updating the patch.
> 
> Committed with a few changes.

I propose to tweak a few comments to PartitionTupleRouting, as attached.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/include/executor/execPartition.h b/src/include/executor/execPartition.h
index e94718608f..08a994bce1 100644
--- a/src/include/executor/execPartition.h
+++ b/src/include/executor/execPartition.h
@@ -58,11 +58,15 @@ typedef struct PartitionDispatchData *PartitionDispatch;
  * partition tree.
  * num_dispatch					number of partitioned tables in the partition
  *								tree (= length of partition_dispatch_info[])
- * partition_oids				Array of leaf partitions OIDs
+ * partition_oids				Array of leaf partitions OIDs with one entry
+ *								for every leaf partition in the partition tree,
+ *								initialized in full by
+ *								ExecSetupPartitionTupleRouting.
  * partitions					Array of ResultRelInfo* objects with one entry
- *								for every leaf partition in the partition tree.
+ *								for every leaf partition in the partition tree,
+ *								initialized lazily.
  * num_partitions				Number of leaf partitions in the partition tree
- *								(= 'partitions' array length)
+ *								(= 'partitions_oid'/'partitions' array length)
  * parent_child_tupconv_maps	Array of TupleConversionMap objects with one
  *								entry for every leaf partition (required to
  *								convert tuple from the root table's rowtype to


[PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

2018-02-22 Thread Nikolay Shaplov

This is part of my bigger patch
https://www.postgresql.org/message-id/flat/2146419.veIEZdk4E4@x200m#2146419.veIEZdk4E4@x200m
which we've decided to commit in smaller parts.

Now in postgres a StdRdOptions structure is used as the binary representation of
reloptions for heap, toast, and some indexes. It has a number of fields, and
only heap relations use them all. Toast relations use only the autovacuum
options, and indexes use only the fillfactor option.

So for example if you set a custom fillfactor value for some index, it will
lead to allocating 84 bytes of memory (sizeof StdRdOptions on i386), of which
only 8 bytes will actually be used (varlena header and fillfactor value). 74
bytes is not much, but allocating them for each index for no particular reason
is a bad idea, I think.

Moreover, when I wrote my big reloption refactoring patch, I came to a "one
reloption kind - one binary representation" philosophy. It allows writing
code with more logic in it.

This patch replaces StdRdOptions with HeapRelOptions, ToastRelOptions,
BTRelOptions, HashRelOptions, SpGistRelOptions and PartitionedRelOptions,
one for each relation kind that was using StdRdOptions before.
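
To illustrate the idea, a rough sketch of what such a per-relkind struct and
its accessor might look like, using the naming above (just an illustration of
the approach, not necessarily the attached patch's exact definitions):

/* Options that a btree index actually uses: just fillfactor. */
typedef struct BTRelOptions
{
	int32	vl_len_;		/* varlena header (do not touch directly!) */
	int		fillfactor;		/* the only option btree indexes need here */
} BTRelOptions;

/* Accessor named after the relation kind, not a generic Relation* macro. */
#define BTGetFillFactor(relation, defaultff) \
	((relation)->rd_options ? \
	 ((BTRelOptions *) (relation)->rd_options)->fillfactor : (defaultff))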

The second thing I've done is rename the Relation* macros from
src/include/utils/rel.h that were working with reloptions. I've renamed them
to Heap*, Toast* and View* (depending on which relation options they were
actually using).

I did it because their names were misleading. For example,
RelationHasCheckOption can be called only for a view relation, and will give
a wrong result for other relation types. It just takes the binary
representation of reloptions, casts it to (ViewOptions *) and then returns
some value from it. Naming it ViewHasCheckOption would better reflect what it
actually does, and strictly specify that it is applicable only to view
relations.


Possible flaws:

I replaced

saveFreeSpace = RelationGetTargetPageFreeSpace(state->rs_new_rel,
                                               HEAP_DEFAULT_FILLFACTOR);

with

if (IsToastRelation(state->rs_new_rel))
  saveFreeSpace = ToastGetTargetPageFreeSpace();
else
  saveFreeSpace = HeapGetTargetPageFreeSpace(state->rs_new_rel);

wherever I met it (and other cases like that), but I am not sure whether in
some cases that part of the code is used for heap only or not. So maybe these
"ifs" are not needed and should be removed, leaving only the heap case. But I
am not familiar enough with postgres internals to see that for sure... I need
the advice of more experienced developers here.

--
Do code for fun.diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46276ce..bf7441e 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -22,7 +22,7 @@
 #include "access/htup_details.h"
 #include "access/nbtree.h"
 #include "access/reloptions.h"
-#include "access/spgist.h"
+#include "access/spgist_private.h"
 #include "access/tuptoaster.h"
 #include "catalog/pg_type.h"
 #include "commands/defrem.h"
@@ -46,9 +46,8 @@
  * upper and lower bounds (if applicable); for strings, consider a validation
  * routine.
  * (ii) add a record below (or use add__reloption).
- * (iii) add it to the appropriate options struct (perhaps StdRdOptions)
- * (iv) add it to the appropriate handling routine (perhaps
- * default_reloptions)
+ * (iii) add it to the appropriate options struct
+ * (iv) add it to the appropriate handling routine
  * (v) make sure the lock level is set correctly for that operation
  * (vi) don't forget to document the option
  *
@@ -986,7 +985,7 @@ extractRelOptions(HeapTuple tuple, TupleDesc tupdesc,
 		case RELKIND_TOASTVALUE:
 		case RELKIND_MATVIEW:
 		case RELKIND_PARTITIONED_TABLE:
-			options = heap_reloptions(classForm->relkind, datum, false);
+			options = relation_reloptions(classForm->relkind, datum, false);
 			break;
 		case RELKIND_VIEW:
 			options = view_reloptions(datum, false);
@@ -1319,61 +1318,133 @@ fillRelOptions(void *rdopts, Size basesize,


 /*
- * Option parser for anything that uses StdRdOptions.
+ * Option parsing definition for autovacuum. Used in toast and heap options.
+ */
+
+#define AUTOVACUUM_RELOPTIONS(OFFSET)\
+		{"autovacuum_enabled", RELOPT_TYPE_BOOL, \
+		OFFSET + offsetof(AutoVacOpts, enabled)},\
+		{"autovacuum_vacuum_threshold", RELOPT_TYPE_INT, \
+		OFFSET + offsetof(AutoVacOpts, vacuum_threshold)},   \
+		{"autovacuum_analyze_threshold", RELOPT_TYPE_INT,\
+		OFFSET + offsetof(AutoVacOpts, analyze_threshold)},  \
+		{"autovacuum_vacuum_cost_delay", RELOPT_TYPE_INT,\
+		OFFSET + offsetof(AutoVacOpts, vacuum_cost_delay)},  \
+		{"autovacuum_vacuum_cost_limit", RELOPT_TYPE_INT,\
+		OFFSET + offsetof(AutoVacOpts, vacuum_cost_limit)},  \
+		{"autovacuum_freeze_min_age", RELOPT_TYPE_INT,   \
+		OFFSET + 

Re: non-bulk inserts and tuple routing

2018-02-22 Thread Robert Haas
On Tue, Feb 20, 2018 at 8:06 PM, Amit Langote
 wrote:
>> Attached is an updated version for that.
>
> Thanks for updating the patch.

Committed with a few changes.  The big one was that I got rid of the
local variable is_update in ExecSetupPartitionTupleRouting.  That
saved a level of indentation on a substantial chunk of code, and it
turns out that test was redundant anyway.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Online enabling of checksums

2018-02-22 Thread Andrey Borodin
Hello, Magnus, Peter!

I'm excited that this feature emerged, thanks for the patch. Hope it will help
fix some mistakes made during initdb a long time ago...

> On 22 Feb 2018, at 18:22, Peter Eisentraut wrote:
> 
> On 2/21/18 15:53, Magnus Hagander wrote:
>> *Two new functions are added, pg_enable_data_checksums() and
>> pg_disable_data_checksums(). The disable one is easy -- it just changes
>> to disable. The enable one will change the state to inprogress, and then
>> start a background worker (the “checksumhelper launcher”). This worker
>> in turn will start one sub-worker (“checksumhelper worker”) in each
>> database (currently all done sequentially).*
> 
> This is at least the fourth version of the pattern launcher plus worker
> background workers.  I wonder whether we can do something to make this
> easier and less repetitive.  Not in this patch, of course.

Peter, can I ask for some pointers in searching for previous versions?
I want to review this patch, and some code comparison could be handy.

So far I've found only this [0,1] (without code) and threads mentioned by 
Magnus [2,3]

Or do you mean extracting "worker+launcher" for reuse for other purposes?

Best regards, Andrey Borodin.


[0] 
https://www.postgresql.org/message-id/flat/E2B195BF-7AA1-47AF-85BE-0E936D157902%40endpoint.com#e2b195bf-7aa1-47af-85be-0e936d157...@endpoint.com
 

[1] 
https://www.postgresql.org/message-id/flat/7A00D9D1-535A-4C37-94C7-02296AAF063F%40endpoint.com#7a00d9d1-535a-4c37-94c7-02296aaf0...@endpoint.com
 

[2] 
https://www.postgresql.org/message-id/flat/CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp%3D-7OJWBbcg%40mail.gmail.com#CABUevEx8KWhZE_XkZQpzEkZypZmBp3GbM9W90JLp=-7ojwb...@mail.gmail.com
 

[3] 
https://www.postgresql.org/message-id/flat/FF393672-5608-46D6-9224-6620EC532693%40endpoint.com#ff393672-5608-46d6-9224-6620ec532...@endpoint.com

Re: Hash Joins vs. Bloom Filters / take 2

2018-02-22 Thread Tomas Vondra


On 02/22/2018 12:44 PM, Claudio Freire wrote:
> On Wed, Feb 21, 2018 at 11:21 PM, Tomas Vondra
>  wrote:
>> On 02/21/2018 02:10 AM, Peter Geoghegan wrote:
>>> ...
>>> I misunderstood. I would probably do something like double or triple
>>> the original rows estimate instead, though. The estimate must be at
>>> least slightly inaccurate when we get to this point, but I don't
>>> think that that's a good enough reason to give up on the estimate
>>> completely.
>>>
>>
>> That's a problem only for the multi-batch case, though.
>>
>> With a single batch we can walk the hash table and count non-empty
>> buckets, to get a good ndistinct estimate cheaply. And then size the
>> filter considering both memory requirements (fits into CPU cache) and
>> false positive rate. There are other things we may need to consider
>> (memory usage vs. work_mem) but that's a separate issue.
>>
>> With multiple batches I think we could use the "size the bloom filter
>> for a fraction of work_mem" which the current patch uses when switching
>> to multiple batches halfway-through. That pretty much entirely ignores
>> the estimate and essentially replaces it with a "fictional" estimate.
>>
>> I think that's a better approach than using some arbitrary multiple of
>> the estimate. When we have to start batching halfway through, the
>> estimate is proven to be rather bogus anyway, but we may treat it as a
>> lower boundary for the bloom filter size.
> 
> ...
> 
>>> As I said, X should not be a portion of work_mem, because that has
>>> only a weak relationship to what really matters.
>>>
>>
>> I agree a fixed fraction of work_mem may not be the right thing, but the
>> goal was to make the bloom filter part of the Hash memory budget, i.e.
>>
>> bloom filter + hash table <= work_mem
>>
>> (which I think we agree should be the case), without increasing the
>> number of batches too much. For example, if you size the filter ignoring
>> this, and it end up being 90% of work_mem, you may need to do the hash
>> join in 128 batches instead of just 16. Or something like that.
>>
>> Maybe that would still be a win, though. Firstly, the higher number of
>> batches may not have a huge impact - in one case we need to serialize
>> 15/16 and in the other one 127/128. That's 93% vs. 99%. And if the more
>> accurate filter allows us to discard much more data from the outer
>> relation ...
> 
> Let me reiterate, you can avoid both issues with scalable bloom filters[1].
> 

I'm afraid it's not as straight-forward as "Use scalable bloom filters!"

This is not merely a question of unreliable estimates of number of
entries. That could have been solved by scalable bloom filters, which
are essentially a sequence of larger and larger bloom filters, added
when the smaller bloom filter "fills up" (1/2 full).

The problem is twofold:

(a) we need to decide what false positive rate to use (i.e. what is a
reasonable trade-off between filter size and how much work it saves)

(b) we also need to consider work_mem (which I assume we all agree we
must respect)

So for example we can't size the first bloom filter to just perfectly
fit into work_mem, only to add larger bloom filters later (each 2x the
size of the previous one). Not only will that increase the memory usage
to 7x the initial estimate, but it will also make the bloom filter less
efficient (having to probe larger and larger filters, likely not fitting
into CPU cache).
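
For reference, the textbook relationship behind trade-off (a), i.e. how a
target false positive rate translates into filter size (a standalone sketch of
the generic bloom filter formulas, not code from the patch):

#include <math.h>
#include <stddef.h>

/* Bits needed for ndistinct entries at a target false positive rate fpr. */
static size_t
bloom_nbits(double ndistinct, double fpr)
{
	return (size_t) ceil(-ndistinct * log(fpr) / (M_LN2 * M_LN2));
}

/* Optimal number of hash functions for that many bits and entries. */
static int
bloom_nhashes(double nbits, double ndistinct)
{
	int			k = (int) round(nbits / ndistinct * M_LN2);

	return (k < 1) ? 1 : k;
}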

> An HLL can be used to estimate set size, the paper makes no mention of
> it, probably assuming only distinct items are added to the set.
> 

The problem with HLL is that it's only an estimate of how many entries you
have seen so far, and it only tells you that after observing the items. What
we need for sizing a bloom filter is an estimate of the number of distinct
values in advance.

In other words, HLL is entirely useless for sizing Bloom Filters.

Furthermore, we could estimate the number of observed distinct values from
the number of 1s in the bloom filter - we essentially ask "How many items
have we observed, if each item sets k random bits and we now have K bits
set?" HLL does the same thing, but it throws away the ability to answer
which elements are in the set.
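
A minimal sketch of that estimator (again the generic formula, not code from
the patch): with m bits, k hash functions and X bits currently set, the number
of distinct items inserted so far is roughly -(m/k) * ln(1 - X/m):

#include <math.h>

static double
bloom_estimate_ndistinct(double nbits, double nhashes, double bits_set)
{
	if (bits_set >= nbits)
		return INFINITY;	/* filter saturated, the estimate breaks down */

	return -(nbits / nhashes) * log(1.0 - bits_set / nbits);
}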


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: phasing out pg_pltemplate?

2018-02-22 Thread Tom Lane
Peter Eisentraut  writes:
> What is the plan for pg_pltemplate?  Is there a roadmap to get rid of
> it?  (It's not currently blocking anything for me.  I'm just wondering.)

I think it's just waiting for someone to put in the effort to make it
unnecessary.

It seems like the extension mechanism could supersede it now, by switching
to a convention where the CREATE LANGUAGE command in the extension script
specifies all the language parameters explicitly.  But we would need to do
something extra to replace the functionality of tmpldbacreate --- perhaps
another extension control file flag?  Or maybe it'd be good enough to
hard-wire the db-owner-can-create behavior as enabled by TRUSTED, since
tmpltrusted = tmpldbacreate in every existing row.

One thing we'd have to address is how to not choke on old dump scripts
that contain "CREATE LANGUAGE foo" rather than CREATE EXTENSION.
I wonder if we could finesse that by redefining CREATE LANGUAGE with
no parameters as equivalent to CREATE EXTENSION.

pg_upgrade'ing across such a change might pose some challenges too,
not sure.

regards, tom lane



Re: [HACKERS] Runtime Partition Pruning

2018-02-22 Thread David Rowley
On 23 February 2018 at 01:15, David Rowley  wrote:
> One problem that I'm facing now is down to the way I'm gathering the
> ParamIds that match the partkeys. As you'll see from the patch I've
> added a 'paramids' field to PartitionPruneContext and I'm populating
> this when the clauses are being pre-processed in
> extract_partition_clauses(). The problem is that the or_clauses are
> not pre-processed at all, so the current patch will not properly
> perform run-time pruning when the Params are "hidden" in OR branches.
>
> One way I thought to fix this was to change the clause processing to
> create an array of PartitionClauseInfos, one for each OR branch. This
> would also improve the performance of the run-time pruning, meaning
> that all of the or clauses would be already matched to the partition
> keys once, rather than having to redo that again each time a Param
> changes its value.
>
> If I go and write a patch to do that, would you want it in your patch,
> or would you rather I kept it over here?  Or perhaps you have a better
> idea on how to solve...?

Hi Amit,

I've attached a patch which does this. For now, the patch is only
intended to assist in the discussion of the above idea.

The patch is based on a WIP version of run-time pruning that I'm not
quite ready to post yet, but with a small amount of work you could
probably take it and base it on your faster partition pruning v31
patch [1].

I ended up pulling the PartitionPruneInfo out of the
PartitionPruneContext. This was required due to how I've now made
extract_partition_clauses() recursively call itself. We don't want to
overwrite the context's clauseinfo with the one from the recursive
call. A side effect of this is that the subcontext is no longer
required when processing the OR clauses. You only did this so that the
context's clauseinfo was not overwritten. I also think it's better to
separate out the inputs and outputs. Anything in the context was
intended more for input fields.

Let me know your thoughts about this. If you don't want it for faster
partition pruning, then I'll probably go and tidy it up and include it
for run-time pruning.

[1] 
https://www.postgresql.org/message-id/00ae2273-bb6b-1287-9ebc-5459b37c9078%40lab.ntt.co.jp

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


generate_PartitionClauseInfos_for_or_clauses.patch
Description: Binary data


Re: [HACKERS] Runtime Partition Pruning

2018-02-22 Thread Jesper Pedersen

Hi David,

On 02/21/2018 04:06 AM, David Rowley wrote:

I've attached v11 of the patch.



Are UPDATE and DELETE supposed to be supported?

With

-- test.sql --
CREATE TABLE test (a integer NOT NULL, b integer) PARTITION BY HASH(a);
CREATE TABLE test_p00 PARTITION OF test FOR VALUES WITH (MODULUS 2, REMAINDER 0);
CREATE TABLE test_p01 PARTITION OF test FOR VALUES WITH (MODULUS 2, REMAINDER 1);

CREATE INDEX idx_test_a ON test (a);
CREATE INDEX idx_test_b ON test (b);

INSERT INTO test (SELECT i,i FROM generate_series(1, 100) AS i);

ANALYZE;
-- test.sql --

and

UPDATE test SET b = 1 WHERE a = ?
DELETE FROM test WHERE a = ?

both show that all partitions are scanned:

 Update on test
   Update on test_p00
   Update on test_p01
   ->  Index Scan using test_p00_a_idx on test_p00
         Index Cond: (a = 1)
   ->  Index Scan using test_p01_a_idx on test_p01
         Index Cond: (a = 1)

Using prune_v32 and runtime_v11 with conflicts resolved.

Best regards,
 Jesper



Re: Incorrect grammar

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 6:54 AM, Etsuro Fujita
 wrote:
> Here is a tiny patch to fix $SUBJECT in a comment in execPartition.c.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: PlaceHolderVars in pushed down child-join cause error

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 7:41 AM, Ashutosh Bapat
 wrote:
> postgres_fdw isn't expected to push down joins with placeholder vars.
> But the check for that in foreign_join_ok() only considers
> joinrel->relids. For a child-join relids contains the child relids but
> PlaceHolderInfo refers to the top parent's relids. Hence postgres_fdw
> tries to push down a child-join with PlaceHolderVars in it and fails
> with error "unsupported expression type for deparse: 198". 198 being
> T_PlaceHolderVar.
>
> The fix is to use joinrel->top_parent_relids for a child-join.
> Attached patch for the same.

Committed, but I changed the formatting, because as you had it,
pgindent would have mangled it.  While I was at it, I tweaked the
wording of the comment a bit.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgsql: Avoid valgrind complaint about write() of uninitalized bytes.

2018-02-22 Thread Robert Haas
On Wed, Feb 21, 2018 at 3:18 PM, Peter Geoghegan  wrote:
> On Wed, Feb 21, 2018 at 10:55 AM, Peter Geoghegan  wrote:
>> Sure, but it looks like it has the exact same underlying cause to the
>> LogicalTapeFreeze() issue. It shouldn't be very hard to write an
>> equivalent patch for LogicalTapeRewindForRead() -- I pointed out that
>> this could happen there instead before the first patch went in, in
>> fact. My mistake was to imagine that that could never happen during
>> the regression tests.
>
> Attached patch does this. I cannot recreate this issue locally, but
> this should still fix it on skink.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Possible performance regression in version 10.1 with pgbench read-write tests.

2018-02-22 Thread Robert Haas
On Wed, Feb 21, 2018 at 10:03 PM, Mithun Cy  wrote:
> Seeing futex in the call stack, Andres suggested that the following commit
> could be the reason for the regression:
>
> commit ecb0d20a9d2e09b7112d3b192047f711f9ff7e59
> Author: Tom Lane 
> Date:   2016-10-09 18:03:45 -0400
>
> Use unnamed POSIX semaphores, if available, on Linux and FreeBSD.
>
> Commenting out the same in src/template/linux, I ran the benchmark tests
> again; performance improved from 26871.567326 to 34286.620251 (both medians
> of 3 runs, in TPS).

Hmm.  So that commit might not have been the greatest idea.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [doc fix] Correct calculation of vm.nr_hugepages

2018-02-22 Thread Robert Haas
On Wed, Feb 21, 2018 at 3:28 PM, Justin Pryzby  wrote:
> On Wed, Feb 21, 2018 at 03:14:57PM -0500, Robert Haas wrote:
>> On Mon, Feb 19, 2018 at 9:43 PM, Tsunakawa, Takayuki
>>  wrote:
>> > Thanks, I'd like to take this.
>>
>> Why are these values so large?  The example in the documentation shows
>> 6490428 kB, and in my test I got 8733888 kB.  But 8733888 kB = 8.3 TB!
>>  8.3 GB would make sense, but 8.3 TB does not.
>
> pryzbyj@pryzbyj:~$ units -t -v 8733888kB GiB
> 8733888kB = 8.1340671 GiB

Sigh.  It would be nice if I were less stupid.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgsql: Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 8:31 AM, Robert Haas  wrote:
> On Thu, Feb 22, 2018 at 12:35 AM, Tom Lane  wrote:
>> Robert Haas  writes:
>>> Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.
>>
>> This seems to have produced some plan instability in the buildfarm.
>
> I was worried about that.  Looking at it now.  I wish we had some way
> to figure out whether a given plan was likely to be unstable on the
> buildfarm other than committing it and seeing what happens.  It's
> surprisingly difficult to write non-trivial tests that produce
> consistent EXPLAIN output, and annoyingly time-consuming to try to
> figure out why they don't.

I pushed a commit to try to fix this by making the three tables being
joined not all identical in terms of row count.  That seems like a
clear improvement, but there's every possibility it won't solve the
problem entirely.  I'll keep an eye on the buildfarm and see what
happens.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



check error messages in SSL tests

2018-02-22 Thread Peter Eisentraut
I noticed that a couple of test cases in the SSL tests fail to connect
not for the reason that the tests think they should.  Here is a patch to
augment the test setup so that a test for connection rejection also
checks that we get the expected error message.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 23057f8f24c81ae18a59bc445857e686fc51351c Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 22 Feb 2018 00:24:32 -0500
Subject: [PATCH] Check error messages in SSL tests

In tests that check whether a connection fails, also check the error
message.  That makes sure that the connection was rejected for the right
reason.

This discovered that two tests had their connection failing for the
wrong reason.  One test failed because pg_hba.conf was not set up to
allow that user, one test failed because the client key file did not
have the right permissions.  Fix those tests and add a new one that is
really supposed to check the file permission issue.
---
 src/test/ssl/ServerSetup.pm| 42 --
 src/test/ssl/ssl/.gitignore|  2 +-
 src/test/ssl/t/001_ssltests.pl | 33 ++---
 src/test/ssl/t/002_scram.pl|  4 +++-
 4 files changed, 50 insertions(+), 31 deletions(-)

diff --git a/src/test/ssl/ServerSetup.pm b/src/test/ssl/ServerSetup.pm
index 45991d61a2..27a676b65c 100644
--- a/src/test/ssl/ServerSetup.pm
+++ b/src/test/ssl/ServerSetup.pm
@@ -27,7 +27,6 @@ use Test::More;
 use Exporter 'import';
 our @EXPORT = qw(
   configure_test_server_for_ssl
-  run_test_psql
   switch_server_cert
   test_connect_fails
   test_connect_ok
@@ -35,37 +34,28 @@ our @EXPORT = qw(
 
 # Define a couple of helper functions to test connecting to the server.
 
-# Attempt connection to server with given connection string.
-sub run_test_psql
-{
-   my $connstr   = $_[0];
-
-   my $cmd = [
-   'psql', '-X', '-A', '-t', '-c', "SELECT \$\$connected with $connstr\$\$",
-   '-d', "$connstr" ];
-
-   my $result = run_log($cmd);
-   return $result;
-}
-
 # The first argument is a base connection string to use for connection.
 # The second argument is a complementary connection string.
 sub test_connect_ok
 {
-   my $common_connstr = $_[0];
-   my $connstr = $_[1];
-   my $test_name = $_[2];
+   my ($common_connstr, $connstr, $test_name) = @_;
 
-   ok(run_test_psql("$common_connstr $connstr"), $test_name);
+   my $cmd = [
+	'psql', '-X', '-A', '-t', '-c', "SELECT \$\$connected with $connstr\$\$",
+   '-d', "$common_connstr $connstr" ];
+
+   command_ok($cmd, $test_name);
 }
 
 sub test_connect_fails
 {
-   my $common_connstr = $_[0];
-   my $connstr = $_[1];
-   my $test_name = $_[2];
+   my ($common_connstr, $connstr, $expected_stderr, $test_name) = @_;
+
+   my $cmd = [
+	'psql', '-X', '-A', '-t', '-c', "SELECT \$\$connected with $connstr\$\$",
+   '-d', "$common_connstr $connstr" ];
 
-   ok(!run_test_psql("$common_connstr $connstr"), $test_name);
+   command_fails_like($cmd, $expected_stderr, $test_name);
 }
 
 # Copy a set of files, taking into account wildcards
@@ -169,12 +159,12 @@ sub configure_hba_for_ssl
print $hba
 "# TYPE  DATABASEUSERADDRESS METHOD\n";
print $hba
-"hostssl trustdb ssltestuser $serverhost/32
$authmethod\n";
+"hostssl trustdb all $serverhost/32
$authmethod\n";
print $hba
-"hostssl trustdb ssltestuser ::1/128 
$authmethod\n";
+"hostssl trustdb all ::1/128 
$authmethod\n";
print $hba
-"hostssl certdb  ssltestuser $serverhost/32cert\n";
+"hostssl certdb  all $serverhost/32cert\n";
print $hba
-"hostssl certdb  ssltestuser ::1/128 cert\n";
+"hostssl certdb  all ::1/128 cert\n";
close $hba;
 }
diff --git a/src/test/ssl/ssl/.gitignore b/src/test/ssl/ssl/.gitignore
index 10b74f0848..af753d4c7d 100644
--- a/src/test/ssl/ssl/.gitignore
+++ b/src/test/ssl/ssl/.gitignore
@@ -1,3 +1,3 @@
 /*.old
 /new_certs_dir/
-/client_tmp.key
+/client*_tmp.key
diff --git a/src/test/ssl/t/001_ssltests.pl b/src/test/ssl/t/001_ssltests.pl
index e53bd12ae9..c040d4783c 100644
--- a/src/test/ssl/t/001_ssltests.pl
+++ b/src/test/ssl/t/001_ssltests.pl
@@ -2,7 +2,7 @@
 use warnings;
 use PostgresNode;
 use TestLib;
-use Test::More tests => 40;
+use Test::More tests => 62;
 use ServerSetup;
 use File::Copy;
 
@@ -20,6 +20,8 @@
 # of the key stored in the code tree and update its permissions.
 copy("ssl/client.key", "ssl/client_tmp.key");
 chmod 0600, "ssl/client_tmp.key";
+copy("ssl/client-revoked.key", 

Re: pgsql: Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.

2018-02-22 Thread Robert Haas
On Thu, Feb 22, 2018 at 12:35 AM, Tom Lane  wrote:
> Robert Haas  writes:
>> Charge cpu_tuple_cost * 0.5 for Append and MergeAppend nodes.
>
> This seems to have produced some plan instability in the buildfarm.

I was worried about that.  Looking at it now.  I wish we had some way
to figure out whether a given plan was likely to be unstable on the
buildfarm other than committing it and seeing what happens.  It's
surprisingly difficult to write non-trivial tests that produce
consistent EXPLAIN output, and annoyingly time-consuming to try to
figure out why they don't.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



phasing out pg_pltemplate?

2018-02-22 Thread Peter Eisentraut
What is the plan for pg_pltemplate?  Is there a roadmap to get rid of
it?  (It's not currently blocking anything for me.  I'm just wondering.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


