Re: Error message inconsistency

2019-07-05 Thread Amit Kapila
On Mon, Jul 1, 2019 at 10:05 PM Alvaro Herrera  wrote:
>
> Do we have an actual patch here?
>

We have a patch, but it needs some more work, like finding similar
places, changing all of them at the same time, and then adapting the
tests to match.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Bruce Momjian wrote:

> On Fri, Jul  5, 2019 at 05:00:42PM -0400, Bruce Momjian wrote:
> > On Fri, Jul  5, 2019 at 04:24:54PM -0400, Alvaro Herrera wrote:

> > > Oh, is that the idea?  I was kinda assuming that the data was kept
> > > as-stored in shared buffers, ie. it would be decrypted on access, not on
> > > read from disk.  The system seems very prone to leakage if you have it
> > > decrypted in shared memory.
> > 
> > Well, the overhead of decrypting on every access will make the slowdown
> > huge, and I don't know what security value that would have.  I am not
> > sure what security value TDE itself has, but I think encrypting shared
> > buffer contents has even less.
> 
> Sorry I didn't answer your question directly.  Since the shared buffers
> are in memory, if the decryption key is also unlocked in memory, there
> isn't much value to encrypting shared buffers, and the overhead would be
> huge.

Oh, I get your point now.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: "WIP: Data at rest encryption" patch and, PostgreSQL 11-beta3

2019-07-05 Thread Bruce Momjian
On Tue, Jun 25, 2019 at 02:28:00PM +0200, Peter Eisentraut wrote:
> On 2019-06-17 11:23, Antonin Houska wrote:
> > I'm thinking how to teach postmaster to accept FEBE protocol connections
> > temporarily, just to receive the key. The user applications like pg_ctl,
> > initdb or pg_upgrade would retrieve the key / password from the DBA, then
> > start postmaster and send it the key.
> > 
> > Perhaps the message format should be a bit generic so that extensions like
> > this can use it to receive their keys too.
> > 
> > (The idea of an unix socket or named pipe I proposed upthread is not good
> > because it's harder to implement in a portable way.)
> 
> How are the requirements here different from ssl_passphrase_command?
> Why do we need a new mechanism?

Agreed.  My pgcryptokey prompting shell script was mostly a
proof-of-concept.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: POC: Cleaning up orphaned files using undo logs

2019-07-05 Thread Amit Kapila
On Fri, Jul 5, 2019 at 7:39 PM Robert Haas  wrote:
>
> On Tue, Jun 25, 2019 at 4:00 PM Amit Kapila  wrote:
> > Fair enough.  I have implemented it based on next_retry_at and use
> > constant time 10s for the next retry.  I have used define instead of a
> > GUC as all the other constants for similar things are defined as of
> > now.  One thing to note is that we want the linger time (defined as
> > UNDO_WORKER_LINGER_MS) for a undo worker to be more than failure retry
> > time (defined as UNDO_FAILURE_RETRY_DELAY_MS) as, otherwise, the undo
> > worker can exit before retrying the failed requests.
>
> Uh, I think we want exactly the opposite.  We want the workers to exit
> before retrying, so that there's a chance for other databases to get
> processed, I think.
>

The workers will exit if there is any chance for other databases to
get processed.  Basically, we linger only when we find there is no
work in other databases.  Moreover, even if some new work is added to
the queues for some other database, we stop the lingering worker if
there is no worker available for the new request that has arrived.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Declared but no defined functions

2019-07-05 Thread Tom Lane
Masahiko Sawada  writes:
> I think the following functions are mistakenly left in the header
> file. So attached patch removes them.

> dsa_startup()
> TransactionIdAbort()
> renameatt_type()

Agreed, these are referenced nowhere.  I pushed the patch.

> I realized that TransactionIdAbort is declared in the transam.h but
> there is not its function body. As far as I found there are three
> similar functions in total by the following script.
> for func in `git ls-files | egrep "\w+\.h$" | xargs cat |
>              egrep "extern \w+ \w+\(.*\);" | sed -e "s/.* \(.*\)(.*);/\1(/g"`
> do
>     if [ `git grep "$func" -- "*.c" | wc -l` -lt 1 ]; then
>         echo $func
>     fi
> done

FWIW, that won't catch declarations that lack "extern", nor functions
that return pointer-to-something.  (Omitting "extern" is something
I consider bad style, but other people seem to be down with it.)
Might be worth another pass to look harder?

regards, tom lane




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Fri, Jul  5, 2019 at 04:10:04PM -0400, Bruce Momjian wrote:
> On Fri, Jul  5, 2019 at 03:46:28PM -0400, Alvaro Herrera wrote:
> > On 2019-Jul-05, Bruce Momjian wrote:
> > 
> > > What people really want with more-granular-than-cluster encryption is
> > > the ability to supply their passphrase key _when_ they want to access
> > > their data, and then leave and be sure their data is secure from
> > > decryption.  That will not be possible since the WAL will be encrypted
> > > and any replay of it will need their passphrase key to unlock it, or the
> > > entire system will be unrecoverable.
> > 
> > I'm not sure I understand why WAL replay needs the passphrase to work.
> > Why isn't the data saved in WAL already encrypted, which can be applied
> > as raw bytes to each data block, without needing to decrypt anything?
> > Only if somebody wants to interpret the bytes they need the passphrase,
> > no?
> 
> Uh, well, you have the WAL record, and you want to write it to an 8k
> page.  You have to read the 8k page from disk into shared buffers, and
> you have to decrypt the 8k page to do that, right?  We aren't going to
> store 8k pages encrypted in shared buffers, right?
> 
> If you did want to do that, or wanted to write them to disk without
> decrypting the 8k page, it still would not work since AES is a 16-byte
> encryption cipher.  I don't think we can break 8k pages up into 16-byte
> chunks and be sure we can just place data into those 16-byte boundaries.

I am not sure I was clear in describing this.  If you want to copy data
directly from WAL to the 8k pages, you have to use either block mode or
streaming mode for both 8k pages and WAL.  If you use block mode, then
changing any data on a page will change all encrypted storage in the
same page after the change, and computing WAL will require 16-byte
boundaries.  If you use streaming mode, you have to compute the proper
stream at the point in the 8k page where you are changing the row.  My
point is that in neither case can you just encrypt the row for WAL
and assume it can be placed in an 8k page.  Neither option seems
desirable.
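
To make the 16-byte issue concrete, here is a small self-contained C
illustration (not from any patch --- just a sketch of the two cases
described above, and ignoring the separate question of nonce/IV reuse
when a block is rewritten):

    #include <stddef.h>
    #include <stdio.h>

    #define PAGE_SIZE      8192
    #define AES_BLOCK_SIZE 16

    /*
     * Which 16-byte AES blocks of an 8k page are affected when the bytes
     * in [off, off + len) are modified?
     */
    static void
    affected_blocks(size_t off, size_t len)
    {
        size_t  first = off / AES_BLOCK_SIZE;
        size_t  last = (off + len - 1) / AES_BLOCK_SIZE;

        /*
         * Block mode with chaining (e.g. CBC): each ciphertext block feeds
         * into the next, so everything from the first modified block to
         * the end of the page has to be re-encrypted --- the "change all
         * encrypted storage in the same page" problem above.
         */
        printf("CBC: re-encrypt blocks %zu..%d\n",
               first, PAGE_SIZE / AES_BLOCK_SIZE - 1);

        /*
         * Streaming mode (e.g. CTR): only the touched blocks change, but
         * the keystream has to be generated for exactly these block
         * positions --- computing "the proper stream at the point in the
         * 8k page where you are changing the row".
         */
        printf("CTR: re-encrypt blocks %zu..%zu\n", first, last);
    }

    int
    main(void)
    {
        affected_blocks(100, 40);   /* a 40-byte change at offset 100 */
        return 0;
    }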

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Change atoi to strtol in same place

2019-07-05 Thread Tomas Vondra

Hi Surafel,

On Mon, Jul 01, 2019 at 08:48:27PM +0300, Surafel Temesgen wrote:

Hello,

we use atoi for user argument processing in some places, which returns
zero for both invalid input and an input value of zero. In most cases
that is fine, because we error out with an appropriate message for a
zero input, but in the places where zero is a valid input it causes a
problem, because invalid input is treated as an input value of zero. The
attached patch changes those places to strtol, which can handle invalid
input.

regards

Surafel


This seems to have bit-rotted (due to minor changes to pg_basebackup).
Can you fix that and post an updated version?

In general, I think it's a good idea to fix those places, but I wonder
if we need to change the error messages too.
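
For illustration, the kind of check strtol() makes possible looks
roughly like this (a sketch only --- the function name and message
wording are made up, not taken from the patch):

    #include <errno.h>
    #include <limits.h>
    #include <stdio.h>
    #include <stdlib.h>

    /*
     * atoi("abc") and atoi("0") both return 0, so an option that accepts a
     * genuine zero cannot tell them apart.  strtol() reports where parsing
     * stopped and sets errno on overflow, so bad input can be rejected.
     */
    static int
    parse_int_option(const char *optname, const char *value)
    {
        char   *endptr;
        long    val;

        errno = 0;
        val = strtol(value, &endptr, 10);

        if (errno != 0 || endptr == value || *endptr != '\0' ||
            val < INT_MIN || val > INT_MAX)
        {
            fprintf(stderr, "invalid value \"%s\" for option %s\n",
                    value, optname);
            exit(1);
        }
        return (int) val;
    }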

regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Fri, Jul  5, 2019 at 05:00:42PM -0400, Bruce Momjian wrote:
> On Fri, Jul  5, 2019 at 04:24:54PM -0400, Alvaro Herrera wrote:
> > On 2019-Jul-05, Bruce Momjian wrote:
> > 
> > > Uh, well, you have the WAL record, and you want to write it to an 8k
> > > page.  You have to read the 8k page from disk into shared buffers, and
> > > you have to decrypt the 8k page to do that, right?  We aren't going to
> > > store 8k pages encrypted in shared buffers, right?
> > 
> > Oh, is that the idea?  I was kinda assuming that the data was kept
> > as-stored in shared buffers, ie. it would be decrypted on access, not on
> > read from disk.  The system seems very prone to leakage if you have it
> > decrypted in shared memory.
> 
> Well, the overhead of decrypting on every access will make the slowdown
> huge, and I don't know what security value that would have.  I am not
> sure what security value TDE itself has, but I think encrypting shared
> buffer contents has even less.

Sorry I didn't answer your question directly.  Since the shared buffers
are in memory, if the decryption key is also unlocked in memory, there
isn't much value to encrypting shared buffers, and the overhead would be
huge.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Extending PostgreSQL with a Domain-Specific Language (DSL) - Development

2019-07-05 Thread Tomas Vondra

First of all, it's pretty difficult to follow the discussion when it's
not clear what's the original message and what's the response. E-mail
clients generally indent the original message with '>' or something like
that, but your client does not do that (which is pretty silly). And
copying the message at the top does not really help. Please do something
about that.


On Fri, Jul 05, 2019 at 09:37:03PM +, Tom Mercha wrote:

I might be missing something, but it seems like you intend to replace
the SQL grammar we have with something else. It's not clear to me what
would be the point of doing that, and it definitely looks like a huge
amount of work - e.g. we don't have any support for switching between
two distinct grammars the way you envision, and just that alone seems
like a multi-year project. And if you don't have that capability, all
external tools kinda stop working. Good luck with running such database.


I was considering having two distinct grammars as an option - thanks
for indicating the effort involved. At the end of the day I want both
my DSL and the PostgreSQL grammars to coexist. Is extending
PostgreSQL's grammar with my own through the PostgreSQL extension
infrastructure worth consideration or is it also difficult to develop?
Could you suggest any reference material on this topic?



Well, I'm not an expert in that area, but we currently don't have any
infrastructure to support that. It's a topic that was discussed in the
past (perhaps you can find some references in the archives) and it
generally boils down to:

1) We're using bison as parser generator.
2) Bison does not allow adding rules on the fly.

So you have to modify the in-core src/backend/parser/gram.y and rebuild
postgres. See for example this discussion:

https://www.postgresql.org/message-id/flat/CABSN6VeeEhwb0HrjOCp9kHaWm0Ljbnko5y-0NKsT_%3D5i5C2jog%40mail.gmail.com

When two of the smartest people on the list say it's a hard problem, it
probably is. Particularly for someone who does not know the internals.


What I'd look at first is implementing the grammar as a procedural
language (think PL/pgSQL, pl/perl etc.) implementing whatever you
expect from your DSL. And it's not like you'd have to wrap everything
in functions, because we have anonymous DO blocks.


Thanks for pointing out this direction! I think I will indeed adopt
this approach especially if directly extending PostgreSQL grammar would
be difficult.


Well, it's the only way to deal with it at the moment.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Extending PostgreSQL with a Domain-Specific Language (DSL) - Development

2019-07-05 Thread Tom Mercha
I might be missing something, but it seems like you intend to replace
the SQL grammar we have with something else. It's not clear to me what
would be the point of doing that, and it definitely looks like a huge
amount of work - e.g. we don't have any support for switching between
two distinct grammars the way you envision, and just that alone seems
like a multi-year project. And if you don't have that capability, all
external tools kinda stop working. Good luck with running such database.
I was considering having two distinct grammars as an option - thanks for 
indicating the effort involved. At the end of the day I want both my DSL and 
the PostgreSQL grammars to coexist. Is extending PostgreSQL's grammar with my 
own through the PostgreSQL extension infrastructure worth consideration or is 
it also difficult to develop? Could you suggest any reference material on this 
topic?

What I'd look at first is implementing the grammar as a procedural
language (think PL/pgSQL, pl/perl etc.) implementing whatever you expect
from your DSL. And it's not like you'd have to wrap everything in
functions, because we have anonymous DO blocks.
Thanks for pointing out this direction! I think I will indeed adopt this 
approach especially if directly extending PostgreSQL grammar would be difficult.

Regards
Tom

From: Tomas Vondra 
Sent: 05 July 2019 20:48
To: Tom Mercha
Cc: pgsql-hack...@postgresql.org
Subject: Re: Extending PostgreSQL with a Domain-Specific Language (DSL) - 
Development

On Fri, Jul 05, 2019 at 07:55:15AM +, Tom Mercha wrote:
>Dear Hackers
>
>I am interested in implementing my own Domain Specific Language (DSL)
>using PostgreSQL internals. Originally, the plan was not to use
>PostgreSQL and I had developed a grammar and used ANTLRv4 for parser
>work and general early development.
>
>Initially, I was hoping for a scenario where I could have PostgreSQL's
>parser to change grammar (e.g. SET parser_language=SQL vs. SET
>parser_language=myDSL) in which case my ANTLRv4 project would override
>the PostgreSQL parser module. I guess another direction that my project
>could take is to extend PostgreSQL's SQL parser to factor in my DSL
>keywords and requirements.
>
>To make matters more complicated, this version of ANTLR does not
>support code generation to C, but it does support generation to C++.
>Integrating the generated C++ code requires making it friendly to
>PostgreSQL e.g. using Plain Old Data Structures as described here
>https://www.postgresql.org/docs/9.0/extend-cpp.html, which seems to be
>suggesting to me that I may be using the wrong approach towards my
>goal.
>
>I would be grateful if anyone could provide any general advice or
>pointers regarding my approach, for example regarding development
>effort, so that the development with PostgreSQL internals can be smooth
>and of a high quality. Maybe somebody has come across another DSL
>attempt which used PostgreSQL and that I could follow as a reference?
>

I might be missing something, but it seems like you intend to replace
the SQL grammar we have with something else. It's not clear to me what
would be the point of doing that, and it definitely looks like a huge
amount of work - e.g. we don't have any support for switching between
two distinct grammars the way you envision, and just that alone seems
like a multi-year project. And if you don't have that capability, all
external tools kinda stop working. Good luck with running such database.

What I'd look at first is implementing the grammar as a procedural
language (think PL/pgSQL, pl/perl etc.) implementing whatever you expect
from your DSL. And it's not like you'd have to wrap everything in
functions, because we have anonymous DO blocks. So you could do:

  DO LANGUAGE mydsl $$
 ... whatever my dsl allows ...
  $$;

It's still a fair amount of code to implement this (both the PL handler
and the DSL implementation), but it's orders of magnitude simpler than
what you described.

See https://www.postgresql.org/docs/current/plhandler.html for details
about how to write a language handler.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: [PATCH v4] Add \warn to psql

2019-07-05 Thread David Fetter
On Fri, Jul 05, 2019 at 12:38:02PM -0400, Tom Lane wrote:
> I wrote:
> > David Fetter  writes:
> >> [ v7-0001-Add-warn-to-psql.patch ]
> 
> > I took a look at this.  I have no quibble with the proposed feature,
> > and the implementation is certainly simple enough.  But I'm unconvinced
> > about the proposed test scaffolding.
> 
> I pushed this with the simplified test methodology.

Thanks!

> While I was fooling with it I noticed that the existing code for -n
> is buggy.  The documentation says clearly that only the first
> argument is a candidate to be -n:
> 
> If the first argument is an unquoted -n the 
> trailing
> newline is not written.
> 
> but the actual implementation allows any argument to be recognized as
> -n:
> 
> regression=# \echo this -n should not be -n like this
> this should not be like thisregression=# 
> 
> I fixed that, but I'm wondering if we should back-patch that fix
> or leave the back branches alone.

+0.5 for back-patching.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Fri, Jul  5, 2019 at 05:00:42PM -0400, Bruce Momjian wrote:
> On Fri, Jul  5, 2019 at 04:24:54PM -0400, Alvaro Herrera wrote:
> > On 2019-Jul-05, Bruce Momjian wrote:
> > 
> > > Uh, well, you have the WAL record, and you want to write it to an 8k
> > > page.  You have to read the 8k page from disk into shared buffers, and
> > > you have to decrypt the 8k page to do that, right?  We aren't going to
> > > store 8k pages encrypted in shared buffers, right?
> > 
> > Oh, is that the idea?  I was kinda assuming that the data was kept
> > as-stored in shared buffers, ie. it would be decrypted on access, not on
> > read from disk.  The system seems very prone to leakage if you have it
> > decrypted in shared memory.
> 
> Well, the overhead of decrypting on every access will make the slowdown
> huge, and I don't know what security value that would have.  I am not
> sure what security value TDE itself has, but I think encrypting shared
> buffer contents has even less.

Sorry for the delay --- here is some benchmark info:


https://www.postgresql.org/message-id/4723a402-b14f-4994-2de9-d85b55a56b7f%40cybertec.at

as far as benchmarking is concerned: i did a quick test yesterday (not 
with the final AES implementation yet) and i got pretty good results. 
with a reasonably well cached database in a typical application I expect

to lose around 10-20%. if everything fits in memory there is 0 loss of 
course. the worst I got with the standard AES (no hardware support used 
yet) was a loss of around 45% or so, but this requires a value as low as 
32 MB of shared buffers or so.

Notice the 0% overhead if everything fits in RAM, meaning it is not
decrypting on RAM access.  If it is 10-20% for a "reasonably well cached
database", I am sure it will be 10x that for decrypting on RAM access.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +





Re: using explicit_bzero

2019-07-05 Thread Thomas Munro
On Sat, Jul 6, 2019 at 1:07 AM Peter Eisentraut wrote:
> On 2019-07-05 14:06, Thomas Munro wrote:
> > +#ifndef HAVE_EXPLICIT_BZERO
> > +#define explicit_bzero(b, len) memset(b, 0, len)
> > +#endif
> >
> > I noticed some other libraries use memset through a function pointer
> > or at least define a function the compiler can't see.
>
> I don't understand what you are getting at here.

Do we want to provide a replacement implementation that actually
prevents the compiler from optimizing the call away (i.e., generating
no code) in some circumstances?
Then I think we need at least a function defined in another
translation unit so the compiler can't see what it does, no?
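
For what it's worth, one common shape for such a fallback --- shown here
only as a sketch of the technique, not a concrete proposal for src/port/
--- is to force the call through a volatile function pointer, which has
much the same effect as hiding the body in another translation unit:

    #include <string.h>

    /*
     * The compiler must actually load a volatile pointer, so it cannot
     * prove the memset is dead and elide it the way it can with a plain
     * "#define explicit_bzero(b, len) memset(b, 0, len)".
     */
    static void *(*volatile bzero_memset) (void *, int, size_t) = memset;

    void
    explicit_bzero(void *buf, size_t len)
    {
        (void) bzero_memset(buf, 0, len);
    }

(A function in its own translation unit works too, as you say, as long
as link-time optimization doesn't let the compiler see across it.)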

-- 
Thomas Munro
https://enterprisedb.com




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Fri, Jul  5, 2019 at 04:24:54PM -0400, Alvaro Herrera wrote:
> On 2019-Jul-05, Bruce Momjian wrote:
> 
> > Uh, well, you have the WAL record, and you want to write it to an 8k
> > page.  You have to read the 8k page from disk into shared buffers, and
> > you have to decrypt the 8k page to do that, right?  We aren't going to
> > store 8k pages encrypted in shared buffers, right?
> 
> Oh, is that the idea?  I was kinda assuming that the data was kept
> as-stored in shared buffers, ie. it would be decrypted on access, not on
> read from disk.  The system seems very prone to leakage if you have it
> decrypted in shared memory.

Well, the overhead of decrypting on every access will make the slowdown
huge, and I don't know what security value that would have.  I am not
sure what security value TDE itself has, but I think encrypting shared
buffer contents has even less.

> If you keep it unencrypted in shared_buffers, you'd require WAL replay
> and checkpointer to have every single key in the system.  That sounds
> nightmarish -- a single user can create a table, hide the key and block
> WAL replay and checkpointing for the whole system.

Yep, bingo!

> I haven't read the whole thread, sorry about that -- maybe these points
> have already been discussed.
> 
> > Also, that assumes that we are only encrypting the WAL payload, and not
> > the relation oids or other header information, which I think is a
> > mistake because it will lead to information leakage.
> 
> Well, that was part of my question.  Why do you care to encrypt metadata
> such as the relation OID (or really relfilenode)?  Yes, I was assuming
> that only the WAL payload is encrypted.

Well, you would need to decide what WAL information needs to be secured.
Is the fact an insert was performed on a table a security issue? 
Depends on your risks.  My point is that almost anything you do beyond
cluster-level encryption either adds complexity that is bug-prone or
fragile, or adds unacceptable overhead, or leaks security information.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Bruce Momjian wrote:

> Uh, well, you have the WAL record, and you want to write it to an 8k
> page.  You have to read the 8k page from disk into shared buffers, and
> you have to decrypt the 8k page to do that, right?  We aren't going to
> store 8k pages encrypted in shared buffers, right?

Oh, is that the idea?  I was kinda assuming that the data was kept
as-stored in shared buffers, ie. it would be decrypted on access, not on
read from disk.  The system seems very prone to leakage if you have it
decrypted in shared memory.

If you keep it unencrypted in shared_buffers, you'd require WAL replay
and checkpointer to have every single key in the system.  That sounds
nightmarish -- a single user can create a table, hide the key and block
WAL replay and checkpointing for the whole system.

I haven't read the whole thread, sorry about that -- maybe these points
have already been discussed.

> Also, that assumes that we are only encrypting the WAL payload, and not
> the relation oids or other header information, which I think is a
> mistake because it will lead to information leakage.

Well, that was part of my question.  Why do you care to encrypt metadata
such as the relation OID (or really relfilenode)?  Yes, I was assuming
that only the WAL payload is encrypted.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Tomas Vondra

On Fri, Jul 05, 2019 at 03:38:28PM -0400, Bruce Momjian wrote:

On Sun, Jun 16, 2019 at 03:57:46PM -0400, Stephen Frost wrote:

Greetings,

* Bruce Momjian (br...@momjian.us) wrote:
> On Sun, Jun 16, 2019 at 12:42:55PM -0400, Joe Conway wrote:
> > On 6/16/19 9:45 AM, Bruce Momjian wrote:
> > > On Sun, Jun 16, 2019 at 07:07:20AM -0400, Joe Conway wrote:
> > >> In any case it doesn't address my first point, which is limiting the
> > >> volume encrypted with the same key. Another valid reason is you might
> > >> have data at varying sensitivity levels and prefer different keys be
> > >> used for each level.
> > >
> > > That seems quite complex.
> >
> > How? It is no more complex than encrypting at the tablespace level
> > already gives you - in that case you get this property for free if you
> > care to use it.
>
> All keys used to encrypt WAL data must be unlocked at all times or crash
> recovery, PITR, and replication will not stop when it hits a locked key.
> Given that, how much value is there in allowing a key per tablespace?

There's a few different things to discuss here, admittedly, but I don't
think it means that there's no value in having a key per tablespace.

Ideally, a given backend would only need, and only have access to, the
keys for the tablespaces that it is allowed to operate on.  I realize
that's a bit farther than what we're talking about today, but hopefully
not too much to be able to consider.


What people really want with more-granular-than-cluster encryption is
the ability to supply their passphrase key _when_ they want to access
their data, and then leave and be sure their data is secure from
decryption.  That will not be possible since the WAL will be encrypted
and any replay of it will need their passphrase key to unlock it, or the
entire system will be unrecoverable.

This is a fundamental issue, and will eventually doom any more granular
encryption approach, unless we want to use the same key for all
encrypted tablespaces, create separate WALs for each tablespace, or say
recovery of some tablespaces will fail.  I doubt any of those will be
acceptable.



I agree this is a pretty crucial challenge, and those requirements seem
in direct conflict. Users use encryption to protect privacy of the data,
but we need access to some of the data to implement some of the
important tasks of a RDBMS.

And it's not just about things like recovery or replication. How do you
do ANALYZE on encrypted data? Sure, if a user runs it in a session that
has the right key, that's fine. But what about autovacuum/autoanalyze?

I suspect the issue here is that we're trying to retrofit a solution for
data-at-rest encryption to something that seems closer to protecting
data during execution.

Which is a worthwhile goal, of course, but perhaps we're trying to use
the wrong tool to achieve it? To paraphrase the hammer/nail saying "If
all you know is a block encryption, everything looks like a block."


What if the granular encryption (not the "whole cluster with a single
key") case does not encrypt whole blocks, but just tuple data? Would
that allow at least the most critical WAL use cases (recovery, physical
replication) to work without having to know all the encryption keys?

Of course, that would be much less efficient compared to plain block
encryption, but that may be the "natural cost" of the feature.

It would not solve e.g. logical replication or ANALYZE, which both
require access to the plaintext data, though.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: POC: Cleaning up orphaned files using undo logs

2019-07-05 Thread Robert Haas
On Mon, Jul 1, 2019 at 3:54 AM Thomas Munro  wrote:
> [ new patches ]

I took a look at 0012 today, Amit's patch for extending the binary
heap machinery, and 0013, Amit's patch for "Infrastructure to register
and fetch undo action requests."

I don't think that binaryheap_allocate_shm() is a good design.  It
presupposes that we want to store the binary heap as its own chunk of
shared memory allocated via ShmemInitStruct(), but we might want to do
something else, like embed in another structure, store it in a DSM or
DSA, etc., and this function can't do any of that.  I think we should
have something more like:

extern Size binaryheap_size(int capacity);
extern void binaryheap_initialize(binaryheap *, int capacity,
binaryheap_comparator compare, void *arg);

Then the caller can do something like:

sz = binaryheap_size(capacity);
bh = ShmemInitStruct(name, sz, &found);
if (!found)
binaryheap_initialize(bh, capacity, comparator, whatever);

If it wants to get the memory in some other way, it just needs to
initialize bh differently; the rest is the same.  Note that there is
no need, in this design, for binaryheap_size/initialize to make use of
"shared" memory.  They could equally well be used on backend-local
memory.  They do not need to care. You just provide the memory, and
they do their thing.
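
For instance, with the signatures proposed above, the backend-local case
would presumably be just:

sz = binaryheap_size(capacity);
bh = (binaryheap *) palloc(sz);
binaryheap_initialize(bh, capacity, comparator, whatever);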

I wasn't very happy about binaryheap_nth(), binaryheap_remove_nth(),
and binaryheap_remove_nth_unordered() and started looking at how they
are used to try to see if there might be a better way.  That led me to
look at 0013.  Unfortunately, I find it really hard to understand what
this code is actually doing.  There's a lot of redundant and
badly-written stuff in here.  As a general principle, if you have two
or three data structures of some particular type, you don't write a
separate family of functions for manipulating each one.  You write one
function for each operation, and you pass the particular copy of the
data structure with which you are working as an argument.

In the lengthy section of macros definitions at the top of
undorequest.c, we have macros InitXidQueue, XidQueueIsEmpty,
GetXidQueueSize, GetXidQueueElem, GetXidQueueTopElem,
GetXidQueueNthElem, and SetXidQueueElem.  Several of these are used in
only one place or are not used anywhere at all; those should be
removed altogether and inlined into the single call site if there is
one.  Now, then after this, there is a matching set of macros,
InitSizeQueue, SizeQueueIsEmpty, GetSizeQueueSize, GetSizeQueueElem,
GetSizeQueueTopElem, GetSizeQueueNthElem, and  SetSizeQueueElem.  Many
of these macros are exactly the same as the previous set of macros
except that they operate on a different queue, which as I mentioned in
the previous paragraph, is not a good design.  It leads to extensive
code duplication.

Look, for example, at RemoveOldElemsFromSizeQueue and
RemoveOldElemsFromXidQueue.  They are basically identical except for
s/Size/Xid/g and s/SIZE/XID/g, but you can't unify them easily because
they are calling different functions.  However, if you didn't have one
function called GetSizeQueueSize and another called GetXidQueueSize,
but just had a pointer to the relevant binary heap, then both
functions could just call binaryheap_empty() on it, which would be
better style, use fewer macros, generate less machine code, and be
easier to read.  Ideally, you'd get to the point where you could just
have one function rather than two, and pass the queue upon which it
should operate as an argument.  There seems to be a good deal of this
kind of duplication in this file and it really needs to be cleaned up.
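
In other words, the shape to aim for is roughly the following sketch,
where UndoRequestInfo and ItemIsStale are just stand-ins for whatever
the real element type and staleness test turn out to be:

static void
RemoveOldElemsFromQueue(binaryheap *queue)
{
    while (!binaryheap_empty(queue))
    {
        UndoRequestInfo *info;

        /* the heap stores pointers to the shared request objects */
        info = (UndoRequestInfo *) DatumGetPointer(binaryheap_first(queue));
        if (!ItemIsStale(info))
            break;
        (void) binaryheap_remove_first(queue);
    }
}

The same function could then serve the XID, size, and error queues; only
the comparator supplied at initialization would differ.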

Now, one objection to the above line of attack is the different queues
actually contain different types of elements.  Apparently, the XID
queue contains elements of type UndoXidQueue and the size queue
contains elements of type UndoSizeQueue.  It is worth noting here that
these are bad type names, because they sound like they are describing
a type of queue, but it seems that they are actually describing an
element in the queue. However, there are two larger problems:

1. I don't think we should have three different kinds of objects for
each of the three different queues.  It seems like it would be much
simpler and easier to just have one kind of object that stores all the
information we need (full_xid, start_urec_ptr, dbid, request_size,
next_retry_at, error_ocurred_at) and use that everywhere. You could
object that this would increase the storage space requirement, but it
wouldn't be enough to make any real difference and it probably would
be well worth it for the avoidance of complexity.

2. However, I don't think we should have a separate request object for
each queue anyway.  We should insert pointers to the same objects in
all the relevant queue (either size + XID, or else error).  So instead
of having three sets of objects, one for each queue, we'd just have
one set of objects and point to them with as many as two pointers.
We'd therefore need LESS memory than 

Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Fri, Jul  5, 2019 at 03:46:28PM -0400, Alvaro Herrera wrote:
> On 2019-Jul-05, Bruce Momjian wrote:
> 
> > What people really want with more-granular-than-cluster encryption is
> > the ability to supply their passphrase key _when_ they want to access
> > their data, and then leave and be sure their data is secure from
> > decryption.  That will not be possible since the WAL will be encrypted
> > and any replay of it will need their passphrase key to unlock it, or the
> > entire system will be unrecoverable.
> 
> I'm not sure I understand why WAL replay needs the passphrase to work.
> Why isn't the data saved in WAL already encrypted, which can be applied
> as raw bytes to each data block, without needing to decrypt anything?
> Only if somebody wants to interpret the bytes they need the passphrase,
> no?

Uh, well, you have the WAL record, and you want to write it to an 8k
page.  You have to read the 8k page from disk into shared buffers, and
you have to decrypt the 8k page to do that, right?  We aren't going to
store 8k pages encrypted in shared buffers, right?

If you did want to do that, or wanted to write them to disk without
decrypting the 8k page, it still would not work since AES is a 16-byte
encryption cipher.  I don't think we can break 8k pages up into 16-byte
chunks and be sure we can just place data into those 16-byte boundaries.

Also, that assumes that we are only encrypting the WAL payload, and not
the relation oids or other header information, which I think is a
mistake because it will lead to information leakage.

You can use AES in stream cipher mode, but then the ordering of the
encryption is critical and you can't move 16-byte chunks around --- they
have to be decrypted in their encrypted order.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Stephen Frost wrote:

> I had been specifically thinking of tablespaces because we might be able
> to do something exactly along these lines- keep which tablespace the
> data is in directly in the WAL (and not encrypted), but then have the
> data itself be encrypted, and with the key for that tablespace.

Hmm, I was imagining that the user-level data is encrypted, while the
metadata such as the containing relfilenode is not encrypted and thus
can be read by system processes such as checkpointer or WAL-apply
without needing to decrypt anything.  Maybe I'm just lacking imagination
for an attack that uses that unencrypted metadata, though.

> Splitting the WAL by tablespace would be even nicer, of course... :)

Hmm, I think you would have to synchronize the apply anyway (i.e. not
replay in one tablespace ahead of a record in another tablespace with an
earlier LSN.)  What are you thinking are the gains of doing that, anyway?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> On 2019-Jul-05, Bruce Momjian wrote:
> 
> > What people really want with more-granular-than-cluster encryption is
> > the ability to supply their passphrase key _when_ they want to access
> > their data, and then leave and be sure their data is secure from
> > decryption.  That will not be possible since the WAL will be encrypted
> > and any replay of it will need their passphrase key to unlock it, or the
> > entire system will be unrecoverable.
> 
> I'm not sure I understand why WAL replay needs the passphrase to work.
> Why isn't the data saved in WAL already encrypted, which can be applied
> as raw bytes to each data block, without needing to decrypt anything?
> Only if somebody wants to interpret the bytes they need the passphrase,
> no?

I had been specifically thinking of tablespaces because we might be able
to do something exactly along these lines- keep which tablespace the
data is in directly in the WAL (and not encrypted), but then have the
data itself be encrypted, and with the key for that tablespace.

Splitting the WAL by tablespace would be even nicer, of course... :)

Thanks!

Stephen




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Bruce Momjian wrote:

> What people really want with more-granular-than-cluster encryption is
> the ability to supply their passphrase key _when_ they want to access
> their data, and then leave and be sure their data is secure from
> decryption.  That will not be possible since the WAL will be encrypted
> and any replay of it will need their passphrase key to unlock it, or the
> entire system will be unrecoverable.

I'm not sure I understand why WAL replay needs the passphrase to work.
Why isn't the data saved in WAL already encrypted, which can be applied
as raw bytes to each data block, without needing to decrypt anything?
Only if somebody wants to interpret the bytes they need the passphrase,
no?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Sun, Jun 16, 2019 at 07:07:20AM -0400, Joe Conway wrote:
> On 6/15/19 9:28 PM, Bruce Momjian wrote:
> > There are no known non-exhaustive plaintext attacks on AES:
> > 
> > 
> > https://crypto.stackexchange.com/questions/1512/why-is-aes-resistant-to-known-plaintext-attacks
> 
> Even that non-authoritative stackexchange thread has varying opinions.
> Surely you don't claim that limiting know plaintext as much as is
> practical is a bad idea in general.

AES is used to encrypt TLS/https, and web traffic is practically always
mostly-known plaintext.  I don't know of any cases where only part of a
webpage is encrypted by TLS to avoid encrypting known plaintext.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: [Proposal] Table-level Transparent Data Encryption (TDE) and Key Management Service (KMS)

2019-07-05 Thread Bruce Momjian
On Sun, Jun 16, 2019 at 03:57:46PM -0400, Stephen Frost wrote:
> Greetings,
> 
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Sun, Jun 16, 2019 at 12:42:55PM -0400, Joe Conway wrote:
> > > On 6/16/19 9:45 AM, Bruce Momjian wrote:
> > > > On Sun, Jun 16, 2019 at 07:07:20AM -0400, Joe Conway wrote:
> > > >> In any case it doesn't address my first point, which is limiting the
> > > >> volume encrypted with the same key. Another valid reason is you might
> > > >> have data at varying sensitivity levels and prefer different keys be
> > > >> used for each level.
> > > > 
> > > > That seems quite complex.
> > > 
> > > How? It is no more complex than encrypting at the tablespace level
> > > already gives you - in that case you get this property for free if you
> > > care to use it.
> > 
> > All keys used to encrypt WAL data must be unlocked at all times or crash
> > recovery, PITR, and replication will not stop when it hits a locked key.
> > Given that, how much value is there in allowing a key per tablespace?
> 
> There's a few different things to discuss here, admittedly, but I don't
> think it means that there's no value in having a key per tablespace.
> 
> Ideally, a given backend would only need, and only have access to, the
> keys for the tablespaces that it is allowed to operate on.  I realize
> that's a bit farther than what we're talking about today, but hopefully
> not too much to be able to consider.

What people really want with more-granular-than-cluster encryption is
the ability to supply their passphrase key _when_ they want to access
their data, and then leave and be sure their data is secure from
decryption.  That will not be possible since the WAL will be encrypted
and any replay of it will need their passphrase key to unlock it, or the
entire system will be unrecoverable.

This is a fundamental issue, and will eventually doom any more granular
encryption approach, unless we want to use the same key for all
encrypted tablespaces, create separate WALs for each tablespace, or say
recovery of some tablespaces will fail.  I doubt any of those will be
acceptable.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Use relative rpath if possible

2019-07-05 Thread Tom Lane
Peter Eisentraut  writes:
> rebased patch attached, no functionality changes

I poked at this a bit, and soon found that it fails check-world,
because the isolationtester binary is built with an rpath that
only works if it's part of the temp install tree, which it ain't.

/home/postgres/pgsql/src/test/isolation/isolationtester: error while loading 
shared libraries: libpq.so.5: cannot open shared object file: No such file or 
directory

(The cfbot seems to get past that, for reasons that are entirely unclear
to me; but it falls over later in the ecpg tests with what's presumably
the same problem.)

While there might be some argument for making isolationtester part
of the installed set of executables, that approach certainly doesn't
scale; we can't insist that every test tool should be part of the
installation.

So I think we need some more-intelligent rule about when to apply
the relative rpath.  Which in turn seems to mean we still need to
set up LD_LIBRARY_PATH in some cases.

Another thing I noticed is that if, say, you build a new version
of psql and try to test it out with "./psql ...", that doesn't work
anymore (whereas today, it does work as long as you installed libpq
earlier).  That might be acceptable collateral damage, but it's not
very desirable IMO.

I'm also slightly concerned that this effectively mandates that every
library we install be immediately in $(libdir), never subdirectories
thereof; else it'd need more than one ".." in its rpath and there's no
way to adjust that.  That's not a showstopper problem probably, because
we have no such libraries today, but I wonder if somebody would want
some in the future.

A possible partial solution to these issues is to make the rpath look
like $ORIGIN/../lib and then the normal absolute rpath.  But that
doesn't fix the problem for non-installed binaries used in check-world
with no pre-existing installation.

>> (Yes, something for macOS would be nice, to work around SIP issues, but
>> I'll leave that as a separate future item.)

TBH, I think that supporting macOS with SIP enabled is really the
only interesting case here.  On these other platforms, changing this
won't fix anything very critical, and it seems like it will make
some cases worse.

regards, tom lane




Re: Periods

2019-07-05 Thread Paul A Jungwirth
On Thu, Jul 4, 2019 at 11:44 AM Alvaro Herrera  wrote:
> I think that the functionality in your patch is already integrated in
> Paul's patch for temporal PK/FK elsewhere ... is that correct, or is
> this patch orthogonal to that work?

Hi Vik, I wasn't aware that you had moved your work over to an
extension. It looks like you've made a lot of progress! I'd be eager
to work with you on getting this into core.

Alvaro: it's a mix of orthogonal and not orthogonal. :-) My work lets
you use range types directly in SQL:2011 constructs, whereas Vik's
work lets you use SQL:2011 PERIODs. I would like to support *both*, so
I've been intending to add Vik's original patch letting you define
PERIODs to my own work.

There is no conflict in supporting both ranges and PERIODs because you
can't name a PERIOD the same as an existing column.

Behind the scenes the easiest way to implement PERIODs is with range
types, so there would be very little duplication to permit both.

PERIODs suffer from being outside relational theory as a
quasi-attribute. You especially see this in composing queries & result
sets. For example with ranges you could say this:

WITH x AS (
  SELECT * FROM y
  FOR PORTION OF some_range FROM t1 TO t2
)
SELECT * FROM x
FOR PORTION OF some_range FROM t3 TO t4
;

But with PERIODs you can't because a PERIOD is not included in `SELECT
*`. Also it's unclear how "attached" it is to a table definition. Can
you say `SELECT *, some_period FROM x`? If so can you then use `FOR
PORTION OF` on that result set? Can you construct an on-the-fly PERIOD
expression? Can you pass a period to a function as a parameter? Can a
function return a period? Can you ORDER BY a period? GROUP BY one? Can
you cast to/from a period? Can you ask a period for its high/low
values? Do we treat a PERIOD as a whole new datatype? Can you define a
real column of type PERIOD? I haven't found text from the standard
that answers most of these questions. (The standard does say you can
construct a `PERIOD(t1, t2)` expression but apparently only inside a
"period predicate".)

Also you can't define PERIODs on types other than
date/timestamp/timestamptz, unlike ranges.

Also PERIODs require a sentinel value to mean "unbounded" (e.g.
31-JAN-) whereas ranges let you express that with a NULL.
(Postgres does have Infinity and -Infinity for timestamp types but
I've noticed that client programming languages can't always express
ranges with those values.)

Personally I intend to use ranges any time I build temporal tables,
but supporting PERIODs might have value for people more interested in
database portability or someone migrating from elsewhere to Postgres.

I had some conversations at PGCon that I felt validated the
permit-PERIODS-or-ranges approach, so I'm about ready to expand my
patch to handle PERIODs too. For that I would love to draw on Vik's
work so far. I think his original patch against core is more likely to
be helpful than the extension, but I'll certainly consult both, and
Vik if you have any advice let me know! :-)

A big difference between a temporal extension vs temporal features in
core is implementing DML. An extension pretty much requires you to use
INSTEAD OF triggers. Also Vik's README points out that implementing
temporal DELETE is hard that way. In core I believe you'd do temporal
DML in the executor node. (That's my working theory anyway; I'm still
new to that part of the code.)

The first thing on my TODO list is to write a blog post comparing how
other RDMBSes handle PERIODs and other temporal features. Besides the
questions above, how does a trigger work on a table? For example when
you DELETE in the middle of a range/period, and it becomes an INSERT
plus an UPDATE, I *believe* you still fire the DELETE trigger. And you
need to set the NEW/OLD tuples appropriately. You *don't* fire any
INSERT & UPDATE triggers. The standard isn't super explicit but that's
my take on it, and I want to write down what other vendors are doing.

Yours,
Paul




Re: [HACKERS] [WIP] Effective storage of duplicates in B-tree index.

2019-07-05 Thread Bruce Momjian
On Thu, Jul  4, 2019 at 05:06:09PM -0700, Peter Geoghegan wrote:
> This result is very impressive. We'll need to revisit what the right
> trade-off is for the compression scheme, which Heikki had some
> thoughts on when we left off 3 years ago, but that should be a lot
> easier now. I am very encouraged by the fact that this relatively
> simple approach already works quite nicely. It's also great to see
> that bulk insertions with lots of compression are very clearly faster
> with this latest revision of your patch, unlike earlier versions from
> 2016 that made those cases slower (though I haven't tested indexes
> that don't really use compression). I think that this is because you
> now do the compression lazily, at the point where it looks like we may
> need to split the page. Previous versions of the patch had to perform
> compression eagerly, just like GIN, which is not really appropriate
> for nbtree.

I am also encouraged and am happy we can finally move this duplicate
optimization forward.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +




Re: Extending PostgreSQL with a Domain-Specific Language (DSL) - Development

2019-07-05 Thread Tomas Vondra

On Fri, Jul 05, 2019 at 07:55:15AM +, Tom Mercha wrote:

Dear Hackers

I am interested in implementing my own Domain Specific Language (DSL)
using PostgreSQL internals. Originally, the plan was not to use
PostgreSQL and I had developed a grammar and used ANTLRv4 for parser
work and general early development.

Initially, I was hoping for a scenario where I could have PostgreSQL's
parser to change grammar (e.g. SET parser_language=SQL vs. SET
parser_language=myDSL) in which case my ANTLRv4 project would override
the PostgreSQL parser module. I guess another direction that my project
could take is to extend PostgreSQL's SQL parser to factor in my DSL
keywords and requirements.

To make matters more complicated, this version of ANTLR does not
support code generation to C, but it does support generation to C++.
Integrating the generated C++ code requires making it friendly to
PostgreSQL e.g. using Plain Old Data Structures as described here
https://www.postgresql.org/docs/9.0/extend-cpp.html, which seems to be
suggesting to me that I may be using the wrong approach towards my
goal.

I would be grateful if anyone could provide any general advice or
pointers regarding my approach, for example regarding development
effort, so that the development with PostgreSQL internals can be smooth
and of a high quality. Maybe somebody has come across another DSL
attempt which used PostgreSQL and that I could follow as a reference?



I might be missing something, but it seems like you intend to replace
the SQL grammar we have with something else. It's not clear to me what
would be the point of doing that, and it definitely looks like a huge
amount of work - e.g. we don't have any support for switching between
two distinct grammars the way you envision, and just that alone seems
like a multi-year project. And if you don't have that capability, all
external tools kinda stop working. Good luck with running such database.

What I'd look at first is implementing the grammar as a procedural
language (think PL/pgSQL, pl/perl etc.) implementing whatever you expect
from your DSL. And it's not like you'd have to wrap everything in
functions, because we have anonymous DO blocks. So you could do:

 DO LANGUAGE mydsl $$
... whatever my dsl allows ...
 $$; 


It's still a fair amount of code to implement this (both the PL handler
and the DSL implementation), but it's orders of magnitude simpler than
what you described.

See https://www.postgresql.org/docs/current/plhandler.html for details
about how to write a language handler.
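
The skeleton of such a handler is fairly small. A rough sketch follows
(mydsl_execute() is a stand-in for whatever entry point your
ANTLR-generated interpreter provides; a DO block additionally needs an
inline handler registered via the INLINE clause of CREATE LANGUAGE):

    #include "postgres.h"

    #include "catalog/pg_proc.h"
    #include "fmgr.h"
    #include "utils/builtins.h"
    #include "utils/syscache.h"

    PG_MODULE_MAGIC;

    /* hypothetical entry point into the DSL interpreter */
    extern void mydsl_execute(const char *source);

    PG_FUNCTION_INFO_V1(mydsl_call_handler);

    Datum
    mydsl_call_handler(PG_FUNCTION_ARGS)
    {
        Oid         funcoid = fcinfo->flinfo->fn_oid;
        HeapTuple   proctup;
        Datum       prosrc;
        bool        isnull;

        /* fetch the function's source text from pg_proc */
        proctup = SearchSysCache1(PROCOID, ObjectIdGetDatum(funcoid));
        if (!HeapTupleIsValid(proctup))
            elog(ERROR, "cache lookup failed for function %u", funcoid);
        prosrc = SysCacheGetAttr(PROCOID, proctup,
                                 Anum_pg_proc_prosrc, &isnull);
        if (isnull)
            elog(ERROR, "null prosrc for function %u", funcoid);

        /* hand the source text to the DSL implementation */
        mydsl_execute(TextDatumGetCString(prosrc));

        ReleaseSysCache(proctup);
        PG_RETURN_VOID();
    }

The C function is then declared with CREATE FUNCTION ... RETURNS
language_handler and the language registered with CREATE LANGUAGE ...
HANDLER, as the linked page describes.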


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Optimize partial TOAST decompression

2019-07-05 Thread Binguo Bao
On Fri, Jul 5, 2019 at 1:46 AM, Tomas Vondra wrote:

> I've done a bit of testing and benchmaring on this patch today, and
> there's a bug somewhere, making it look like there are corrupted data.
>
> What I'm seeing is this:
>
> CREATE TABLE t (a text);
>
> -- attached is data for one row
> COPY t FROM '/tmp/t.data';
>
>
> SELECT length(substr(a,1000)) from t;
> psql: ERROR:  compressed data is corrupted
>
> SELECT length(substr(a,0,1000)) from t;
>  length
> 
> 999
> (1 row)
>
> SELECT length(substr(a,1000)) from t;
> psql: ERROR:  invalid memory alloc request size 2018785106
>
> That's quite bizarre behavior - it does work with a prefix, but not with
> suffix. And the exact ERROR changes after the prefix query. (Of course,
> on master it works in all cases.)
>
> The backtrace (with the patch applied) looks like this:
>
> #0  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2291
> #1  toast_decompress_datum (attr=0x12572e0) at tuptoaster.c:2277
> #2  0x004c3b08 in heap_tuple_untoast_attr_slice (attr=<optimized out>,
> sliceoffset=0, slicelength=-1) at tuptoaster.c:315
> #3  0x0085c1e5 in pg_detoast_datum_slice (datum=<optimized out>,
> first=<optimized out>, count=<optimized out>) at fmgr.c:1767
> #4  0x00833b7a in text_substring (str=133761519127512, start=0,
> length=<optimized out>, length_not_specified=<optimized out>) at
> varlena.c:956
> ...
>
> I've only observed this with a very small number of rows (the  data is
> generated randomly with different compressibility etc.), so I'm only
> attaching one row that exhibits this issue.
>
> My guess is toast_fetch_datum_slice() gets confused by the headers or
> something, or something like that. FWIW the new code added to this
> function does not adhere to our code style, and would deserve some
> additional explanation of what it's doing/why. Same for the
> heap_tuple_untoast_attr_slice, BTW.
>
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>

Hi, Tomas!
Thanks for your testing and the suggestion.

That's quite bizarre behavior - it does work with a prefix, but not with
> suffix. And the exact ERROR changes after the prefix query.


I think the bug is caused by frame #2, "heap_tuple_untoast_attr_slice
(attr=<optimized out>, sliceoffset=0, slicelength=-1) at
tuptoaster.c:315",
since I ignored the case where slicelength is negative, and I've added
some comments to heap_tuple_untoast_attr_slice for that case.

FWIW the new code added to this
> function does not adhere to our code style, and would deserve some
> additional explanation of what it's doing/why. Same for the
> heap_tuple_untoast_attr_slice, BTW.


I've added more comments to explain the code's behavior.
Besides, I also modified the macro "TOAST_COMPRESS_RAWDATA" to
"TOAST_COMPRESS_DATA" since
it is used to get toast compressed data rather than raw data.
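
For readers following the patch, the bound that a function like
pglz_maximum_compressed_size() can use follows from the pglz format:
each control byte covers at most eight items, and an incompressible byte
costs one literal byte. A sketch of that calculation (not necessarily
the patch's exact code; the function name here is made up):

    /*
     * Upper bound on the number of compressed bytes needed to reconstruct
     * the first rawsize bytes of output: worst case, every item is a
     * one-byte literal plus one control byte per eight items, and we can
     * never need more than the whole compressed datum.
     */
    static int32
    prefix_compressed_size_bound(int32 rawsize, int32 total_compressed_size)
    {
        int64   bound = ((int64) rawsize * 9 + 8) / 8;

        return (int32) (bound < total_compressed_size ? bound
                                                      : total_compressed_size);
    }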

Best Regards, Binguo Bao.
From bcf04278b4d5956cd5f5fdab0d954b36adfd0022 Mon Sep 17 00:00:00 2001
From: BBG 
Date: Sun, 2 Jun 2019 19:18:46 +0800
Subject: [PATCH] Optimize partial TOAST decompression

---
 src/backend/access/heap/tuptoaster.c | 57 
 src/common/pg_lzcompress.c   | 26 
 src/include/common/pg_lzcompress.h   |  1 +
 3 files changed, 72 insertions(+), 12 deletions(-)

diff --git a/src/backend/access/heap/tuptoaster.c b/src/backend/access/heap/tuptoaster.c
index 55d6e91..1eb6cca 100644
--- a/src/backend/access/heap/tuptoaster.c
+++ b/src/backend/access/heap/tuptoaster.c
@@ -61,7 +61,8 @@ typedef struct toast_compress_header
  */
 #define TOAST_COMPRESS_HDRSZ		((int32) sizeof(toast_compress_header))
 #define TOAST_COMPRESS_RAWSIZE(ptr) (((toast_compress_header *) (ptr))->rawsize)
-#define TOAST_COMPRESS_RAWDATA(ptr) \
+#define TOAST_COMPRESS_SIZE(ptr)	((int32) VARSIZE(ptr) - TOAST_COMPRESS_HDRSZ)
+#define TOAST_COMPRESS_DATA(ptr) \
 	(((char *) (ptr)) + TOAST_COMPRESS_HDRSZ)
 #define TOAST_COMPRESS_SET_RAWSIZE(ptr, len) \
 	(((toast_compress_header *) (ptr))->rawsize = (len))
@@ -252,6 +253,8 @@ heap_tuple_untoast_attr(struct varlena *attr)
  *
  *		Public entry point to get back part of a toasted value
  *		from compression or external storage.
+ *
+ * Note if slicelength is negative, return suffix of the value.
  * --
  */
 struct varlena *
@@ -273,8 +276,23 @@ heap_tuple_untoast_attr_slice(struct varlena *attr,
 		if (!VARATT_EXTERNAL_IS_COMPRESSED(toast_pointer))
 			return toast_fetch_datum_slice(attr, sliceoffset, slicelength);
 
-		/* fetch it back (compressed marker will get set automatically) */
-		preslice = toast_fetch_datum(attr);
+		/*
+		 * If only need the prefix slice, fetch enough datums to decompress.
+		 * Otherwise, fetch all datums.
+		 */
+		if (slicelength > 0 && sliceoffset >= 0)
+		{
+			int32 max_size;
+			max_size = pglz_maximum_compressed_size(sliceoffset + slicelength,
+	TOAST_COMPRESS_SIZE(attr));
+			/*
+			 * Be sure to get enough compressed slice
+			 * and compressed marker will get 

Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Fri, Jul 5, 2019 at 10:57 AM Paul A Jungwirth
 wrote:
> I take it that a multirange contains of *disjoint* ranges,

*consists* of. :-)




Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Fri, Jul 5, 2019 at 10:45 AM David Fetter  wrote:
> If I understand the cases correctly, the combination of covering_range
> and multi_range types covers all cases. To recap, covering_range_agg
> assigns a weight, possibly 0, to each non-overlapping sub-range.  A
> cast from covering_range to multi_range would meld adjacent ranges
> with non-zero weights into single ranges in O(N) (N sub-ranges) time
> and some teensy amount of memory for tracking progress of adjacent
> ranges of non-zero weight.  Gaps would just be multi_range consisting
> of the sub-ranges of the covering_range with weight 0, and wouldn't
> require any tracking.

I take it that a multirange contains of *disjoint* ranges, so instead
of {[1,2), [2,3), [6,7)} you'd have {[1,3), [6,7)}. Jeff does that
match your expectation?
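A sketch of what that would look like, assuming a hypothetical
multirange-returning range_agg (the output notation is only illustrative):

SELECT range_agg(r)
FROM (VALUES ('[1,2)'::int4range),
             ('[2,3)'::int4range),
             ('[6,7)'::int4range)) AS t(r);
-- would produce the multirange {[1,3), [6,7)}, not three separate ranges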

I just realized that since weighted_range_agg and covering_range_agg
return tuples of (anyrange, integer) (maybe other numeric types too?),
their elements are *not ranges*, so they couldn't return a multirange.
They would have to return an array of those tuples.

I agree that if you had a pre-sorted list of weighted ranges (with or
without zero-weight elements), you could convert it to a multirange in
O(n) very easily.

Paul




Re: range_agg

2019-07-05 Thread David Fetter
On Fri, Jul 05, 2019 at 09:58:02AM -0700, Paul A Jungwirth wrote:
> On Mon, Jul 1, 2019 at 3:38 PM Jeff Davis  wrote:
> >
> > For getting into core though, it should be a more complete set of
> > related operations. The patch is implicitly introducing the concept of
> > a "multirange" (in this case, an array of ranges), but it's not making
> > the concept whole.
> >
> > What else should return a multirange? This patch implements the union-
> > agg of ranges, but we also might want an intersection-agg of ranges
> > (that is, the set of ranges that are subranges of every input). Given
> > that there are other options here, the name "range_agg" is too generic
> > to mean union-agg specifically.
> 
> Thanks for the review!
> 
> I like the idea of adding a multirange type that works like range
> types, although I'm not sure I want to build it. :-)
> 
> My own motivations for the range_agg patch are for temporal databases,
> where I have two use-cases: checking temporal foreign keys [1] and
> doing a Snodgrass "coalesce" operation [2]. The FKs use-case is more
> important. For coalesce I just immediately UNNEST the array, so a
> multirange would sort of "get in the way". It's no big deal if I can
> cast a multirange to an array, although for something that would run
> on every INSERT/UPDATE/DELETE I'd like to understand what the cast
> would cost us in performance. But coalesce isn't part of SQL:2011 and
> should be optional behavior or just something in an extension. The FKs
> use case matters to me a lot more, and I think a multirange would be
> fine for that. Also a multirange would solve the generics problem
> Pavel asked about. (I'll say more about that in a reply to him.)
> 
> I'm not convinced that an intersection aggregate function for
> multiranges would be used by anyone, but I don't mind building one.
> Every other *_agg function has an "additive" sense, not a
> "subtractive" sense. json{,b}_object_agg are the closest since you
> *could* imagine intersection semantics for those, but they are unions.
> So in terms of *naming* I think using "range_agg" for union semantics
> is natural and would fit expectations. (I'm not the first to name this
> function range_agg btw: [3]).
> 
> But there is clearly more than one worthwhile way to aggregate ranges:
> 
> - David suggested weighted_range_agg and covering_range_agg. At PGCon

If I understand the cases correctly, the combination of covering_range
and multi_range types covers all cases. To recap, covering_range_agg
assigns a weight, possibly 0, to each non-overlapping sub-range.  A
cast from covering_range to multi_range would meld adjacent ranges
with non-zero weights into single ranges in O(N) (N sub-ranges) time
and some teensy amount of memory for tracking progress of adjacent
ranges of non-zero weight.  Gaps would just be multi_range consisting
of the sub-ranges of the covering_range with weight 0, and wouldn't
require any tracking.
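A purely hypothetical sketch of that, reading the weight as the number of
input ranges covering each sub-range (neither aggregate exists yet, and the
output shape is only illustrative):

SELECT covering_range_agg(r)
FROM (VALUES ('[1,4)'::int4range), ('[2,6)'::int4range)) AS t(r);
-- conceptually: ([1,2), 1), ([2,4), 2), ([4,6), 1)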

What have I missed?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Mon, Jul 1, 2019 at 3:38 PM Jeff Davis  wrote:
>
> The patch is implicitly introducing the concept of
> a "multirange" (in this case, an array of ranges),

I meant to say before: this patch always returns a sorted array, and I
think a multirange should always act as if sorted when we stringify it
or cast it to an array. If you disagree let me know. :-)

You could imagine that when returning arrays we rely on the caller to
do the sorting (range_agg(r ORDER BY r)) and otherwise give wrong
results. But hopefully everyone agrees that would not be nice. :-) So
even the array-returning version should always return a sorted array I
think. (I'm not sure anything else is really coherent or at least easy
to describe.)
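In other words, assuming the patch's three-argument, array-returning
range_agg, something like this sketch should hold:

SELECT range_agg(r, true, true)
FROM (VALUES ('[6,7)'::int4range), ('[1,2)'::int4range)) AS t(r);
-- should return {"[1,2)","[6,7)"}, sorted even though the input order is reversed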

Paul




Re: mcvstats serialization code is still shy of a load

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Tom Lane wrote:

> FWIW, I don't think there's a need for every catversion on the back branch
> to look older than any catversion on HEAD.  The requirement so far as the
> core code is concerned is only for non-equality.  Now, extension code does
> often do something like "if catversion >= xxx", but in practice they're
> only concerned about numbers used by released versions.

pg_upgrade also uses >= catversion comparison for a couple of things.  I
don't think it affects this case, but it's worth keeping in mind.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-05 Thread Julien Rouhaud
On Fri, Jul 5, 2019 at 6:16 PM Peter Eisentraut
 wrote:
>
> On 2019-07-02 10:30, Julien Rouhaud wrote:
> > That's a great idea, and would make the parallelism in reindexdb much
> > simpler.  There's however a downside, as users won't have a way to
> > benefit from index filtering until they upgrade to this version.  OTOH
> > glibc 2.28 is already there, and a hypothetical fancy reindexdb is far
> > from being released.
>
> Isn't that also the case for your proposal?  We are not going to release
> a new reindexdb before a new REINDEX.

Sure, but my point was that once the new reindexdb is released (or if
you're so desperate, using a nightly build or compiling your own), it
can be used against any previous major version.  There is probably a
large fraction of users who don't perform a postgres upgrade when they
upgrade their OS, so that's IMHO also something to consider.




Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Fri, Jul 5, 2019 at 4:31 AM Pavel Stehule  wrote:
> The first issue is unstable regress tests - there is a problem with opr_sanity

I would prefer to avoid needing to add anything to opr_sanity really.
A multirange would let me achieve that I think. But otherwise I'll add
the ordering. Thanks!

> +   rangeTypeId = get_fn_expr_argtype(fcinfo->flinfo, 1);
> +   if (!type_is_range(rangeTypeId))
> +   {
> +   ereport(ERROR, (errmsg("range_agg must be called with a range")));
> +   }

I think this was left over from my attempts to support user-defined
ranges. I believe I can remove it now.

> +# Making 2- and 3-param range_agg polymorphic is difficult
> +# because it would take an anyrange and return an anyrange[],
> +# which doesn't exist.
> +# As a workaround we define separate functions for each built-in range type.
> +# This is what causes the mess in src/test/regress/expected/opr_sanity.out.
> +{ oid => '8003', descr => 'aggregate transition function',
>
> This is strange. Does it means so range_agg will not work with custom range 
> types?

You would have to define your own range_agg with your own custom type
in the signature. I gave instructions about that in my range_agg
extension (https://github.com/pjungwir/range_agg), but it's not as
easy with range_agg in core because I don't think you can define a
custom function that is backed by a built-in C function. At least I
couldn't get that to work.

The biggest argument for a multirange to me is that we could have
anymultirange, with similar rules as anyarray. That way I could have
`range_agg(r anyrange) RETURNS anymultirange`. [1] is a conversation
where I asked about doing this before multiranges were suggested. Also
I'm aware of your own recent work on polymorphic types at [2] but I
haven't read it closely enough to see if it would help me here. Do you
think it applies?

> I am not sure about the implementation. I think accumulating all ranges,
> sorting, and processing at the end can be memory and CPU expensive.

I did some research and couldn't find any algorithm that was better
than O(n log n), but if you're aware of any I'd like to know about it.
Assuming we can't improve on the complexity bounds, I think a sort +
iteration is desirable because it keeps things simple and benefits
from past & future work on the sorting code. I care more about
optimizing time here than RAM, especially since we'll use this in FK
checks, where the inputs will rarely be more than a few and probably
never in the millions.

We especially want to avoid an O(n^2) algorithm, which would be easy
to stumble into if we naively accumulated the result as we go. For
example if we had multiranges you could imagine an implementation that
just did `result + r` for all inputs r. But that would have the same
n^2 pattern as looping over strcat.

We could use a tree to keep things sorted and accumulate as we go, but
the implementation would be more complicated and I think slower.

Thanks for the review!

Paul

[1] 
https://www.postgresql.org/message-id/CA%2BrenyVOjb4xQZGjdCnA54-1nzVSY%2B47-h4nkM-EP5J%3D1z%3Db9w%40mail.gmail.com

[2] 
https://www.postgresql.org/message-id/cafj8prdna7vqni8gr+tt2ktmz0cq5g93guc3sbn_nvpldxa...@mail.gmail.com




Re: SHOW CREATE

2019-07-05 Thread Corey Huinker
On Fri, Jul 5, 2019 at 12:32 PM David Fetter  wrote:

> Folks,
>
> Corey Huinker put together the documentation for this proposed
> feature. Does this seem like a reasonable way to do it?
>
>
In doing that work, it became clear that the command was serving two
masters:
1. A desire to see the underlying nuts and bolts of a given database object.
2. A desire to essentially make the schema portion of pg_dump a server side
command.

To that end, I see splitting this into two commands, SHOW CREATE and SHOW
DUMP.

SHOW DUMP would the original command minus the object type and object name
specifier, and it would dump the entire current database as seen from the
current user (again, no data).

SHOW CREATE would still have all the object_type parameters as before, but
would only dump the one specified object, plus any dependent objects
specified in the WITH options (comments, grants, indexes, constraints,
partitions, all).
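Hypothetical usage under that split (neither command exists yet; the option
names follow the draft documentation, and the table name is made up):

-- one object plus selected dependent objects
SHOW CREATE TABLE public.accounts WITH (comments, privileges);

-- schema-only dump of the whole current database, as seen by the current user
SHOW DUMP;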

Please note that any talk of a server side DESCRIBE is separate from this.
That would be a series of commands that would have result sets tailored to
the object type, and each one would be an inherent compromise between
completeness and readability.

I'd like to hear what others have to say, and incorporate that feedback
into a follow up proposal.


Re: Fix runtime errors from -fsanitize=undefined

2019-07-05 Thread Tom Lane
Peter Eisentraut  writes:
> On 2019-07-05 01:33, Noah Misch wrote:
>> I just saw this proposal.  The undefined behavior in question is strictly
>> academic.  These changes do remove the need for new users to discover
>> -fno-sanitize=nonnull-attribute, but they make the code longer and no 
>> clearer.
>> Given the variety of code this touches, I expect future commits will
>> reintroduce the complained-of usage patterns, prompting yet more commits to
>> restore the invariant achieved here.  Hence, I'm -0 for this change.

> This sanitizer has found real problems in the past.  By removing these
> trivial issues we can then set up a build farm animal or similar to
> automatically check for any new issues.

I think Noah's point is just that we can do that with the addition of
-fno-sanitize=nonnull-attribute.  I agree with him that it's very
unclear why we should bother to make the code clean against that
specific subset of warnings.

regards, tom lane




Re: mcvstats serialization code is still shy of a load

2019-07-05 Thread Tom Lane
Tomas Vondra  writes:
> I've pushed the REL_12_STABLE backpatches too, now. I've ended up using
> 201907031 and 201907032 - those values precede the first catversion bump
> in master (201907041), so the back branch looks "older". And there's a
> bit of slack for additional bumps (in the unlikely case we need them).

FWIW, I don't think there's a need for every catversion on the back branch
to look older than any catversion on HEAD.  The requirement so far as the
core code is concerned is only for non-equality.  Now, extension code does
often do something like "if catversion >= xxx", but in practice they're
only concerned about numbers used by released versions.  HEAD's catversion
will be strictly greater than v12's soon enough, even if you had made it
not so today.  So I think sticking to today's-date-with-some-N is better
than artificially assigning other dates.

What's done is done, and there's no need to change it, but now you
know what to do next time.

> We might have "fixed" this by backpatching the commit with the extra
> catversion bump (7b925e12) but the commit seems a bit too large for
> that. It's fairly isolated though. But it seems like a bad practice.

Yeah, that approach flies in the face of the notion of feature freeze.

regards, tom lane




Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Thu, Jul 4, 2019 at 11:34 AM Alvaro Herrera  wrote:
>
> I noticed that this patch has a // comment about it segfaulting.  Did
> you ever figure that out?  Is the resulting code the one you intend as
> final?

Thanks for the review! I haven't revisited it but I'll see if I can
track it down. I consider this a WIP patch, not something final. (I
don't think Postgres likes C++-style comments, so anything that is //
marks something I consider needs more work.)

> Stylistically, the code does not match pgindent's choices very closely.
> I can return a diff to a pgindented version of your v0002 for your
> perusal, if it would be useful for you to learn its style.

Sorry about that, and thank you for making it easier for me to learn
how to do it the right way. :-)

Paul




Re: Fix runtime errors from -fsanitize=undefined

2019-07-05 Thread Noah Misch
On Fri, Jul 05, 2019 at 06:14:31PM +0200, Peter Eisentraut wrote:
> On 2019-07-05 01:33, Noah Misch wrote:
> > I just saw this proposal.  The undefined behavior in question is strictly
> > academic.  These changes do remove the need for new users to discover
> > -fno-sanitize=nonnull-attribute, but they make the code longer and no 
> > clearer.
> > Given the variety of code this touches, I expect future commits will
> > reintroduce the complained-of usage patterns, prompting yet more commits to
> > restore the invariant achieved here.  Hence, I'm -0 for this change.
> 
> This sanitizer has found real problems in the past.  By removing these
> trivial issues we can then set up a build farm animal or similar to
> automatically check for any new issues.

Has it found one real problem that it would not have found given
"-fno-sanitize=nonnull-attribute"?  I like UBSan in general, but I haven't
found a reason to prefer plain "-fsanitize=undefined" over
"-fsanitize=undefined -fno-sanitize=nonnull-attribute".




Re: range_agg

2019-07-05 Thread Paul A Jungwirth
On Mon, Jul 1, 2019 at 3:38 PM Jeff Davis  wrote:
>
> For getting into core though, it should be a more complete set of
> related operations. The patch is implicitly introducing the concept of
> a "multirange" (in this case, an array of ranges), but it's not making
> the concept whole.
>
> What else should return a multirange? This patch implements the union-
> agg of ranges, but we also might want an intersection-agg of ranges
> (that is, the set of ranges that are subranges of every input). Given
> that there are other options here, the name "range_agg" is too generic
> to mean union-agg specifically.

Thanks for the review!

I like the idea of adding a multirange type that works like range
types, although I'm not sure I want to build it. :-)

My own motivations for the range_agg patch are for temporal databases,
where I have two use-cases: checking temporal foreign keys [1] and
doing a Snodgrass "coalesce" operation [2]. The FKs use-case is more
important. For coalesce I just immediately UNNEST the array, so a
multirange would sort of "get in the way". It's no big deal if I can
cast a multirange to an array, although for something that would run
on every INSERT/UPDATE/DELETE I'd like to understand what the cast
would cost us in performance. But coalesce isn't part of SQL:2011 and
should be optional behavior or just something in an extension. The FKs
use case matters to me a lot more, and I think a multirange would be
fine for that. Also a multirange would solve the generics problem
Pavel asked about. (I'll say more about that in a reply to him.)
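For what it's worth, the coalesce usage is essentially a sketch like this
(table and column names are made up):

SELECT room_id, unnest(range_agg(valid_at, true, true)) AS valid_at
FROM room_reservations
GROUP BY room_id;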

I'm not convinced that an intersection aggregate function for
multiranges would be used by anyone, but I don't mind building one.
Every other *_agg function has an "additive" sense, not a
"subtractive" sense. json{,b}_object_agg are the closest since you
*could* imagine intersection semantics for those, but they are unions.
So in terms of *naming* I think using "range_agg" for union semantics
is natural and would fit expectations. (I'm not the first to name this
function range_agg btw: [3]).

But there is clearly more than one worthwhile way to aggregate ranges:

- David suggested weighted_range_agg and covering_range_agg. At PGCon
2019 someone else said he has had to build something that was
essentially weighted_range_agg. I can see it used for
scheduling/capacity/max-utilization problems.
- You suggested an intersection range_agg.
- At [4] there is a missing_ranges function that only gives the *gaps*
between the input ranges.

Nonetheless I still think I would call the union behavior simply
range_agg, and then use weighted_range_agg, covering_range_agg,
intersection_range_agg, and missing_range_agg for the rest (assuming
we built them all). I'm not going to insist on any of that, but it's
what feels most user-friendly to me.

> What can we do with a multirange? A lot of range operators still make
> sense, like "contains" or "overlaps"; but "adjacent" doesn't quite
> work. What about new operations like inverting a multirange to get the
> gaps?

I can think of a lot of cool operators for `multirange \bigoplus
multirange` or `multirange \bigoplus range` (commuting of course). And
I've certainly wanted `range + range` and `range - range` in the past,
which would both return a multirange.
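For example (purely hypothetical; none of these operators exist today, and
the multirange output notation is only illustrative):

SELECT '[1,5)'::int4range + '[10,20)'::int4range;   -- would yield {[1,5), [10,20)}
SELECT '[1,10)'::int4range - '[3,5)'::int4range;    -- would yield {[1,3), [5,10)}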

> If we have a more complete set of operators here, the flags for
> handling overlapping ranges and gaps will be unnecessary.

Both of my use cases should permit overlaps & gaps (range_agg(r, true,
true)), so I'm actually pretty okay with dropping the flags entirely
and just giving a one-param function that behavior. Or defining a
strict_range_agg that offers more control. But also I don't understand
how richer *operators* make the flags for the *aggregate* unnecessary.

So I really like the idea of multiranges, but I'm reluctant to take it
on myself, especially since this patch is just a utility function for
my other temporal patches. But I don't want my rush to leave a blemish
in our standard library either. But I think what really persuades me
to add multiranges is making a range_agg that more easily supports
user-defined range types. So how about I start on it and see how it
goes? I expect I can follow the existing code for range types pretty
closely, so maybe it won't be too hard.

Another option would be to rename my function to range_array_agg (or
something similar) so that we leave space for a multirange function in
the future. I don't love this idea myself, but it could be a Plan B.
What do you think of that?

Regards,
Paul

[1] With range_agg I can make the temporal FK check use similar SQL to
the non-temporal FK check:

SELECT 1
FROM [ONLY]  x
WHERE pkatt1 = $1 [AND ...]
FOR KEY SHARE OF x

vs

SELECT 1
FROM (
SELECT range_agg(r, true, true) AS r
FROM  (
SELECT pkperiodatt AS r
FROM [ONLY] pktable x
WHERE pkatt1 = $1 [AND ...]
FOR KEY SHARE OF x
) x1
) x2
WHERE $n <@ 

Re: range_agg

2019-07-05 Thread Jeff Davis
On Fri, 2019-07-05 at 07:58 +0200, Pavel Stehule wrote:
> The question is naming - should be this agg function named
> "range_agg", and multi range agg "multirange_agg"? Personally, I have
> not a problem with range_agg, and I have not a problem so it is based
> on union operation. It is true so only result of union can be
> implemented as range simply. When I though about multi range result,
> then there are really large set of possible algorithms how to do some
> operations over two, three values.

Hi Pavel,

Can you explain in more detail? Would an intersection-based aggregate
be useful? If so, and we implement it in the future, what would we call
it?

Regards,
Jeff Davis






Re: [PATCH v4] Add \warn to psql

2019-07-05 Thread Tom Lane
Fabien COELHO  writes:
> I agree that using TAP test if another simpler option is available is not 
> a good move.

> However, in the current state, as soon as there is some variation a test 
> is removed and coverage is lost, but they could be kept if the check could 
> be against a regexp.

I'm fairly suspicious of using TAP tests just to get a regexp match.
The thing I don't like about TAP tests for this is that they won't
notice if the test case prints extra stuff beyond what you were
expecting --- at least, not without care that I don't think we usually
take.

I've thought for some time that we should steal an idea from MySQL
and extend pg_regress so that individual lines of an expected-file
could have regexp match patterns rather than being just exact matches.
I'm not really sure how to do that without reimplementing diff(1)
for ourselves :-(, but that would be a very large step forward if
we could find a reasonable implementation.

Anyway, my opinion about having TAP test(s) for psql remains that
it'll be a good idea as soon as somebody submits a test that adds
a meaningful amount of code coverage that way (and the coverage
can't be gotten more simply).  But we don't need a patch that is
just trying to get the camel's nose under the tent.

regards, tom lane




Re: log bind parameter values on error

2019-07-05 Thread Alexey Bashtanov

Please find the rebased patch attached.

Tested like the following.
Provided you're in the postgres checkout and you've run make in 
src/test/examples/ and connected to db=postgres:


CREATE SCHEMA testlibpq3;
SET search_path = testlibpq3;
CREATE TABLE test1_(i int4, t text, b bytea);
INSERT INTO test1_ VALUES(0, '', '');
CREATE VIEW test1 AS SELECT 1/i i, t, b FROM test1_;

-- will log only statement
\! ./src/test/examples/testlibpq3

ALTER SYSTEM SET log_parameters_on_error TO on;
SELECT pg_reload_conf();

-- will log statement with parameters
\! ./src/test/examples/testlibpq3

Best regards,
  Alexey
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index c91e3e1550..9f2e9bae33 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6423,6 +6423,23 @@ log_line_prefix = '%m [%p] %q%u@%d/%a '
   
  
 
+ 
+  log_parameters_on_error (boolean)
+  
+   log_parameters_on_error configuration parameter
+  
+  
+  
+   
+Controls whether the statement is logged with bind parameter values. 
+It adds some overhead, as postgres will cache textual
+representations of parameter values in memory for all statements,
+even if they eventually do not get logged. The default is
+off. Only superusers can change this setting.
+   
+  
+ 
+
  
   log_statement (enum)
   
diff --git a/src/backend/nodes/params.c b/src/backend/nodes/params.c
index f5d56138ee..5e5ccf1a4f 100644
--- a/src/backend/nodes/params.c
+++ b/src/backend/nodes/params.c
@@ -45,6 +45,7 @@ makeParamList(int numParams)
 	retval->parserSetup = NULL;
 	retval->parserSetupArg = NULL;
 	retval->numParams = numParams;
+	retval->hasTextValues = false;
 
 	return retval;
 }
@@ -58,6 +59,10 @@ makeParamList(int numParams)
  * set of parameter values.  If dynamic parameter hooks are present, we
  * intentionally do not copy them into the result.  Rather, we forcibly
  * instantiate all available parameter values and copy the datum values.
+ *
+ * Since this function is never used for copying parameters into a top portal,
+ * we don't copy textual representations, nor do we set the hasTextValues
+ * flag, so no one will access them.
  */
 ParamListInfo
 copyParamList(ParamListInfo from)
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 44a59e1d4f..3559163ac3 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -86,6 +86,12 @@
  */
 const char *debug_query_string; /* client-supplied query string */
 
+/*
+ * The top-level portal that the client is immediately working with:
+ * creating, binding, executing, or all at once using the simple protocol
+ */
+Portal current_top_portal = NULL;
+
 /* Note: whereToSendOutput is initialized for the bootstrap/standalone case */
 CommandDest whereToSendOutput = DestDebug;
 
@@ -1694,6 +1700,9 @@ exec_bind_message(StringInfo input_message)
 	else
 		portal = CreatePortal(portal_name, false, false);
 
+	Assert(current_top_portal == NULL);
+	current_top_portal = portal;
+
 	/*
 	 * Prepare to copy stuff into the portal's memory context.  We do all this
 	 * copying first, because it could possibly fail (out-of-memory) and we
@@ -1731,6 +1740,9 @@ exec_bind_message(StringInfo input_message)
 	 */
 	if (numParams > 0)
 	{
+		/* GUC value can change, so we remember its state to be consistent */
+		bool need_text_values = log_parameters_on_error;
+
 		params = makeParamList(numParams);
 
 		for (int paramno = 0; paramno < numParams; paramno++)
@@ -1798,9 +1810,31 @@ exec_bind_message(StringInfo input_message)
 
 pval = OidInputFunctionCall(typinput, pstring, typioparam, -1);
 
-/* Free result of encoding conversion, if any */
-if (pstring && pstring != pbuf.data)
-	pfree(pstring);
+if (pstring)
+{
+	if (need_text_values)
+	{
+		if (pstring == pbuf.data)
+		{
+			/*
+			 * Copy textual representation to portal context.
+			 */
+			params->params[paramno].textValue =
+			pstrdup(pstring);
+		}
+		else
+		{
+			/* Reuse the result of encoding conversion for it */
+			params->params[paramno].textValue = pstring;
+		}
+	}
+	else
+	{
+		/* Free result of encoding conversion */
+		if (pstring != pbuf.data)
+			pfree(pstring);
+	}
+}
 			}
 			else if (pformat == 1)	/* binary mode */
 			{
@@ -1826,6 +1860,22 @@ exec_bind_message(StringInfo input_message)
 			(errcode(ERRCODE_INVALID_BINARY_REPRESENTATION),
 			 errmsg("incorrect binary data format in bind parameter %d",
 	paramno + 1)));
+
+/*
+ * Compute textual representation for further logging. We waste
+ * some time and memory here, maybe one day we could skip
+ * certain types like built-in primitives, which are safe to
+ * compute later even in an aborted xact.
+ */
+if (!isNull && need_text_values)
+{
+	Oid			typoutput;
+	bool		

Re: [PATCH v4] Add \warn to psql

2019-07-05 Thread Tom Lane
I wrote:
> David Fetter  writes:
>> [ v7-0001-Add-warn-to-psql.patch ]

> I took a look at this.  I have no quibble with the proposed feature,
> and the implementation is certainly simple enough.  But I'm unconvinced
> about the proposed test scaffolding.

I pushed this with the simplified test methodology.

While I was fooling with it I noticed that the existing code for -n
is buggy.  The documentation says clearly that only the first
argument is a candidate to be -n:

If the first argument is an unquoted -n the trailing
newline is not written.

but the actual implementation allows any argument to be recognized as
-n:

regression=# \echo this -n should not be -n like this
this should not be like thisregression=# 

I fixed that, but I'm wondering if we should back-patch that fix
or leave the back branches alone.
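For reference, with the documented rule enforced (only a leading unquoted -n
is special), a session looks roughly like this sketch:

regression=# \echo this -n should not be -n like this
this -n should not be -n like this
regression=# \echo -n but a leading -n still suppresses the newline
but a leading -n still suppresses the newlineregression=# 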

regards, tom lane




Re: Fix runtime errors from -fsanitize=undefined

2019-07-05 Thread Raúl Marín Rodríguez
> This sanitizer has found real problems in the past.  By removing these
> trivial issues we can then set up a build farm animal or similar to
> automatically check for any new issues.

We have done exactly this in postgis with 2 different jobs (gcc and clang)
and, even though it doesn't happen too often, it's really satisfying when
it detects these issues automatically.

-- 
Raúl Marín Rodríguez
carto.com




SHOW CREATE

2019-07-05 Thread David Fetter
Folks,

Corey Huinker put together the documentation for this proposed
feature. Does this seem like a reasonable way to do it?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>From b1bbf5884da2ba329418d73e2b993edda8971926 Mon Sep 17 00:00:00 2001
From: coreyhuinker 
Date: Sun, 30 Jun 2019 18:22:14 -0400
Subject: [PATCH v1] first draft of SHOW CREATE
To: hackers
MIME-Version: 1.0
Content-Type: multipart/mixed; boundary="2.21.0"

This is a multi-part message in MIME format.
--2.21.0
Content-Type: text/plain; charset=UTF-8; format=fixed
Content-Transfer-Encoding: 8bit


 create mode 100644 doc/src/sgml/ref/show_create.sgml


--2.21.0
Content-Type: text/x-patch; name="v1-0001-first-draft-of-SHOW-CREATE.patch"
Content-Transfer-Encoding: 8bit
Content-Disposition: attachment; filename="v1-0001-first-draft-of-SHOW-CREATE.patch"

diff --git a/doc/src/sgml/ref/allfiles.sgml b/doc/src/sgml/ref/allfiles.sgml
index 8d91f3529e..c482c57556 100644
--- a/doc/src/sgml/ref/allfiles.sgml
+++ b/doc/src/sgml/ref/allfiles.sgml
@@ -181,6 +181,7 @@ Complete list of usable sgml source files in this directory.
 
 
 
+
 
 
 
diff --git a/doc/src/sgml/ref/show_create.sgml b/doc/src/sgml/ref/show_create.sgml
new file mode 100644
index 00..b49a17f246
--- /dev/null
+++ b/doc/src/sgml/ref/show_create.sgml
@@ -0,0 +1,676 @@
+
+
+
+ 
+  SHOW CREATE
+ 
+
+ 
+  SHOW CREATE
+  7
+  SQL - Language Statements
+ 
+
+ 
+  SHOW CREATE
+  List the SQL statements needed to create a given object
+ 
+
+ 
+  
+SHOW CREATE 
+{
+  DATABASE [ database_name ] |
+  SCHEMA schema_name |
+  AGGREGATE aggregate_name ( aggregate_signature ) |
+  ACCESS METHOD object_name |
+  CAST (source_type AS target_type) |
+  COLLATION object_name |
+  COLUMN relation_name.column_name |
+  CONSTRAINT constraint_name ON table_name |
+  CONSTRAINT constraint_name ON DOMAIN domain_name |
+  CONVERSION object_name |
+  DOMAIN object_name |
+  EXTENSION object_name |
+  EVENT TRIGGER object_name |
+  FOREIGN DATA WRAPPER object_name |
+  FOREIGN TABLE object_name |
+  FUNCTION function_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] |
+  INDEX object_name |
+  LARGE OBJECT large_object_oid |
+  MATERIALIZED VIEW object_name |
+  OPERATOR operator_name (left_type, right_type) |
+  OPERATOR CLASS object_name USING index_method |
+  OPERATOR FAMILY object_name USING index_method |
+  POLICY policy_name ON table_name |
+  [ PROCEDURAL ] LANGUAGE object_name |
+  PROCEDURE procedure_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] |
+  PUBLICATION object_name |
+  ROLE object_name |
+  ROUTINE routine_name [ ( [ [ argmode ] [ argname ] argtype [, ...] ] ) ] |
+  RULE rule_name ON table_name |
+  SEQUENCE object_name |
+  SERVER object_name |
+  STATISTICS object_name |
+  SUBSCRIPTION object_name |
+  TABLE object_name |
+  TABLESPACE object_name |
+  TEXT SEARCH CONFIGURATION object_name |
+  TEXT SEARCH DICTIONARY object_name |
+  TEXT SEARCH PARSER object_name |
+  TEXT SEARCH TEMPLATE object_name |
+  TRANSFORM FOR type_name LANGUAGE lang_name |
+  TRIGGER trigger_name ON table_name |
+  TYPE object_name |
+  VIEW object_name
+} [ [ WITH ] ( option [, ...] ) ]
+
+where option can be one of:
+
+CLEAN { boolean | 'IF EXISTS' }
+COMMENTS [ boolean ]
+CREATE [ boolean ]
+DOLLAR_QUOTING [ boolean ]
+ENCODING 'encoding_name'
+OWNER [ boolean ]
+PRE_DATA_SECTION [ boolean ]
+PRIVILEGES [ boolean ]
+POST_DATA_SECTION [ boolean ]
+PUBLICATIONS [ boolean ]
+QUOTE_ALL [ boolean ]
+SECURITY_LABELS [ boolean ]
+SCHEMAS 'schema_pattern' 
+SCHEMAS_EXCLUDE 'schema_pattern' 
+STRICT_NAMES [ boolean ]
+SUBSCRIPTIONS [ boolean ]
+TABLES 'table_pattern' 
+TABLES_EXCLUDE 'table_pattern' 
+TABLESPACES [ boolean ]
+USE_SET_SESSION_AUTHORIZATION [ boolean ]
+VERBOSE [ boolean ]
+  
+ 
+
+ 
+  Description
+
+  
+SHOW CREATE  will retrieve a set of rows containing a single column, where each row is a SQL statement required to create the object specified by the command.
+  
+ 
+
+ 
+  Parameters
+  
+   
+database_name
+
+ 
+  The name of the database to be dumped. If the database_name
+  is omitted then the current database is assumed.
+ 
+
+   
+   
+schema_name
+
+ 
+  The name of the schema to be dumped.
+ 
+
+   
+   
+object_name
+relation_name.column_name
+aggregate_name
+constraint_name
+function_name
+operator_name
+policy_name
+procedure_name
+routine_name
+rule_name
+trigger_name
+
+ 
+  The name of the object to be dumped. Names of objects can be
+  schema-qualified, if not then the current session's search path is used.
+  Aggregates, collations, conversions, domains, foreign tables, functions,
+  indexes, 

Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-05 Thread Peter Eisentraut
On 2019-07-02 10:45, Julien Rouhaud wrote:
> It just seemed wrong to me to allow a partial processing for something
> that's aimed to prevent corruption.  I'd think that if users are
> knowledgeable enough to only reindex a subset of indexes/tables in
> such cases, they can also discard indexes that don't get affected by a
> collation lib upgrade.  I'm not strongly opposed to supporting it
> though, as there indeed can be valid use cases.

We are moving in this direction.  Thomas Munro has proposed an approach
for tracking collation versions on a per-object level rather than
per-database.  So then we'd need a way to reindex not those indexes
affected by collation but only those affected by collation and not yet
fixed.

One could also imagine a behavior where not-yet-fixed indexes are simply
ignored by the planner.  So the gradual upgrading approach that Tomas
described is absolutely a possibility.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Add parallelism and glibc dependent only options to reindexdb

2019-07-05 Thread Peter Eisentraut
On 2019-07-02 10:30, Julien Rouhaud wrote:
> That's a great idea, and would make the parallelism in reindexdb much
> simpler.  There's however a downside, as users won't have a way to
> benefit from index filtering until they upgrade to this version.  OTOH
> glibc 2.28 is already there, and a hypothetical fancy reindexdb is far
> from being released.

Isn't that also the case for your proposal?  We are not going to release
a new reindexdb before a new REINDEX.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Fix runtime errors from -fsanitize=undefined

2019-07-05 Thread Peter Eisentraut
On 2019-07-05 01:33, Noah Misch wrote:
> I just saw this proposal.  The undefined behavior in question is strictly
> academic.  These changes do remove the need for new users to discover
> -fno-sanitize=nonnull-attribute, but they make the code longer and no clearer.
> Given the variety of code this touches, I expect future commits will
> reintroduce the complained-of usage patterns, prompting yet more commits to
> restore the invariant achieved here.  Hence, I'm -0 for this change.

This sanitizer has found real problems in the past.  By removing these
trivial issues we can then set up a build farm animal or similar to
automatically check for any new issues.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: mcvstats serialization code is still shy of a load

2019-07-05 Thread Tomas Vondra

On Fri, Jul 05, 2019 at 10:36:59AM +0200, Tomas Vondra wrote:

On Fri, Jul 5, 2019, 03:28 Tom Lane  wrote:


Tomas Vondra  writes:
> I was about to push into REL_12_STABLE, when I realized that maybe we
> need to do something about the catversion first. REL_12_STABLE is still
> on 201906161, while master got to 201907041 thanks to commit
> 7b925e12703.  Simply cherry-picking the commits would get us to
> 201907052 in both branches, but that'd be wrong as the catalogs do
> differ. I suppose this is what you meant by "catversion differences to
> cope with".

Yeah, exactly.

My recommendation is to use 201907051 on v12 and 201907052
on master (or whatever is $today for you).  They need to be
different now that the branches' catalog histories have diverged,
and it seems to me that the back branch should be "older".

regards, tom lane



Unfortunately, master is already using both 201907051 and 201907052 (two of
the patches I pushed touched the catalog), so we can't quite do exactly
that. We need to use 201907042 and 201907043 or something preceding 201907041
(which is the extra catversion on master).

At this point there's no perfect sequence, thanks to the extra commit on
master, so REL_12_STABLE can't be exactly "older" :-(

Barring objections, I'll go ahead with 201907042+201907043 later today,
before someone pushes another catversion-bumping patch.



I've pushed the REL_12_STABLE backpatches too, now. I've ended up using
201907031 and 201907032 - those values precede the first catversion bump
in master (201907041), so the back branch looks "older". And there's a
bit of slack for additional bumps (in the unlikely case we need them).

We might have "fixed" this by backpatching the commit with the extra
catversion bump (7b925e12) but the commit seems a bit too large for
that. It's fairly isolated though. But it seems like a bad practice.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Inconsistency between attname of index and attname of relation

2019-07-05 Thread Euler Taveira
Em sex, 5 de jul de 2019 às 07:37, Ronan Dunklau
 escreveu:

> We ran into that while using wal2json, which uses the replication id index 
> attnames to identify which columns are part of the primary key. If the 
> primary key column has been renamed, we end up with no information about the 
> identity of the tuple being updated / deleted.
>
Ouch. That's a wal2json bug. I saw that you already opened an issue.


-- 
   Euler Taveira   Timbira -
http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento




Re: mcv compiler warning

2019-07-05 Thread Tomas Vondra

On Fri, Jul 05, 2019 at 10:13:25AM -0400, Jeff Janes wrote:

One of the recent mcv commits introduced an unused variable warning.

mcv.c: In function 'statext_mcv_serialize':
mcv.c:914:7: warning: unused variable 'itemlen' [-Wunused-variable]
  int   itemlen = ITEM_SIZE(dim);

The attached fixes it.



Thanks.

I think I'll just get rid of the variable entirely, and will just call
the macro from the assert directly. The variable used to be referenced
on multiple places, but that changed during the serialization code
reworks.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Inconsistency between attname of index and attname of relation

2019-07-05 Thread Ronan Dunklau
Thank you for this quick answer, I'll report the bug to wal2json then.

Le ven. 5 juil. 2019 à 16:22, Tom Lane  a écrit :

> Ronan Dunklau  writes:
> > I've noticed that renaming an indexed column produces inconsistencies in
> > the catalog. Namely, the attname of the attribute of the relation is
> > properly updated, whereas the attname of the attribute in the index is
> not,
> > and keeps the old value.
>
> If memory serves, we used to try to rename index columns, and gave up
> on that because it caused problems of its own.  That's (one reason) why
> modern versions of psql show a "definition" column in \d of an index.
>
> > I think this could be considered a bug in Postgres.
>
> It is not.
>
> > If it isn't, what
> > should be the proper way to retrieve this information ?
>
> psql uses pg_get_indexdef(), looks like.
>
> regards, tom lane
>

-- 






Re: [PATCH] Implement uuid_version()

2019-07-05 Thread Tom Lane
Alvaro Herrera  writes:
> On 2019-Jul-05, Peter Eisentraut wrote:
>> (There is also precedent for redirecting the extension function to the
>> internal one by changing the SQL-level function definition using CREATE
>> OR REPLACE FUNCTION ... LANGUAGE INTERNAL.  But that seems more
>> complicated and would require a new extension version.

> One issue with this approach is that it forces the internal function to
> remain unchanged forever.  That seems OK in this particular case.

No, what it's establishing is that the extension and core functions
will do the same thing forevermore.  Seems to me that's what we want
here.
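For readers following along, that precedent amounts to a sketch like the
following, where the function and symbol names are placeholders rather than
the actual uuid-ossp objects:

CREATE OR REPLACE FUNCTION some_extension_func(uuid) RETURNS int4
    AS 'some_builtin_function' LANGUAGE internal IMMUTABLE STRICT;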

>> It could maybe be included if the extension version is changed for
>> other reasons.)

> Maybe add a comment in the control file (?) so that we remember to do it
> then.

I'm not terribly excited about that --- we'd still need to keep the
C function redirection in place in the .so file, for benefit of
people who hadn't done ALTER EXTENSION UPGRADE.

regards, tom lane




Re: Inconsistency between attname of index and attname of relation

2019-07-05 Thread Tom Lane
Ronan Dunklau  writes:
> I've noticed that renaming an indexed column produces inconsistencies in
> the catalog. Namely, the attname of the attribute of the relation is
> properly updated, whereas the attname of the attribute in the index is not,
> and keeps the old value.

If memory serves, we used to try to rename index columns, and gave up
on that because it caused problems of its own.  That's (one reason) why
modern versions of psql show a "definition" column in \d of an index.

> I think this could be considered a bug in Postgres.

It is not.

> If it isn't, what
> should be the proper way to retrieve this information ?

psql uses pg_get_indexdef(), looks like.
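For example (the index name is a placeholder):

SELECT pg_get_indexdef('my_index'::regclass);
-- or just one index column's definition:
SELECT pg_get_indexdef('my_index'::regclass, 1, true);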

regards, tom lane




Re: [PATCH] Implement uuid_version()

2019-07-05 Thread Alvaro Herrera
On 2019-Jul-05, Peter Eisentraut wrote:

> (There is also precedent for redirecting the extension function to the
> internal one by changing the SQL-level function definition using CREATE
> OR REPLACE FUNCTION ... LANGUAGE INTERNAL.  But that seems more
> complicated and would require a new extension version.

One issue with this approach is that it forces the internal function to
remain unchanged forever.  That seems OK in this particular case.

> It could maybe be included if the extension version is changed for
> other reasons.)

Maybe add a comment in the control file (?) so that we remember to do it
then.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Why vacuum_index_cleanup is needed for TOAST relations?

2019-07-05 Thread Tomas Vondra

On Fri, Jul 05, 2019 at 03:28:26PM +0300, Nikolay Shaplov wrote:

I am importing recent changes into my reloption patch and came to a question
I did not find an answer to...

vacuum_index_cleanup option exists for both heap and toast relations.

As I understand from the documentation, index cleanup is about reporting to the
access method that some tuples in the table that were indexed are dead and should
be cleaned.

And as far as I get, we do not index any TOAST tuples directly. They are
obtained by getting relation tuple, and then deTOAST it.

So I do not understand why we need vacuum_index_cleanup for TOAST tables.
Maybe we should remove it from there?

Or if I am wrong, can you explain where it is needed?



I'm not sure I understand your question / suggestion correctly, but
each TOAST table certainly has an index on (chunk_id, chunk_seq) - in
fact it's a unique index backing a primary key.
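You can see that from SQL, e.g. (the table name is just an example):

SELECT c.reltoastrelid::regclass AS toast_table,
       i.indexrelid::regclass    AS toast_index
FROM pg_class c
JOIN pg_index i ON i.indrelid = c.reltoastrelid
WHERE c.relname = 't';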

It's not clear to me what you mean by "index any TOAST tuples directly"
or "getting relation tuple", perhaps you could explain.

IMHO it's correct to have vacuum_index_cleanup even for TOAST tables.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





mcv compiler warning

2019-07-05 Thread Jeff Janes
One of the recent mcv commits introduced an unused variable warning.

mcv.c: In function 'statext_mcv_serialize':
mcv.c:914:7: warning: unused variable 'itemlen' [-Wunused-variable]
   int   itemlen = ITEM_SIZE(dim);

The attached fixes it.

Cheers,

Jeff


mcv_assert_warning.patch
Description: Binary data


Re: POC: Cleaning up orphaned files using undo logs

2019-07-05 Thread Robert Haas
On Tue, Jun 25, 2019 at 4:00 PM Amit Kapila  wrote:
> Fair enough.  I have implemented it based on next_retry_at and use
> constant time 10s for the next retry.  I have used define instead of a
> GUC as all the other constants for similar things are defined as of
> now.  One thing to note is that we want the linger time (defined as
> UNDO_WORKER_LINGER_MS) for a undo worker to be more than failure retry
> time (defined as UNDO_FAILURE_RETRY_DELAY_MS) as, otherwise, the undo
> worker can exit before retrying the failed requests.

Uh, I think we want exactly the opposite.  We want the workers to exit
before retrying, so that there's a chance for other databases to get
processed, I think.  Am I confused?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: using explicit_bzero

2019-07-05 Thread Peter Eisentraut
On 2019-07-05 14:06, Thomas Munro wrote:
> +#ifndef HAVE_EXPLICIT_BZERO
> +#define explicit_bzero(b, len) memset(b, 0, len)
> +#endif
> 
> I noticed some other libraries use memset through a function pointer
> or at least define a function the compiler can't see.

I don't understand what you are getting at here.

> The ssl tests fail:
> 
> FATAL:  could not load private key file "server-password.key": bad decrypt
> 
> That's apparently due to the passphrase being clobbered in the output
> buffer before we've managed to use it:
> 
> @@ -118,6 +118,7 @@ run_ssl_passphrase_command(const char *prompt,
> bool is_server_start, char *buf,
> buf[--len] = '\0';
> 
>  error:
> +   explicit_bzero(buf, size);
> pfree(command.data);
> return len;
>  }

Yeah, that's a silly mistake.  New patch attached.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 2188d8455f207d378e46390ac4db59c201574229 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 5 Jul 2019 15:02:11 +0200
Subject: [PATCH v3] Use explicit_bzero

Discussion: 
https://www.postgresql.org/message-id/flat/42d26bde-5d5b-c90d-87ae-6cab875f73be%402ndquadrant.com
---
 configure| 2 +-
 configure.in | 1 +
 src/backend/libpq/be-secure-common.c | 3 +++
 src/include/pg_config.h.in   | 3 +++
 src/include/port.h   | 4 
 src/interfaces/libpq/fe-connect.c| 8 
 6 files changed, 20 insertions(+), 1 deletion(-)

diff --git a/configure b/configure
index 7a6bfc2339..dcdd362c67 100755
--- a/configure
+++ b/configure
@@ -15143,7 +15143,7 @@ fi
 LIBS_including_readline="$LIBS"
 LIBS=`echo "$LIBS" | sed -e 's/-ledit//g' -e 's/-lreadline//g'`
 
-for ac_func in cbrt clock_gettime copyfile fdatasync getifaddrs getpeerucred 
getrlimit mbstowcs_l memmove poll posix_fallocate ppoll pstat 
pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open 
strchrnul strsignal symlink sync_file_range uselocale utime utimes wcstombs_l
+for ac_func in cbrt clock_gettime copyfile explicit_bzero fdatasync getifaddrs 
getpeerucred getrlimit mbstowcs_l memmove poll posix_fallocate ppoll pstat 
pthread_is_threaded_np readlink setproctitle setproctitle_fast setsid shm_open 
strchrnul strsignal symlink sync_file_range uselocale utime utimes wcstombs_l
 do :
   as_ac_var=`$as_echo "ac_cv_func_$ac_func" | $as_tr_sh`
 ac_fn_c_check_func "$LINENO" "$ac_func" "$as_ac_var"
diff --git a/configure.in b/configure.in
index dde3eec89f..57c82b0b56 100644
--- a/configure.in
+++ b/configure.in
@@ -1591,6 +1591,7 @@ AC_CHECK_FUNCS(m4_normalize([
cbrt
clock_gettime
copyfile
+   explicit_bzero
fdatasync
getifaddrs
getpeerucred
diff --git a/src/backend/libpq/be-secure-common.c 
b/src/backend/libpq/be-secure-common.c
index 877226d377..f2deba4243 100644
--- a/src/backend/libpq/be-secure-common.c
+++ b/src/backend/libpq/be-secure-common.c
@@ -86,6 +86,7 @@ run_ssl_passphrase_command(const char *prompt, bool 
is_server_start, char *buf,
{
if (ferror(fh))
{
+   explicit_bzero(buf, size);
ereport(loglevel,
(errcode_for_file_access(),
 errmsg("could not read from command 
\"%s\": %m",
@@ -97,6 +98,7 @@ run_ssl_passphrase_command(const char *prompt, bool 
is_server_start, char *buf,
pclose_rc = ClosePipeStream(fh);
if (pclose_rc == -1)
{
+   explicit_bzero(buf, size);
ereport(loglevel,
(errcode_for_file_access(),
 errmsg("could not close pipe to external 
command: %m")));
@@ -104,6 +106,7 @@ run_ssl_passphrase_command(const char *prompt, bool 
is_server_start, char *buf,
}
else if (pclose_rc != 0)
{
+   explicit_bzero(buf, size);
ereport(loglevel,
(errcode_for_file_access(),
 errmsg("command \"%s\" failed",
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index 512213aa32..524873ba44 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -201,6 +201,9 @@
 /* Define to 1 if you have the  header file. */
 #undef HAVE_EDITLINE_READLINE_H
 
+/* Define to 1 if you have the `explicit_bzero' function. */
+#undef HAVE_EXPLICIT_BZERO
+
 /* Define to 1 if you have the `fdatasync' function. */
 #undef HAVE_FDATASYNC
 
diff --git a/src/include/port.h b/src/include/port.h
index b5c03d912b..7c8b5138ba 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -381,6 +381,10 @@ extern int isinf(double x);
 #endif /* __clang__ && 
!__cplusplus */
 #endif 

Re: Introduce MIN/MAX aggregate functions to pg_lsn

2019-07-05 Thread Fabrízio de Royes Mello
On Fri, Jul 5, 2019 at 12:22 AM Michael Paquier  wrote:
>
> On Thu, Jul 04, 2019 at 01:48:24PM -0300, Fabrízio de Royes Mello wrote:
> > On Thu, Jul 4, 2019 at 10:57 AM Robert Haas 
wrote:
> >> It would be pretty silly to have one and not the other, regardless of
> >> whether we can think of an immediate use case.
> >
> > +1
>
> OK, applied with a catalog version bump.  This is cool to have.
>

Awesome... thanks.

Att,

--
   Fabrízio de Royes Mello Timbira - http://www.timbira.com.br/
   PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento


Re: make installcheck-world in a clean environment

2019-07-05 Thread Alexander Lakhin
Hello Thomas,

01.07.2019 13:47, Thomas Munro wrote:
>
> A new CF is here and this is in "Needs Review".  Would you like to
> provide a rebased patch, or should it really be withdrawn?
The rebased patch is attached, but I still can't find anyone interested
in reviewing it.
So let's withdraw it.
With hope for better days,
Alexander

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7b68213266..bc34d1a80d 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -111,7 +111,12 @@ make installcheck-parallel
default port number, unless directed otherwise by PGHOST and
PGPORT environment variables.  The tests will be run in a
database named regression; any existing database by this name
-   will be dropped.
+   will be dropped.  The tests in the installcheck mode will use the executables,
+   libraries, and headers, which are already present in the installation tree,
+   not their copies in the source tree (if any).  This mode is suitable for
+   testing distribution packages of PostgreSQL installed on user systems, as
+   it allows checking the installation in its entirety without rebuilding
+   any parts of it.
   
 
   
diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 321af38b0c..dd27d669c0 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -244,14 +244,16 @@ PG_SYSROOT = @PG_SYSROOT@
 
 override CPPFLAGS := $(ICU_CFLAGS) $(CPPFLAGS)
 
-ifdef PGXS
+# For PGXS and in the installcheck mode (when we use the installed assets)
+# we need to target already installed headers and libraries
+ifneq (,$(PGXS)$(USE_INSTALLED_ASSETS))
 override CPPFLAGS := -I$(includedir_server) -I$(includedir_internal) $(CPPFLAGS)
-else # not PGXS
+else # not PGXS and not USE_INSTALLED_ASSETS
 override CPPFLAGS := -I$(top_srcdir)/src/include $(CPPFLAGS)
 ifdef VPATH
 override CPPFLAGS := -I$(top_builddir)/src/include $(CPPFLAGS)
 endif
-endif # not PGXS
+endif # not PGXS and USE_INSTALLED_ASSETS
 
 CC = @CC@
 GCC = @GCC@
@@ -306,7 +308,7 @@ with_gnu_ld = @with_gnu_ld@
 # "LDFLAGS := something" anywhere, ditto for LDFLAGS_INTERNAL.
 # These initial assignments must be "=" type, and elsewhere we must only do
 # "LDFLAGS += something" or "LDFLAGS_INTERNAL += something".
-ifdef PGXS
+ifneq (,$(PGXS)$(USE_INSTALLED_ASSETS))
   LDFLAGS_INTERNAL = -L$(libdir)
 else
   LDFLAGS_INTERNAL = -L$(top_builddir)/src/port -L$(top_builddir)/src/common
@@ -379,6 +381,9 @@ endif
 # install tree just once in any recursive "make check".  The additional test
 # on abs_top_builddir prevents doing anything foolish to the root directory.
 
+# In the installcheck mode we should use what is already installed
+# in the DESTDIR (namely, pg_regress).
+
 check: temp-install
 
 .PHONY: temp-install
@@ -430,7 +435,7 @@ ifeq ($(enable_tap_tests),yes)
 define prove_installcheck
 rm -rf '$(CURDIR)'/tmp_check
 $(MKDIR_P) '$(CURDIR)'/tmp_check
-cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' PG_REGRESS='$(CURDIR)/$(top_builddir)/src/test/regress/pg_regress' REGRESS_SHLIB='$(abs_top_builddir)/src/test/regress/regress$(DLSUFFIX)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
+cd $(srcdir) && TESTDIR='$(CURDIR)' PATH="$(bindir):$$PATH" PGPORT='6$(DEF_PGPORT)' top_builddir='$(CURDIR)/$(top_builddir)' PG_REGRESS='$(DESTDIR)$(pgxsdir)/src/test/regress/pg_regress' REGRESS_SHLIB='$(DESTDIR)$(pgxsdir)/src/test/regress/regress$(DLSUFFIX)' $(PROVE) $(PG_PROVE_FLAGS) $(PROVE_FLAGS) $(if $(PROVE_TESTS),$(PROVE_TESTS),t/*.pl)
 endef
 
 define prove_check
@@ -574,6 +579,8 @@ endif
 #
 # Commonly used submake targets
 
+# build these libraries only when we are not going to use the installed ones
+ifndef USE_INSTALLED_ASSETS
 submake-libpq: | submake-generated-headers
 	$(MAKE) -C $(libpq_builddir) all
 
@@ -585,6 +592,7 @@ submake-libpgfeutils: | submake-generated-headers
 	$(MAKE) -C $(top_builddir)/src/port all
 	$(MAKE) -C $(top_builddir)/src/common all
 	$(MAKE) -C $(top_builddir)/src/fe_utils all
+endif
 
 .PHONY: submake-libpq submake-libpgport submake-libpgfeutils
 
@@ -636,7 +644,7 @@ pg_regress_check = \
 $(TEMP_CONF) \
 $(pg_regress_locale_flags) $(EXTRA_REGRESS_OPTS)
 pg_regress_installcheck = \
-$(top_builddir)/src/test/regress/pg_regress \
+   '$(DESTDIR)$(pgxsdir)/src/test/regress/pg_regress' \
 --inputdir=$(srcdir) \
 --bindir='$(bindir)' \
 $(pg_regress_locale_flags) $(EXTRA_REGRESS_OPTS)
diff --git a/src/bin/pg_upgrade/test.sh b/src/bin/pg_upgrade/test.sh
index 78820247b3..1a595e50dc 100644
--- a/src/bin/pg_upgrade/test.sh
+++ b/src/bin/pg_upgrade/test.sh
@@ -163,7 +163,7 @@ createdb "regression$dbname1" || createdb_status=$?
 createdb "regression$dbname2" || createdb_status=$?
 createdb "regression$dbname3" || createdb_status=$?
 
-if "$MAKE" -C "$oldsrc" installcheck-parallel; then
+if "$MAKE" -C "$oldsrc" installcheck-parallel 

Why vacuum_index_cleanup is needed for TOAST relations?

2019-07-05 Thread Nikolay Shaplov
I am importing recent changes into my reloptions patch and came across a question
I could not find an answer to...

vacuum_index_cleanup option exists for both heap and toast relations.

As I understand from the documentation, index cleanup is about reporting to the
access method that some tuples in the table that were indexed are dead and
should be cleaned up.

And as far as I understand, we do not index any TOAST tuples directly. They are 
obtained by fetching the relation tuple and then deTOASTing it.

So I do not understand why we need vacuum_index_cleanup for TOAST tables. 
Maybe we should remove it from there?

Or if I am wrong, can you explain where it is needed?


-- 
Software Developer: https://www.upwork.com/freelancers/~014a87e140ff02c0da
Body-oriented Therapist: https://vk.com/nataraj_rebalancing  (Russian)




Re: Add client connection check during the execution of the query

2019-07-05 Thread Stas Kelvich



> On 5 Jul 2019, at 11:46, Thomas Munro  wrote:
> 
> On Fri, Jul 5, 2019 at 6:28 PM Tatsuo Ishii  wrote:
>>> The purpose of this patch is to stop the execution of continuous
>>> requests in case of a disconnection from the client.
>> 
>> Pgpool-II already does this by sending a parameter status message to
>> the client. It is expected that clients are always prepared to receive
>> the parameter status message. This way I believe we could reliably
>> detect that the connection to the client is broken or not.
> 
> Hmm.  If you send a message, it's basically application-level
> keepalive.  But it's a lot harder to be sure that the protocol and
> socket are in the right state to insert a message at every possible
> CHECK_FOR_INTERRUPT() location.  Sergey's proposal of recv(MSG_PEEK)
> doesn't require any knowledge of the protocol at all, though it
> probably does need TCP keepalive to be configured to be useful for
> remote connections.


Well, indeed, in the case of a cable disconnect the only way to detect it with
the proposed approach is to have TCP keepalive configured. However, if the
disconnection happens because the client application shut down, then the client
OS should itself properly close that connection, and therefore this patch will
detect such a situation even without keepalives configured.
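
To illustrate what is being discussed (a minimal sketch, not the code from
Sergey's patch; it assumes a POSIX-style socket API plus the common
MSG_DONTWAIT flag, which is redundant if the socket is already non-blocking),
the peek-based liveness check amounts to something like this:

#include <errno.h>
#include <stdbool.h>
#include <sys/socket.h>
#include <sys/types.h>

/*
 * Peek at the client socket without consuming any data.  recv() returning 0
 * means the peer performed an orderly shutdown; -1 with EAGAIN/EWOULDBLOCK
 * means there is simply nothing to read yet; a positive result means unread
 * data is pending.
 */
static bool
client_connection_seems_alive(int sock)
{
    char    c;
    ssize_t r;

    r = recv(sock, &c, 1, MSG_PEEK | MSG_DONTWAIT);
    if (r == 0)
        return false;           /* peer closed the connection */
    if (r < 0)
        return (errno == EAGAIN || errno == EWOULDBLOCK || errno == EINTR);
    return true;                /* a byte is waiting to be read */
}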

--
Stas Kelvich
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company






Re: using explicit_bzero

2019-07-05 Thread Thomas Munro
On Mon, Jun 24, 2019 at 7:57 AM Peter Eisentraut
 wrote:
> On 2019-06-23 21:55, Peter Eisentraut wrote:
> > On 2019-06-21 15:25, Tom Lane wrote:
> >> years ago (067a5cdb3).  Please use memset() for the substitute instead.
> >
> > OK, done.

+#ifndef HAVE_EXPLICIT_BZERO
+#define explicit_bzero(b, len) memset(b, 0, len)
+#endif

I noticed some other libraries use memset through a function pointer
or at least define a function the compiler can't see.
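
For illustration, a sketch of that technique (not a proposal for this patch;
the volatile function pointer is what keeps the compiler from proving the
buffer dead and eliding the final memset):

#include <string.h>

/*
 * memset-based explicit_bzero substitute.  Because the call goes through a
 * volatile function pointer, the compiler cannot assume it knows what the
 * callee does and therefore cannot optimize the zeroing away.
 */
static void *(*volatile bzero_via_memset) (void *, int, size_t) = memset;

static void
explicit_bzero_fallback(void *buf, size_t len)
{
    bzero_via_memset(buf, 0, len);
}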

> and with patch attached

The ssl tests fail:

FATAL:  could not load private key file "server-password.key": bad decrypt

That's apparently due to the passphrase being clobbered in the output
buffer before we've managed to use it:

@@ -118,6 +118,7 @@ run_ssl_passphrase_command(const char *prompt,
bool is_server_start, char *buf,
buf[--len] = '\0';

 error:
+   explicit_bzero(buf, size);
pfree(command.data);
return len;
 }

-- 
Thomas Munro
https://enterprisedb.com




Re: Proposal to add GUC_REPORT to lc_monetary, lc_numeric and search_path

2019-07-05 Thread Shay Rojansky
> The latter is important for similar reasons. JDBC caches prepared
statements internally and if the user changes the search path without using
setSchema or uses a function to change it then internally it would be
necessary to invalidate the cache. Currently if this occurs these
statements fail.

While Npgsql specifically doesn't care about any locale/formatting (being a
binary-only driver), knowing about search_path changes would benefit Npgsql
in the same way as it would JDBC.

> This seems like a rather innocuous change as the protocol is not changed,
rather the amount of information returned on startup is increased
marginally.

Although these specific parameters are easy to add, we could also
think of a more generic way for clients to subscribe to parameter updates
(am not sure if this was previously discussed - I cannot see anything
obvious in the wiki TODO page). At its simplest, this could be a new
parameter containing a comma-separated list of parameters for which
asynchronous updates should be sent.  This new parameter would default to
the current hard-coded list (as documented in
https://www.postgresql.org/docs/current/protocol-flow.html#PROTOCOL-ASYNC).
Unless I'm mistaken, one issue (as in general with new parameters) is that
drivers wouldn't be able to send this new parameter in the startup packet
because they don't yet know whether they're talking to a PostgreSQL version
which supports it.
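
To sketch the driver side (illustration only; it assumes search_path actually
becomes a reported parameter as proposed, since today PQparameterStatus()
returns NULL for it, and drop_cached_prepared_statements() stands in for
whatever cache the driver keeps):

#include <string.h>
#include <libpq-fe.h>

static char last_search_path[1024];

/*
 * Called after each command: if the server has reported a new search_path,
 * drop the locally cached prepared statements.
 */
static void
maybe_invalidate_statement_cache(PGconn *conn)
{
    const char *sp = PQparameterStatus(conn, "search_path");

    if (sp == NULL)
        return;                 /* parameter not reported by this server */

    if (strcmp(sp, last_search_path) != 0)
    {
        /* drop_cached_prepared_statements();  -- hypothetical */
        strncpy(last_search_path, sp, sizeof(last_search_path) - 1);
    }
}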


Re: POC: Cleaning up orphaned files using undo logs

2019-07-05 Thread Amit Kapila
On Fri, Jul 5, 2019 at 5:20 PM Amit Kapila  wrote:
>
> On Thu, Jul 4, 2019 at 5:23 PM Amit Kapila  wrote:
> >
> > On Mon, Jul 1, 2019 at 1:24 PM Thomas Munro  wrote:
> > >
> > > Another small change/review: the function UndoLogGetNextInsertPtr()
> > > previously took a transaction ID, but I'm not sure if that made sense,
> > > I need to think about it some more.
> > >
> >
> > The changes you have made related to UndoLogGetNextInsertPtr() doesn't
> > seem correct to me.
> >
> > @@ -854,7 +854,9 @@ FindUndoEndLocationAndSize(UndoRecPtr start_urecptr,
> >   * has already started in this log then lets re-fetch the undo
> >   * record.
> >   */
> > - next_insert = UndoLogGetNextInsertPtr(slot->logno, uur->uur_xid);
> > + next_insert = UndoLogGetNextInsertPtr(slot->logno);
> > +
> > + /* TODO this can't happen */
> >   if (!UndoRecPtrIsValid(next_insert))
> >
> > I think this is a possible case.  Say while the discard worker tries
> > to register the rollback request from some log and after it fetches
> > the undo record corresponding to start location in this function,
> > another backend adds the new transaction undo.  The same is mentioned
> > in comments as well.   Can you explain what makes you think that this
> > can't happen?  If we don't want to pass the xid to
> > UndoLogGetNextInsertPtr, then I think we need to get the insert
> > location before fetching the record.  I will think more on it to see
> > if there is any other problem with the same.
> >
>
> Pushed the fixed on above lines in the undoprocessing branch.
>

Just in case anyone wants to look at the undoprocessing branch, it is
available at https://github.com/EnterpriseDB/zheap/tree/undoprocessing

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: POC: Cleaning up orphaned files using undo logs

2019-07-05 Thread Amit Kapila
On Thu, Jul 4, 2019 at 5:23 PM Amit Kapila  wrote:
>
> On Mon, Jul 1, 2019 at 1:24 PM Thomas Munro  wrote:
> >
> > Another small change/review: the function UndoLogGetNextInsertPtr()
> > previously took a transaction ID, but I'm not sure if that made sense,
> > I need to think about it some more.
> >
>
> The changes you have made related to UndoLogGetNextInsertPtr() doesn't
> seem correct to me.
>
> @@ -854,7 +854,9 @@ FindUndoEndLocationAndSize(UndoRecPtr start_urecptr,
>   * has already started in this log then lets re-fetch the undo
>   * record.
>   */
> - next_insert = UndoLogGetNextInsertPtr(slot->logno, uur->uur_xid);
> + next_insert = UndoLogGetNextInsertPtr(slot->logno);
> +
> + /* TODO this can't happen */
>   if (!UndoRecPtrIsValid(next_insert))
>
> I think this is a possible case.  Say while the discard worker tries
> to register the rollback request from some log and after it fetches
> the undo record corresponding to start location in this function,
> another backend adds the new transaction undo.  The same is mentioned
> in comments as well.   Can you explain what makes you think that this
> can't happen?  If we don't want to pass the xid to
> UndoLogGetNextInsertPtr, then I think we need to get the insert
> location before fetching the record.  I will think more on it to see
> if there is any other problem with the same.
>

Pushed the fixed on above lines in the undoprocessing branch.  It will
be available in the next set of patches we post.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Implement uuid_version()

2019-07-05 Thread Jose Luis Tallon

On 5/7/19 11:00, Peter Eisentraut wrote:
> On 2019-07-05 00:08, Jose Luis Tallon wrote:
>> On 4/7/19 17:30, Alvaro Herrera wrote:
>>> On 2019-Jul-04, Tom Lane wrote:
>>>> A possible option 3 is to keep the function in pgcrypto but change
>>>> its C code to call the core code.
>
> Updated patch with this change included.

Great, thanks!




Re: how to run encoding-dependent tests by default

2019-07-05 Thread Peter Eisentraut
On 2019-06-23 21:44, Peter Eisentraut wrote:
> On 2019-06-17 18:39, Andres Freund wrote:
>> Basically something like:
>>
>> \gset SELECT my_encodings_are_compatible() AS compatible
>> \if :compatible
>> test;
>> contents;
>> \endif
> 
> Cool, that works out quite well.  See attached patch.  I flipped the
> logic around to make it \quit if not compatible.  That way the
> alternative expected file is shorter and doesn't need to be updated all
> the time.  But it gets the job done either way.

Small patch update: The collate.linux.utf8 test also needs to check in a
similar manner that all the locales it is using are installed.  This
should get the cfbot run passing.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 13de608315d495882d208012c9c55bae91dc75e6 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 5 Jul 2019 11:30:11 +
Subject: [PATCH v2] Run UTF8-requiring collation tests by default

The tests collate.icu.utf8 and collate.linux.utf8 were previously only
run when explicitly selected via EXTRA_TESTS.  They require a UTF8
database, because the error messages in the expected files refer to
that, and they use some non-ASCII characters in the tests.  Since
users can select any locale and encoding for the regression test run,
it was not possible to include these tests automatically.

To fix, use psql's \if facility to check the server encoding and quit
the tests at the very beginning if a server encoding other than UTF8
is set.  We then need to maintain alternative expected files for these
tests, but they are very tiny and never need to change after this.

These two tests are now run automatically as part of the regression
tests, given an appropriate build environment (Linux or ICU enabled,
respectively).

Discussion: 
https://www.postgresql.org/message-id/flat/052295c2-a2e1-9a21-bd36-8fbff8686cf3%402ndquadrant.com
---
 doc/src/sgml/regress.sgml  |  8 
 src/test/regress/GNUmakefile   |  8 
 src/test/regress/expected/collate.icu.utf8.out |  5 +
 src/test/regress/expected/collate.icu.utf8_1.out   |  7 +++
 src/test/regress/expected/collate.linux.utf8.out   |  6 ++
 src/test/regress/expected/collate.linux.utf8_1.out | 10 ++
 src/test/regress/sql/collate.icu.utf8.sql  |  6 ++
 src/test/regress/sql/collate.linux.utf8.sql|  7 +++
 8 files changed, 49 insertions(+), 8 deletions(-)
 create mode 100644 src/test/regress/expected/collate.icu.utf8_1.out
 create mode 100644 src/test/regress/expected/collate.linux.utf8_1.out

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 7b68213266..d98187c970 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -363,14 +363,6 @@ Extra Tests
 
 make check EXTRA_TESTS=numeric_big
 
-To run the collation tests:
-
-make check EXTRA_TESTS='collate.linux.utf8 collate.icu.utf8' LANG=en_US.utf8
-
-The collate.linux.utf8 test works only on Linux/glibc
-platforms.  The collate.icu.utf8 test only works when
-support for ICU was built.  Both tests will only succeed when run in a
-database that uses UTF-8 encoding.

   
 
diff --git a/src/test/regress/GNUmakefile b/src/test/regress/GNUmakefile
index a24cfd4e01..80d515003c 100644
--- a/src/test/regress/GNUmakefile
+++ b/src/test/regress/GNUmakefile
@@ -128,6 +128,14 @@ tablespace-setup:
 
 REGRESS_OPTS = --dlpath=. --max-concurrent-tests=20 $(EXTRA_REGRESS_OPTS)
 
+ifeq ($(PORTNAME),linux)
+EXTRA_TESTS += collate.linux.utf8
+endif
+
+ifeq ($(with_icu),yes)
+EXTRA_TESTS += collate.icu.utf8
+endif
+
 check: all tablespace-setup
$(pg_regress_check) $(REGRESS_OPTS) 
--schedule=$(srcdir)/parallel_schedule $(MAXCONNOPT) $(EXTRA_TESTS)
 
diff --git a/src/test/regress/expected/collate.icu.utf8.out 
b/src/test/regress/expected/collate.icu.utf8.out
index 01bd9fb5dd..addbc20f6f 100644
--- a/src/test/regress/expected/collate.icu.utf8.out
+++ b/src/test/regress/expected/collate.icu.utf8.out
@@ -1,6 +1,11 @@
 /*
  * This test is for ICU collations.
  */
+/* skip test if not UTF8 server encoding */
+SELECT getdatabaseencoding() <> 'UTF8' AS server_encoding_incompatible \gset
+\if :server_encoding_incompatible
+\quit
+\endif
 SET client_encoding TO UTF8;
 CREATE SCHEMA collate_tests;
 SET search_path = collate_tests;
diff --git a/src/test/regress/expected/collate.icu.utf8_1.out 
b/src/test/regress/expected/collate.icu.utf8_1.out
new file mode 100644
index 00..2afd90f34f
--- /dev/null
+++ b/src/test/regress/expected/collate.icu.utf8_1.out
@@ -0,0 +1,7 @@
+/*
+ * This test is for ICU collations.
+ */
+/* skip test if not UTF8 server encoding */
+SELECT getdatabaseencoding() <> 'UTF8' AS server_encoding_incompatible \gset
+\if :server_encoding_incompatible
+\quit
diff --git a/src/test/regress/expected/collate.linux.utf8.out 
b/src/test/regress/expected/collate.linux.utf8.out

Re: range_agg

2019-07-05 Thread Pavel Stehule
On Thu, Jul 4, 2019 at 8:34 PM, Alvaro Herrera wrote:

> I noticed that this patch has a // comment about it segfaulting.  Did
> you ever figure that out?  Is the resulting code the one you intend as
> final?
>
> Did you make any inroads regarding Jeff Davis' suggestion about
> implementing "multiranges"?  I wonder if that's going to end up being a
> Pandora's box.


Introducing a new datatype might be better, because we can ensure that the data
are correctly ordered.


> Stylistically, the code does not match pgindent's choices very closely.
> I can return a diff to a pgindented version of your v0002 for your
> perusal, if it would be useful for you to learn its style.  (A committer
> would eventually run pgindent himself[1], but it's good to have
> submissions be at least close to the final form.)  BTW, I suggest "git
> format-patch -vN" to prepare files for submission.
>

The first issue is unstable regress tests - there is a problem with
opr_sanity

SELECT p1.oid, p1.proname, p2.oid, p2.proname
FROM pg_proc AS p1, pg_proc AS p2
WHERE p1.oid < p2.oid AND
p1.prosrc = p2.prosrc AND
p1.prolang = 12 AND p2.prolang = 12 AND
(p1.prokind != 'a' OR p2.prokind != 'a') AND
(p1.prolang != p2.prolang OR
 p1.prokind != p2.prokind OR
 p1.prosecdef != p2.prosecdef OR
 p1.proleakproof != p2.proleakproof OR
 p1.proisstrict != p2.proisstrict OR
 p1.proretset != p2.proretset OR
 p1.provolatile != p2.provolatile OR
 p1.pronargs != p2.pronargs)
ORDER BY p1.oid, p2.oid; -- requires explicit ORDER BY now

+   rangeTypeId = get_fn_expr_argtype(fcinfo->flinfo, 1);
+   if (!type_is_range(rangeTypeId))
+   {
+   ereport(ERROR, (errmsg("range_agg must be called with a range")));
+   }

???

+   r1Str = "lastRange";
+   r2Str = "currentRange";
+   // TODO: Why is this segfaulting?:
+   // r1Str =
DatumGetCString(DirectFunctionCall1(range_out,
RangeTypePGetDatum(lastRange)));
+   // r2Str =
DatumGetCString(DirectFunctionCall1(range_out,
RangeTypePGetDatum(currentRange)));
+   ereport(ERROR, (errmsg("range_agg: gap detected between
%s and %s", r1Str, r2Str)));
+   }

???

The patch doesn't respect Postgres formatting

+# Making 2- and 3-param range_agg polymorphic is difficult
+# because it would take an anyrange and return an anyrange[],
+# which doesn't exist.
+# As a workaround we define separate functions for each built-in range
type.
+# This is what causes the mess in src/test/regress/expected/opr_sanity.out.
+{ oid => '8003', descr => 'aggregate transition function',

This is strange. Does it mean that range_agg will not work with custom range
types?

I am not sure about the implementation. I think accumulating all the ranges,
sorting them, and processing them at the end can be memory- and CPU-expensive.

Regards

Pavel



>
> [1] No female committers yet ... is this 2019?
>
> --
> Álvaro Herrerahttps://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
>


Re: benchmarking Flex practices

2019-07-05 Thread John Naylor
On Wed, Jul 3, 2019 at 5:35 AM Tom Lane  wrote:
>
> As far as I can see, the point of 0002 is to have just one set of
> flex rules for the various variants of quotecontinue processing.
> That sounds OK, though I'm a bit surprised it makes this much difference
> in the table size. I would suggest that "state_before" needs a less
> generic name (maybe "state_before_xqs"?) and more than no comment.
> Possibly more to the point, it's not okay to have static state variables
> in the core scanner, so that variable needs to be kept in yyextra.

v4-0001 is basically the same as v3-0002, with the state variable in
yyextra. Since follow-on patches use it as well, I've named it
state_before_quote_stop. I failed to come up with a nicer short name.
With this applied, the transition table is reduced from 37045 to
30367. Since that's uncomfortably close to the 32k limit for 16 bit
members, I hacked away further at UESCAPE bloat.

0002 unifies xusend and xuiend by saving the state of xui as well.
This actually causes a performance regression, but it's more of a
refactoring patch to prevent from having to create two additional
start conditions in 0003 (of course it could be done that way if
desired, but the savings won't be as great). In any case, the table is
now down to 26074.

0003 creates a separate start condition so that UESCAPE and the
expected quoted character after it are detected in separate states.
This allows us to use standard whitespace skipping techniques and also
to greatly simplify the uescapefail rule. The final size of the table
is 23696. Removing UESCAPE entirely results in 21860, so this is likely
the most compact size for this feature.

Performance is very similar to HEAD. Parsing the information schema
might be a hair faster and pgbench-like queries with simple strings a
hair slower, but the difference seems within the noise of variation.
Parsing strings with UESCAPE likewise seems about the same.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


v4-0001-Replace-the-Flex-quotestop-rules-with-a-new-exclu.patch
Description: Binary data


v4-0002-Unify-xuiend-and-xusend-into-a-single-start-condi.patch
Description: Binary data


v4-0003-Use-separate-start-conditions-for-both-UESCAPE-an.patch
Description: Binary data


Re: Index Skip Scan

2019-07-05 Thread Dmitry Dolgov
> > On Sat, Jun 22, 2019 at 12:17 PM Floris Van Nee  
> > wrote:
> > The following sql statement seems to have incorrect results - some logic in
> > the backwards scan is currently not entirely right.
>
> Thanks for testing! You're right, looks like in the current implementation in
> case of backwards scan there is one unnecessary extra step forward. It seems
> this mistake was made, since I was concentrating only on the backward scans
> with a cursor, and used not exactly correct approach to wrap up after a scan
> was finished. Give me a moment, I'll tighten it up.

Here is finally a new version of the patch, where all the mentioned issues
seem to be fixed and the corresponding new tests should keep it that way
(I've skipped all the pubs at PostgresLondon for that). Also I've addressed
most of the feedback from Tomas, except the points about planning improvements
(which are still on our todo list). By no means is it a final result (e.g. I
guess `_bt_read_closest` must be improved), but I hope making progress
step-by-step will help anyway. Also I've fixed some, as it is popular to say
here, brain fade, where I mixed up scan directions.

> On Tue, Jul 2, 2019 at 11:00 AM Thomas Munro  wrote:
>
> On Fri, Jun 21, 2019 at 1:20 AM Jesper Pedersen
>  wrote:
> > Attached is v20, since the last patch should have been v19.
>
> I took this for a quick spin today.  The DISTINCT ON support is nice
> and I think it will be very useful.  I've signed up to review it and
> will have more to say later.  But today I had a couple of thoughts
> after looking into how src/backend/optimizer/plan/planagg.c works and
> wondering how to do some more skipping tricks with the existing
> machinery.
>
> On Thu, Jul 4, 2019 at 1:00 PM Thomas Munro  wrote:
>
> I mention all this stuff not because I want us to run before we can
> walk, but because to be ready to commit the basic distinct skip scan
> feature, I think we should know approximately how it'll handle the
> future stuff we'll need.

Great, thank you! I agree with Jesper that some parts of this are probably
outside the scope of the first patch, but we can definitely take a look at what
needs to be done to make the current implementation more flexible, so the
follow-up would come naturally.


v21-0001-Index-skip-scan.patch
Description: Binary data


Changing GENERATED ALWAYS AS expression

2019-07-05 Thread Will Bryant
Hello,

I’ve been testing out the 12 (beta) support for generated columns, mainly in 
order to add support for them to the database synchronisation software I 
maintain (Kitchen Sync).

So far it looks good compared to the similar functionality on MariaDB and MySQL 
(apart from VIRTUAL support which I see was pulled out because it wasn’t ready).

But I can’t see a way to change the expression for the generated column after 
it’s been created initially. I was looking for something like the SET/DROP NOT 
NULL alter clauses, perhaps ALTER TABLE foo ALTER bar SET GENERATED ALWAYS AS 
(…) STORED, but that doesn’t work and I can’t see anything in the docs.  
(MySQL/MariaDB don’t have anything specific, but they have the generic MODIFY 
column syntax which does allow you to change stored column definitions.)

I can remove and add back the column, but that has the undesirable effect of 
changing the column order and requires recreating all the indexes etc. which 
gets messy.

Any thoughts?  Will this be implemented later?

Cheers,
Will



Inconsistency between attname of index and attname of relation

2019-07-05 Thread Ronan Dunklau
Hello,

I've noticed that renaming an indexed column produces inconsistencies in
the catalog. Namely, the attname of the attribute of the relation is
properly updated, whereas the attname of the attribute in the index is not,
and keeps the old value.

Example:

test # create table test (id int primary key);
CREATE TABLE
test # alter table test rename id to idnew;
ALTER TABLE
test # select attrelid::regclass, attname from pg_attribute where attrelid
in ('test'::regclass, 'test_pkey'::regclass) and attnum > 0;
 attrelid  | attname
---+-
 test  | idnew
 test_pkey | id

We ran into that while using wal2json, which uses the attnames of the replica
identity index to identify which columns are part of the primary key. If the
primary key column has been renamed, we end up with no information about the
identity of the tuple being updated / deleted.

I think this could be considered a bug in Postgres. If it isn't, what
should be the proper way to retrieve this information?



Re: Use relative rpath if possible

2019-07-05 Thread Peter Eisentraut
rebased patch attached, no functionality changes

On 2019-06-27 13:45, Peter Eisentraut wrote:
> On several popular operating systems, we can use relative rpaths, using
> the $ORIGIN placeholder, so that the resulting installation is
> relocatable.  Then we also don't need to set LD_LIBRARY_PATH during make
> check.
> 
> This implementation will use a relative rpath if bindir and libdir are
> under the same common parent directory.
> 
> Supported platforms are: freebsd, linux, netbsd, openbsd, solaris
> 
> Information from https://lekensteyn.nl/rpath.html
> 
> (Yes, something for macOS would be nice, to work around SIP issues, but
> I'll leave that as a separate future item.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 6a86e8c64d5bb416c3001dea2dae3f6365085e9b Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 5 Jul 2019 11:35:44 +0200
Subject: [PATCH v2] Use relative rpath if possible

On several popular operating systems, we can use relative rpaths, using
the $ORIGIN placeholder, so that the resulting installation is
relocatable. Then we also don't need to set LD_LIBRARY_PATH during make
check.

This implementation will use a relative rpath if bindir and libdir are
under the same common parent directory.
---
 src/Makefile.global.in |  2 +-
 src/makefiles/Makefile.freebsd |  5 +
 src/makefiles/Makefile.linux   |  5 +
 src/makefiles/Makefile.netbsd  |  5 +
 src/makefiles/Makefile.openbsd |  5 +
 src/makefiles/Makefile.solaris | 10 ++
 6 files changed, 31 insertions(+), 1 deletion(-)

diff --git a/src/Makefile.global.in b/src/Makefile.global.in
index 321af38b0c..69657e8ab8 100644
--- a/src/Makefile.global.in
+++ b/src/Makefile.global.in
@@ -422,7 +422,7 @@ ld_library_path_var = LD_LIBRARY_PATH
 # nothing.
 with_temp_install = \
PATH="$(abs_top_builddir)/tmp_install$(bindir):$$PATH" \
-   $(call add_to_path,$(strip 
$(ld_library_path_var)),$(abs_top_builddir)/tmp_install$(libdir)) \
+   $(if $(strip $(ld_library_path_var)),$(call add_to_path,$(strip 
$(ld_library_path_var)),$(abs_top_builddir)/tmp_install$(libdir))) \
$(with_temp_install_extra)
 
 ifeq ($(enable_tap_tests),yes)
diff --git a/src/makefiles/Makefile.freebsd b/src/makefiles/Makefile.freebsd
index c462e2fd58..ae92dba5ee 100644
--- a/src/makefiles/Makefile.freebsd
+++ b/src/makefiles/Makefile.freebsd
@@ -1,7 +1,12 @@
 AROPT = cr
 
 export_dynamic = -Wl,-export-dynamic
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-R'$(subst $(dir $(libdir)),$$ORIGIN/../,$(rpathdir))',-z,origin
+ld_library_path_var =
+else
 rpath = -Wl,-R'$(rpathdir)'
+endif
 
 DLSUFFIX = .so
 
diff --git a/src/makefiles/Makefile.linux b/src/makefiles/Makefile.linux
index ac58fe45de..d967eec9de 100644
--- a/src/makefiles/Makefile.linux
+++ b/src/makefiles/Makefile.linux
@@ -3,7 +3,12 @@ AROPT = crs
 export_dynamic = -Wl,-E
 # Use --enable-new-dtags to generate DT_RUNPATH instead of DT_RPATH.
 # This allows LD_LIBRARY_PATH to still work when needed.
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-rpath,'$(subst $(dir 
$(libdir)),$$ORIGIN/../,$(rpathdir))',--enable-new-dtags
+ld_library_path_var =
+else
 rpath = -Wl,-rpath,'$(rpathdir)',--enable-new-dtags
+endif
 
 DLSUFFIX = .so
 
diff --git a/src/makefiles/Makefile.netbsd b/src/makefiles/Makefile.netbsd
index 15695fb65c..4a96be9121 100644
--- a/src/makefiles/Makefile.netbsd
+++ b/src/makefiles/Makefile.netbsd
@@ -1,7 +1,12 @@
 AROPT = cr
 
 export_dynamic = -Wl,-E
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-R'$(subst $(dir $(libdir)),$$ORIGIN/../,$(rpathdir))'
+ld_library_path_var =
+else
 rpath = -Wl,-R'$(rpathdir)'
+endif
 
 DLSUFFIX = .so
 
diff --git a/src/makefiles/Makefile.openbsd b/src/makefiles/Makefile.openbsd
index 15695fb65c..36b7420057 100644
--- a/src/makefiles/Makefile.openbsd
+++ b/src/makefiles/Makefile.openbsd
@@ -1,7 +1,12 @@
 AROPT = cr
 
 export_dynamic = -Wl,-E
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-R'$(subst $(dir $(libdir)),$$ORIGIN/../,$(rpathdir))',-z,origin
+ld_library_path_var =
+else
 rpath = -Wl,-R'$(rpathdir)'
+endif
 
 DLSUFFIX = .so
 
diff --git a/src/makefiles/Makefile.solaris b/src/makefiles/Makefile.solaris
index a7f5652f0c..b61fdabbad 100644
--- a/src/makefiles/Makefile.solaris
+++ b/src/makefiles/Makefile.solaris
@@ -4,10 +4,20 @@ AROPT = crs
 
 ifeq ($(with_gnu_ld), yes)
 export_dynamic = -Wl,-E
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-rpath,'$(subst $(dir $(libdir)),$$ORIGIN/../,$(rpathdir))'
+ld_library_path_var =
+else
 rpath = -Wl,-rpath,'$(rpathdir)'
+endif
+else
+ifeq ($(dir $(bindir)),$(dir $(libdir)))
+rpath = -Wl,-R'$(subst $(dir $(libdir)),$$ORIGIN/../,$(rpathdir))'
+ld_library_path_var =
 else
 rpath = -Wl,-R'$(rpathdir)'
 endif
+endif
 
 DLSUFFIX = .so
 ifeq ($(GCC), yes)

base-commit: 594df378ffb04a72b713a13cc0a7166b3bced7b7
-- 
2.22.0



Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

2019-07-05 Thread Nikolay Shaplov
In a message of Monday, July 1, 2019 23:52:13 MSK, Thomas Munro wrote:

> > > This patch does not apply.
> > 
> > Oh... Sorry... here goes new version
> 
> 
> Hi Nikolay,
> 
> Could we please have a new rebase?

Sorry, new reloptions have been introduced, and I need some time to 
understand how they work and to fit them into the new option structures. And I 
do not have much dev time right now. If we are lucky and the changes are 
obvious, I will do it today or tomorrow.

Re: [PATCH] Implement uuid_version()

2019-07-05 Thread Peter Eisentraut
On 2019-07-05 00:08, Jose Luis Tallon wrote:
> On 4/7/19 17:30, Alvaro Herrera wrote:
>> On 2019-Jul-04, Tom Lane wrote:
>>
>>> A possible option 3 is to keep the function in pgcrypto but change
>>> its C code to call the core code.

Updated patch with this change included.

(There is also precedent for redirecting the extension function to the
internal one by changing the SQL-level function definition using CREATE
OR REPLACE FUNCTION ... LANGUAGE INTERNAL.  But that seems more
complicated and would require a new extension version.  It could maybe
be included if the extension version is changed for other reasons.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From d40033da0b485ce129f95bca80b3803d638ea1d9 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 5 Jul 2019 10:32:08 +0200
Subject: [PATCH v2] Add gen_random_uuid function

This adds a built-in function to generate UUIDs.

PostgreSQL hasn't had a built-in function to generate a UUID yet,
relying on external modules such as uuid-ossp and pgcrypto to provide
one.  Now that we have a strong random number generator built-in, we
can easily provide a version 4 (random) UUID generation function.

This patch takes the existing function gen_random_uuid() from pgcrypto
and makes it a built-in function.  The pgcrypto implementation now
internally redirects to the built-in one.

Discussion: 
https://www.postgresql.org/message-id/6a65610c-46fc-2323-6b78-e8086340a...@2ndquadrant.com
---
 contrib/pgcrypto/pgcrypto.c  | 16 ++-
 doc/src/sgml/datatype.sgml   | 12 ++-
 doc/src/sgml/func.sgml   | 26 
 doc/src/sgml/pgcrypto.sgml   |  3 ++-
 doc/src/sgml/uuid-ossp.sgml  | 11 +++---
 src/backend/utils/adt/uuid.c | 20 ++
 src/include/catalog/pg_proc.dat  |  3 +++
 src/test/regress/expected/opr_sanity.out |  1 +
 src/test/regress/expected/uuid.out   | 10 +
 src/test/regress/sql/uuid.sql|  6 ++
 10 files changed, 75 insertions(+), 33 deletions(-)

diff --git a/contrib/pgcrypto/pgcrypto.c b/contrib/pgcrypto/pgcrypto.c
index c558767909..f69ae107c3 100644
--- a/contrib/pgcrypto/pgcrypto.c
+++ b/contrib/pgcrypto/pgcrypto.c
@@ -446,20 +446,8 @@ PG_FUNCTION_INFO_V1(pg_random_uuid);
 Datum
 pg_random_uuid(PG_FUNCTION_ARGS)
 {
-   uint8  *buf = (uint8 *) palloc(UUID_LEN);
-
-   /* Generate random bits. */
-   if (!pg_strong_random(buf, UUID_LEN))
-   px_THROW_ERROR(PXE_NO_RANDOM);
-
-   /*
-* Set magic numbers for a "version 4" (pseudorandom) UUID, see
-* http://tools.ietf.org/html/rfc4122#section-4.4
-*/
-   buf[6] = (buf[6] & 0x0f) | 0x40;/* "version" field */
-   buf[8] = (buf[8] & 0x3f) | 0x80;/* "variant" field */
-
-   PG_RETURN_UUID_P((pg_uuid_t *) buf);
+   /* redirect to built-in function */
+   return gen_random_uuid(fcinfo);
 }
 
 static void *
diff --git a/doc/src/sgml/datatype.sgml b/doc/src/sgml/datatype.sgml
index 35ecd48ed5..9b6d6878eb 100644
--- a/doc/src/sgml/datatype.sgml
+++ b/doc/src/sgml/datatype.sgml
@@ -4195,16 +4195,8 @@ UUID Type

 

-PostgreSQL provides storage and comparison
-functions for UUIDs, but the core database does not include any
-function for generating UUIDs, because no single algorithm is well
-suited for every application.  The  module
-provides functions that implement several standard algorithms.
-The  module also provides a generation
-function for random UUIDs.
-Alternatively, UUIDs could be generated by client applications or
-other libraries invoked through a server-side function.
+See  for how to generate a UUID in
+PostgreSQL.

   
 
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 47cefd8020..466f24f0f1 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -10291,6 +10291,32 @@ Text Search Debugging Functions
 
  
 
+ 
+  UUID Functions
+
+  
+   UUID
+   generating
+  
+
+  
+   gen_random_uuid
+  
+
+  
+   PostgreSQL includes one function to generate a 
UUID:
+
+gen_random_uuid() returns uuid
+
+   This function returns a version 4 (random) UUID.  This is the most commonly
+   used type of UUID and is appropriate for most applications.
+  
+
+  
+   The  module provides additional functions that
+   implement other standard algorithms for generating UUIDs.
+  
+ 
 
  
 
diff --git a/doc/src/sgml/pgcrypto.sgml b/doc/src/sgml/pgcrypto.sgml
index 5c7954..0acd11ed55 100644
--- a/doc/src/sgml/pgcrypto.sgml
+++ b/doc/src/sgml/pgcrypto.sgml
@@ -1132,7 +1132,8 @@ Random-Data Functions
 gen_random_uuid() returns uuid
 
   
-   Returns a version 4 (random) UUID.
+   Returns a version 4 (random) UUID. (Obsolete, this function is now also
+   included in core PostgreSQL.)
   
  
 
diff --git 

Re: Add client connection check during the execution of the query

2019-07-05 Thread Thomas Munro
On Fri, Jul 5, 2019 at 6:28 PM Tatsuo Ishii  wrote:
> > The purpose of this patch is to stop the execution of continuous
> > requests in case of a disconnection from the client.
>
> Pgpool-II already does this by sending a parameter status message to
> the client. It is expected that clients are always prepared to receive
> the parameter status message. This way I believe we could reliably
> detect that the connection to the client is broken or not.

Hmm.  If you send a message, it's basically application-level
keepalive.  But it's a lot harder to be sure that the protocol and
socket are in the right state to insert a message at every possible
CHECK_FOR_INTERRUPT() location.  Sergey's proposal of recv(MSG_PEEK)
doesn't require any knowledge of the protocol at all, though it
probably does need TCP keepalive to be configured to be useful for
remote connections.

-- 
Thomas Munro
https://enterprisedb.com




Re: mcvstats serialization code is still shy of a load

2019-07-05 Thread Tomas Vondra
On Fri, Jul 5, 2019, 03:28 Tom Lane  wrote:

> Tomas Vondra  writes:
> > I was about to push into REL_12_STABLE, when I realized that maybe we
> > need to do something about the catversion first. REL_12_STABLE is still
> > on 201906161, while master got to 201907041 thanks to commit
> > 7b925e12703.  Simply cherry-picking the commits would get us to
> > 201907052 in both branches, but that'd be wrong as the catalogs do
> > differ. I suppose this is what you meant by "catversion differences to
> > cope with".
>
> Yeah, exactly.
>
> My recommendation is to use 201907051 on v12 and 201907052
> on master (or whatever is $today for you).  They need to be
> different now that the branches' catalog histories have diverged,
> and it seems to me that the back branch should be "older".
>
> regards, tom lane
>

Unfortunately, master is already using both 201907051 and 201907052 (two of
the patches I pushed touched the catalog), so we can't quite do exactly
that. We need to use 201907042 and 201907043 or something preceding 201907041
(which is the extra catversion on master).

At this point there's no perfect sequence, thanks to the extra commit on
master, so REL_12_STABLE can't be exactly "older" :-(

Barring objections, I'll go ahead with 201907042+201907043 later today,
before someone pushes another catversion-bumping patch.

regards

>


Re: Add client connection check during the execution of the query

2019-07-05 Thread Thomas Munro
On Fri, Jul 5, 2019 at 6:42 PM Tatsuo Ishii  wrote:
> > This seems like a reasonable idea to me.  There is no point in running
> > a monster 24 hour OLAP query if your client has gone away.  It's using
> > MSG_PEEK which is POSIX, and I can't immediately think of any reason
> > why it's not safe to try to peek at a byte in that socket at any time.
>
> I am not familiar with Windows but I accidentally found this article
> written by Microsoft:
>
> https://support.microsoft.com/en-us/help/192599/info-avoid-data-peeking-in-winsock
>
> It seems using MSG_PEEK is not recommended by Microsoft.

Hmm, interesting.  Using it very infrequently just as a way to detect
that the other end has gone away doesn't seem too crazy based on
anything in that article though, does it?  What they're saying
actually applies to every operating system, not just Windows, AFAICS.
Namely, don't use MSG_PEEK frequently because it's a syscall and takes
locks in the kernel, and don't use it to wait for full messages to
arrive, or you might effectively deadlock if internal buffers are
full.  But Sergey's patch only uses it to check if we could read 1
single byte, and does so very infrequently (the GUC should certainly
be set to at least many seconds).

What else could we do?  Assuming the kernel actually knows the
connection has gone away, the WaitEventSetWait() interface is no help
on its own, I think, because it'll just tell you the socket is read
for reading when it's closed, you still have to actually try to read
to distinguish closed from a data byte.

I tried this patch using a real network with two machines.  I was able
to get the new "connection to client lost" error by shutting down a
network interface (effectively yanking a cable), but only with TCP
keepalive configured.  That's not too surprising; without that and
without trying to write, there is no way for the kernel to know that
the other end has gone.
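
For completeness, here is a sketch of what "TCP keepalive configured" means at
the socket level (the TCP_KEEP* option names are the Linux spellings; other
platforms differ, and on the server side PostgreSQL exposes the same knobs as
the tcp_keepalives_idle/interval/count settings):

#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/*
 * Enable keepalive probes so the kernel can notice a vanished peer even on
 * an otherwise idle connection.
 */
static int
enable_keepalive(int sock)
{
    int     on = 1;
    int     idle = 60;      /* seconds of idle time before the first probe */
    int     interval = 10;  /* seconds between probes */
    int     count = 5;      /* unanswered probes before the peer is declared dead */

    if (setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPIDLE, &idle, sizeof(idle)) < 0)
        return -1;
    if (setsockopt(sock, IPPROTO_TCP, TCP_KEEPINTVL, &interval, sizeof(interval)) < 0)
        return -1;
    return setsockopt(sock, IPPROTO_TCP, TCP_KEEPCNT, &count, sizeof(count));
}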

-- 
Thomas Munro
https://enterprisedb.com




Extending PostgreSQL with a Domain-Specific Language (DSL) - Development

2019-07-05 Thread Tom Mercha
Dear Hackers

I am interested in implementing my own Domain Specific Language (DSL) using 
PostgreSQL internals. Originally, the plan was not to use PostgreSQL and I had 
developed a grammar and used ANTLRv4 for parser work and general early 
development.

Initially, I was hoping for a scenario where I could have PostgreSQL's parser 
switch grammars (e.g. SET parser_language=SQL vs. SET parser_language=myDSL), 
in which case my ANTLRv4 project would override the PostgreSQL parser module. I 
guess another direction my project could take is to extend PostgreSQL's 
SQL parser to factor in my DSL keywords and requirements.

To make matters more complicated, this version of ANTLR does not support code 
generation to C, but it does support generation to C++. Integrating the 
generated C++ code requires making it friendly to PostgreSQL e.g. using Plain 
Old Data Structures as described here 
https://www.postgresql.org/docs/9.0/extend-cpp.html, which seems to be 
suggesting to me that I may be using the wrong approach towards my goal.
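
For what it's worth, the usual shape of that "C-friendly" boundary is a small
header (sketch below, all names hypothetical) that PostgreSQL's C code can
include, with the implementation compiled as C++ next to the ANTLR-generated
parser:

#ifdef __cplusplus
extern "C"
{
#endif

/* plain-old-data result that C code can inspect */
typedef struct MyDslParseResult
{
    int         ok;             /* 1 on success, 0 on error */
    char       *error_message;  /* allocated message, or NULL */
    void       *parse_tree;     /* opaque handle to the C++ AST */
} MyDslParseResult;

/* implemented in C++ (wrapping the ANTLR parser), callable from C */
extern MyDslParseResult *mydsl_parse(const char *query_string);
extern void mydsl_free_result(MyDslParseResult *result);

#ifdef __cplusplus
}
#endif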

I would be grateful if anyone could provide any general advice or pointers 
regarding my approach, for example regarding development effort, so that the 
development with PostgreSQL internals can be smooth and of a high quality. 
Maybe somebody has come across another DSL attempt which used PostgreSQL and 
that I could follow as a reference?

Thanks in advance.

Best,
Tom




Re: make libpq documentation navigable between functions

2019-07-05 Thread Peter Eisentraut
On 2019-05-12 11:02, Fabien COELHO wrote:
> While writing some libpq code, I found it quite irritating that the 
> documentation is not navigable, so when a function appears in a 
> description of another function and you are interested, there is no direct 
> way to find it, you have to go to the index or to guess in which section 
> it is going to appear.
> 
> Attached:
>   - a first patch to add a few missing "id"
>   - a script which adds the references
>   - a second patch which is the result of applying the script
> on top of the first patch, so that all PQ* functions are
> replaced by links to their documentation.

I think this is a good idea.

The rendering ends up a bit inconsistent depending on the context of the
link target.  Sometimes it's monospaced, sometimes it's not, sometimes
in the same sentence.  I think we should improve that a bit.  One
approach for making the currently non-monospaced ones into monospace
would be to make the xref targets point to  elements but
*don't* put xreflabels on those.  This will currently produce a warning

Don't know what gentext to create for xref to: "function"

but we can write a template



and then we can control the output format of that.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




[Season of Docs] Technical writer applications and proposals for PostgreSQL

2019-07-05 Thread Andrew Chen
Hello open source administrators and mentors,

Thank you again for your patience. We received over 440 technical writer
applications, so there is quite a lot of enthusiasm from the technical
writing community. Every single organization received multiple technical
writer project proposals!

We have been busy processing and reviewing the applications to make sure
they follow our guidelines and rules. Consequently, there were a few that
we had to remove. There are also some that don't violate any of our
guidelines or rules, but they contain irrelevant content or perhaps the
applicant may have misunderstood the purpose of Season of Docs. We have
decided not to censor those and are leaving it up to you to dismiss or
reroute them as appropriate for your organization.

Attached is a link to a Google Sheets export of the technical writer
applications to your specific organization. We decided that a spreadsheet
would be the easiest format for you to manipulate and consume the form
data; however, if you would like us to generate a separate email message
per project proposal, please contact us on the support list,
season-of-docs-supp...@googlegroups.com.

 Season of Docs - PostgreSQL (16)


When assessing a project proposal, org admins and mentors should assess the
proposal based on the content of the fields in the form, without examining
any linked docs. (It's fine to follow links for tech writing experience or
resumes.) The reasons are (a) we must avoid any potential for people to add
to their proposal after the deadline date has passed, and (b) we just
ensure all applicants are on equal footing. The tech writer guidelines
state clearly that the proposal description should be in the form itself:
https://developers.google.com/season-of-docs/docs/tech-writer-application-hints#project_proposal

Later, we will send out a form for you to enter in your organization's
technical writer/project proposal choice and alternates in order of
preference.

If you have any questions, contact us at
season-of-docs-supp...@googlegroups.com.

Cheers,


--Andrew Chen and Sarah Maddox


Andrew Chen |  Program Manager, Open Source Strategy |  cheno...@google.com
 |  650-495-4987


Re: improve PQexec documentation

2019-07-05 Thread Peter Eisentraut
On 2019-04-12 17:51, Fabien COELHO wrote:
> Hmmm. I obviously agree that PQexec is beyond awkward.
> 
> Now I'm not sure how anyone is expected to guess the actual function 
> working from the available documentation, and without this knowledge I 
> cannot see how to write meaningful code for the multiple query case.

But you're not really supposed to use it for multiple queries or
multiple result sets anyway.  There are other functions for this.
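
For instance, a sketch of one such alternative for a multi-statement string
(the asynchronous API, where each statement's result arrives separately;
error handling around the send is omitted):

#include <libpq-fe.h>

static void
run_multi(PGconn *conn, const char *queries)
{
    PGresult   *res;

    if (!PQsendQuery(conn, queries))
        return;                 /* connection-level failure */

    /* one PGresult per statement, NULL when the command string is done */
    while ((res = PQgetResult(conn)) != NULL)
    {
        /* inspect PQresultStatus(res) here */
        PQclear(res);
    }
}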

If a source code comment in libpq or psql would help explaining some of
the current code, then we could add that.  But I am also not sure that
enshrining the current behavior on the API documentation is desirable.

> Basically it seems to have been designed for simple queries, and then 
> accomodated somehow for the multiple case but with a strange non 
> systematic approach.

probably

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Add client connection check during the execution of the query

2019-07-05 Thread Tatsuo Ishii
> This seems like a reasonable idea to me.  There is no point in running
> a monster 24 hour OLAP query if your client has gone away.  It's using
> MSG_PEEK which is POSIX, and I can't immediately think of any reason
> why it's not safe to try to peek at a byte in that socket at any time.

I am not familiar with Windows but I accidentally found this article
written by Microsoft:

https://support.microsoft.com/en-us/help/192599/info-avoid-data-peeking-in-winsock

It seems using MSG_PEEK is not recommended by Microsoft.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: [HACKERS] Partition-wise aggregation/grouping

2019-07-05 Thread Andrey Lepikhov

> Regarding apply_scanjoin_target_to_paths in 0001 and 0007, it seems
> like what happens is: we first build an Append path for the topmost
> scan/join rel.  That uses paths from the individual relations that
> don't necessarily produce the final scan/join target.  Then we mutate
> those relations in place during partition-wise aggregate so that they
> now do produce the final scan/join target and generate some more paths
> using the results.  So there's an ordering dependency, and the same
> pathlist represents different things at different times.  That is, I
> suppose, not technically any worse than what we're doing for the
> scan/join rel's pathlist in general, but here there's the additional
> complexity that the paths get used both before and after being
> mutated.  The UPPERREL_TLIST proposal would clean this up, although I
> realize that has unresolved issues.

I am discouraged by this logic.
Now I use set_rel_pathlist_hook and make some optimizations to the partition 
scan paths. But apply_scanjoin_target_to_paths() deletes the pathlist 
and throws away all of those optimizations.
Maybe it is possible to introduce some flag that the hook can set to 
prevent the pathlist from being cleaned?
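
For context, a minimal sketch of the hook setup I mean (the actual path
adjustments are hypothetical and omitted):

#include "postgres.h"
#include "fmgr.h"
#include "optimizer/paths.h"

PG_MODULE_MAGIC;

void _PG_init(void);

static set_rel_pathlist_hook_type prev_set_rel_pathlist_hook = NULL;

/* adjust the paths of partition scan relations here */
static void
my_set_rel_pathlist(PlannerInfo *root, RelOptInfo *rel,
                    Index rti, RangeTblEntry *rte)
{
    if (prev_set_rel_pathlist_hook)
        prev_set_rel_pathlist_hook(root, rel, rti, rte);

    /* per-partition optimizations, later lost when the pathlist is rebuilt */
}

void
_PG_init(void)
{
    prev_set_rel_pathlist_hook = set_rel_pathlist_hook;
    set_rel_pathlist_hook = my_set_rel_pathlist;
}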


--
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company




Re: Add client connection check during the execution of the query

2019-07-05 Thread Tatsuo Ishii
> The purpose of this patch is to stop the execution of continuous
> requests in case of a disconnection from the client.

Pgpool-II already does this by sending a parameter status message to
the client. It is expected that clients are always prepared to receive
the parameter status message. This way I believe we could reliably
detect that the connection to the client is broken or not.

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese:http://www.sraoss.co.jp




Re: Add client connection check during the execution of the query

2019-07-05 Thread Thomas Munro
On Fri, Jul 5, 2019 at 5:36 PM Thomas Munro  wrote:
> On Sat, Feb 9, 2019 at 6:16 AM  wrote:
> > The purpose of this patch is to stop the execution of continuous
> > requests in case of a disconnection from the client. In most cases, the
> > client must wait for a response from the server before sending new data
> > - which means there should not remain unread data on the socket and we
> > will be able to determine a broken connection.
> > Perhaps exceptions are possible, but I could not think of such a use
> > case (except COPY). I would be grateful if someone could offer such
> > cases or their solutions.
> > I added a test for the GUC variable when the client connects via SSL,
> > but I'm not sure that this test is really necessary.
>
> [review]

Email to Sergey is bouncing back.  I've set this to "Waiting on
author" in the Commitfest app.


--
Thomas Munro
https://enterprisedb.com