[HACKERS] Migration to pglister - Before

2017-11-13 Thread Stephen Frost
Greetings,

We will be migrating these lists to pglister in the next few minutes.

This final email on the old list system is intended to let you know
that future emails will have different headers and you will need to
adjust your filters.

The changes which we expect to be most significant to users can be found
on the wiki here: https://wiki.postgresql.org/wiki/PGLister_Announce

Once the migration of these lists is complete, an 'after' email will be
sent out.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-11-10 Thread Stephen Frost
Michael, Tom,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Nov 10, 2017 at 10:00 AM, Tom Lane  wrote:
> > Stephen Frost  writes:
> >> I'm guessing no, which essentially means that *we* consider access to
> >> lo_import/lo_export to be equivilant to superuser and therefore we're
> >> not going to implement anything to try and prevent the user who has
> >> access to those functions from becoming superuser.  If we aren't willing
> >> to do that, then how can we really say that there's some difference
> >> between access to these functions and being a superuser?
> >
> > We seem to be talking past each other.  Yes, if a user has malicious
> > intentions, it's possibly to parlay lo_export into obtaining a superuser
> > login (I'm less sure that that's necessarily true for lo_import).
> > That does NOT make it "equivalent", except perhaps in the view of someone
> > who is only considering blocking malevolent actors.  It does not mean that
> > there's no value in preventing a task that needs to run lo_export from
> > being able to accidentally destroy any data in the database.  There's a
> > range of situations where you are concerned about accidents and errors,
> > not malicious intent; but your argument ignores those use-cases.
> 
> That will not sound much as a surprise as I spawned the original
> thread, but like Robert I understand that getting rid of all superuser
> checks is a goal that we are trying to reach to allow admins to have
> more flexibility in handling permissions to a subset of objects.
> Forcing an admin to give full superuser rights to one user willing to
> work only on LOs import and export is a wrong concept.

The problem I have with this is that 'full superuser rights' actually
are being granted by this.  You're able to read any file and write any
file on the filesystem that the PG OS user can.  How is that not 'full
superuser rights'?  The concerns about a mistake being made are just as
serious as they are with someone who has superuser rights as someone
could pretty easily end up overwriting the control file or various other
extremely important bits of the system.  This isn't just about what
'blackhats' can do with this.

As I mentioned up-thread, if we want to make it so that non-superusers
can use lo_import/lo_export, then we should do that in a way that
allows the administrator to specify exactly the rights they really want
to give, based on a use-case for them.  There's no use-case for allowing
someone to use lo_export() to overwrite pg_control.  The use-case would
be to allow them to export to a specific directory (or perhaps a set of
directories), or to read from same.  The same is true of COPY.  Further,
I wonder what would happen if we allow this and then someone comes along
and re-proposes the 'CREATE DIRECTORY' concept that I had once proposed
(ideally with things cleaned up and tightened up to avoid there being
issues using it) to address the actual use-case for these functions to
be available to a non-superuser.  We wouldn't be able to change these
functions to start checking the 'directory' rights or we would break
existing non-superuser usage of them.  I imagine we could create
additional functions which check the 'directory' privileges, but that's
hardly ideal, imv.

I'm disappointed to apparently be in the minority on this, but that
seems to be the case unless someone else wishes to comment.  While I
appreciate the discussion, I'm certainly no more convinced today than I
was when I first saw this go in that allowing these functions to be
granted to non-superusers does anything but effectively make them into
superusers who aren't explicitly identified as superusers.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-11-09 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 2:56 PM, Stephen Frost  wrote:
> > Further, I agree entirely that we
> > shouldn't be deciding that certain capabilities are never allowed to be
> > given to a user- but that's why superuser *exists* and why it's possible
> > to give superuser access to multiple users.  The question here is if
> > it's actually sensible for us to make certain actions be grantable to
> > non-superusers which allow that role to become, or to essentially be, a
> > superuser.  What that does, just as it does in the table case, from my
> > viewpoint at least, is make it *look* to admins like they're grant'ing
> > something less than superuser when, really, they're actually giving the
> > user superuser-level access.  That violates the POLA because the admin
> > didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT
> > EXECUTE ON lo_import() TO joe;'.
> 
> This is exactly the kind of thing that I'm talking about.  Forcing an
> administrator to hand out full superuser access instead of being able
> to give just lo_import() makes life difficult for users for no good
> reason.  The existence of a theoretically-exploitable security
> vulnerability is not tantamount to really having access, especially on
> a system with a secured logging facility.

This is where I disagree.  The administrator *is* giving the user
superuser-level access, they're just doing it in a different way from
explicitly saying 'ALTER USER .. WITH SUPERUSER' and that's exactly the
issue that I have.  This isn't a theoretical exploit- the user with
lo_import/lo_export access is able to read and write any file on the
filesystem which the PG OS has access to, including things like
pg_hba.conf in most configurations, the file backing pg_authid, the OS
user's .bashrc, the Kerberos principal, the CA public key, the SSL keys,
tables owned by other users that the in-database role doesn't have
access to, et al.  Further, will we consider these *actual*,
non-theoretical, methods to gain superuser access, or to bypass the
database authorization system, to be security issues or holes to plug?

I'm guessing no, which essentially means that *we* consider access to
lo_import/lo_export to be equivilant to superuser and therefore we're
not going to implement anything to try and prevent the user who has
access to those functions from becoming superuser.  If we aren't willing
to do that, then how can we really say that there's some difference
between access to these functions and being a superuser?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-11-09 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:52 PM, Stephen Frost  wrote:
> > This is not unlike the discussions we've had in the past around allowing
> > non-owners of a table to modify properties of a table, where the
> > argument has been successfully and clearly made that if you make the
> > ability to change the table a GRANT'able right, then that can be used to
> > become the role which owns the table without much difficulty, and so
> > there isn't really a point in making that right independently
> > GRANT'able.  We have some of that explicitly GRANT'able today due to
> > requirements of the spec, but that's outside of our control.
> 
> I don't think it's quite the same thing.  I wouldn't go out of my way
> to invent grantable table rights that just let you usurp the table
> owner's permissions, but I still think it's better to have a uniform
> rule that functions we ship don't contain hard-coded superuser checks.

Just to be clear, we should certainly be thinking about more than just
functions but also about things like publications and subscriptions and
other bits of the system.  I haven't done a detailed analysis quite yet,
but I'm reasonably confident that a number of things in this last
release cycle have resulted in quite a few additional explicit superuser
checks, which I do think is an issue to be addressed.

> One problem is that which functions allow an escalation to superuser
> that is easy enough or reliable enough may not actually be a very
> bright line in all cases.  But more than that, I think it's a
> legitimate decision to grant to a user a right that COULD lead to a
> superuser escalation and trust them not to use that way, or prevent
> them from using it that way by some means not known to the database
> system (e.g. route all queries through pgbouncer and send them to a
> teletype; if a breach is detected, go to the teletype room, read the
> paper contained therein, and decide who to fire/imprison/execute at
> gunpoint).  It shouldn't be our job to decide that granting a certain
> right is NEVER ok.

I agree that it may not be obvious which cases lead to a relatively easy
way to obtain superuser and which don't, and that's actually why I'd
much rather it be something that we're considering and looking out for
because, frankly, we're in a much better position generally to realize
those cases than our users are.  Further, I agree entirely that we
shouldn't be deciding that certain capabilities are never allowed to be
given to a user- but that's why superuser *exists* and why it's possible
to give superuser access to multiple users.  The question here is if
it's actually sensible for us to make certain actions be grantable to
non-superusers which allow that role to become, or to essentially be, a
superuser.  What that does, just as it does in the table case, from my
viewpoint at least, is make it *look* to admins like they're grant'ing
something less than superuser when, really, they're actually giving the
user superuser-level access.  That violates the POLA because the admin
didn't write 'ALTER USER joe WITH SUPERUSER', they just wrote 'GRANT
EXECUTE ON lo_import() TO joe;'.

We can certainly argue about if the admin should have just known that
lo_import()/lo_export() or similar functions were equivilant to
grant'ing a user superuser access, but it's not entirely clear to me
that it's actually advantageous to go out of our way to provide multiple
ways for superuser-level access to be grant'ed out to users.  Let's
provide one very clear way to do that and strive to avoid building in
others, either intentionally or unintentionally.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-11-09 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Nov 9, 2017 at 1:16 PM, Stephen Frost  wrote:
> > While we have been working to reduce the number of superuser() checks in
> > the backend in favor of having the ability to GRANT explicit rights, one
> > of the guideing principles has always been that capabilities which can
> > be used to gain superuser rights with little effort should remain
> > restricted to the superuser, which is why the lo_import/lo_export hadn't
> > been under consideration for superuser-check removal in the analysis I
> > provided previously.
> 
> I disagree that that is, or should be, a guiding principle.  

It was what I was using as the basis of the work which I did in this
area and, at least at that time, there seemed to be little issue with
that.

> I'm not
> sure that anyone other than you and your fellow employees at Crunchy
> has ever agreed that this is some kind of principle.  You make it
> sound like there's a consensus about this, but I think there isn't.

It's unclear to me why you're bringing up employers in this discussion,
particularly since Tom is the one who just moved things in the direction
that you're evidently advocating for.  Clearly there isn't consensus if
you and others disagree.  Things certainly can change over time as well,
but if we're going to make a change here then we should make it
willfully, plainly, and consistently.

> I think our guiding principle should be to get rid of ALL of the
> hard-coded superuser checks and let people GRANT what they want.  If
> they grant a permission that results in somebody escalating to
> superuser, they get to keep both pieces.  Such risks might be worth
> documenting, but we shouldn't obstruct people from doing it.

I would certainly like to see the many additional hard-coded superuser
checks introduced into PG10 removed and replaced with more fine-grained
GRANT-based privileges (either as additional GRANT rights or perhaps
through the default roles system).  What I dislike about allowing
GRANT's of a privilege which allows someone to be superuser is that it
makes it *look* like you're only GRANT'ing some subset of reasonable
rights when, in reality, you're actually GRANT'ing a great deal more.

This is not unlike the discussions we've had in the past around allowing
non-owners of a table to modify properties of a table, where the
argument has been successfully and clearly made that if you make the
ability to change the table a GRANT'able right, then that can be used to
become the role which owns the table without much difficulty, and so
there isn't really a point in making that right independently
GRANT'able.  We have some of that explicitly GRANT'able today due to
requirements of the spec, but that's outside of our control.

> In the same way, Linux will not prevent you from making a binary
> setuid regardless of what the binary does.  If you make a binary
> setuid root that lets someone hack root, that's your fault, not the
> operating system.

This is actually one of the things that SELinux is intended to address,
so I don't agree that it's quite this cut-and-dry.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Simplify ACL handling for large objects and removal of superuser() checks

2017-11-09 Thread Stephen Frost
Tom, Michael,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Michael Paquier  writes:
> > On Thu, Nov 9, 2017 at 6:05 AM, Tom Lane  wrote:
> >> Another idea would be to invent a new external flag bit "INV_WRITE_ONLY",
> >> so that people who wanted true write-only could get it, without breaking
> >> backwards-compatible behavior.  But I'm inclined to wait for some field
> >> demand to show up before adding even that little bit of complication.
> 
> > Demand that may never show up, and the current behavior on HEAD does
> > not receive any complains either. I am keeping the patch simple for
> > now. That's less aspirin needed for everybody.
> 
> Looks good to me, pushed.

While we have been working to reduce the number of superuser() checks in
the backend in favor of having the ability to GRANT explicit rights, one
of the guideing principles has always been that capabilities which can
be used to gain superuser rights with little effort should remain
restricted to the superuser, which is why the lo_import/lo_export hadn't
been under consideration for superuser-check removal in the analysis I
provided previously.

As such, I'm not particularly thrilled to see those checks simply just
removed.  If we are going to say that it's acceptable to allow
non-superusers arbitrary access to files on the filesystem as the OS
user then we would also be removing similar checks from adminpack,
something I've also been against quite clearly in past discussions.

This also didn't update the documentation regarding these functions
which clearly states that superuser is required.  If we are going to
allow non-superusers access to these functions, we certainly need to
remove the claim stating that we don't do that and further we must make
it abundantly clear that these functions are extremely dangerous and
could be trivially used to make someone who has access to them into a
superuser.

I continue to feel that this is something worthy of serious
consideration and discussion, and not something to be done because we
happen to be modifying the code in this area.  I'm tempted to suggest we
should have another role attribute or some other additional check when
it comes to functions like these.

The right way to allow users to be able to pull in data off the
filesystem, imv, would be by providing a way to define specific
locations on the filesystem which users can be granted access to import
data from, as I once wrote a patch to do.  That's certainly quite tricky
to get correct, but that's much better than allowing non-superusers
arbitrary access, which is what this does and what users may start using
if we continue to remove these restrictions without providing a better
option.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [pgsql-www] Schedule for migration to pglister

2017-11-06 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > Each list will receive an email with a link to the wiki about the
> > migration after the list has been migrated.
> 
> I suggest doing that the other way 'round.  Otherwise, the email
> about the change will inevitably go into a lot of peoples' bit
> buckets if they haven't adjusted their mail filters yet.

My thought had been to do before-and-after, but I got complaints from
others that we'd then be spamming a lot of people with email.

We definitely need one after the migration because the new mail *won't*
end up caught in people's filters, and for those who intentionally
filter the list traffic into the garbage because they couldn't figure
out how to unsubscribe, this is going to be most annoying (this is what
we saw with the pgadmin lists and it was quite painful).

Thanks!

Stephen


signature.asc
Description: Digital signature


[HACKERS] Schedule for migration to pglister

2017-11-06 Thread Stephen Frost
Greetings,

The PostgreSQL Infrastructure team is working to migrate the project's
mailing lists from the existing system (an ancient and unmaintained
piece of software called "majordomo2") to a newly developed mailing list
system (known as "PGLister"), which better addresses the needs of the
PostgreSQL community and is updated to work with recent improvements in
email technology and spam filtering. These changes will impact certain
aspects of the system but we are hopeful that these changes will have a
minimal impact on users, although everyone will notice the differences.

The changes which we expect to be most significant to users can be found
on the wiki here: https://wiki.postgresql.org/wiki/PGLister_Announce

Our planned migration schedule is as follows:

Nov 6 -
  pgsql-www

Nov 13 -
  pgsql-hackers
  pgsql-bugs
  pgsql-committers

Nov 20 -
  pgsql-admin
  pgsql-general
  pgsql-sql
  pgsql-jobs
  pgsql-novice

Nov 27 -
  pgsql-announce

After -
  the rest

We will be starting the migration of pgsql-www shortly.

Each list will receive an email with a link to the wiki about the
migration after the list has been migrated.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Early locking option to parallel backup

2017-11-06 Thread Stephen Frost
Lucas,

* Lucas (luca...@gmail.com) wrote:
> pg_dump was taking more than 24 hours to complete in one of my databases. I
> begin to research alternatives. Parallel backup reduced the backup time to
> little less than a hour, but it failed almost every time because of
> concurrent queries that generated exclusive locks. It is difficult to
> guarantee that my applications will not issue queries such as drop table,
> alter table, truncate table, create index or drop index for a hour. And I
> prefer not to create controls mechanisms to that end if I can work around
> it.

I certainly understand the value of pg_dump-based backups, but have you
considered doing file-based backups?  That would avoid the need to do
any in-database locking at all, and would give you the ability to do
PITR too.  Further, you could actually restore that backup to another
system and then do a pg_dump there to get a logical representation (and
this would test your physical database backup/restore process too...).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Linking libpq statically to libssl

2017-11-03 Thread Stephen Frost
* Stephen Frost (sfr...@snowman.net) wrote:
> and we've certainly not spent effort that I've seen to try to actually
> make libpq work when multiple versions of libpq are linked into the same
> running backend.

... errr, same running application, that is, not backend.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] MERGE SQL Statement for PG11

2017-11-03 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Nov 3, 2017 at 1:05 PM, Simon Riggs  wrote:
> > We seem to have a few options for PG11
> >
> > 1. Do nothing, we reject MERGE
> >
> > 2. Implement MERGE for unique index situations only, attempting to
> > avoid errors (Simon OP)
> >
> > 3. Implement MERGE, but without attempting to avoid concurrent ERRORs 
> > (Peter)
> >
> > 4. Implement MERGE, while attempting to avoid concurrent ERRORs in
> > cases where that is possible.
> >
> > Stephen, Robert, please say which option you now believe we should pick.
> 
> I think Peter has made a good case for #3, so I lean toward that
> option.  I think #4 is too much of a non-obvious behavior difference
> between the cases where we can avoid those errors and the cases where
> we can't, and I don't see where #2 can go in the future other than #4.

Agreed.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Linking libpq statically to libssl

2017-11-02 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 10/27/17 08:24, Daniele Varrazzo wrote:
> > I have a problem building binary packages for psycopg2. Binary
> > packages ship with their own copies of libpq and libssl;
> 
> Aside from the advice of "don't do that" ...
> 
> > however if
> > another python package links to libssl the library will be imported
> > twice with conflicting symbols, likely resulting in a segfault (see
> > https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if
> > a python script both connects to postgres and opens an https resource.
> 
> ... the standard solutions to this problem are symbol versioning and
> linker flags to avoid making all symbols globally available.  libpq has
> symbol versioning.  Maybe the libssl you are using does not.  Also, for
> example, if you dlopen() with RTLD_LOCAL, the symbols will not be
> globally available, so there should be no conflicts.

Uh, libpq doesn't actually have symbol versioning, at least not on the
installed Ubuntu packages for PG10, as shown by objdump -T :

00011d70 gDF .text  0054  Base PQconnectStart

and we've certainly not spent effort that I've seen to try to actually
make libpq work when multiple versions of libpq are linked into the same
running backend.

I addressed that in my comments also and I don't believe you could
guarantee that things would operate correctly even with symbol
versioning being used.  Perhaps if you versioned your symbols with
something wildly different from what's on the system and hope that
what's on the system never ends up with that version then maybe it'd
work, but that's quite far from the intended use-case of symbol
versioning.

> This needs cooperation from various different parties, and the details
> will likely be platform dependent.  But it's generally a solved problem.

Right, it's solved when there's one source of truth for the library
which is being maintained consistently and carefully.  That allows
multiple versions of libraries to be installed concurrently and linked
into a running process at the same time, because the group maintaining
those different versions of the library make sure that the symbols are
versioned with appropriate versions and they make sure to not break ABI
for those symbols unless they are also changing the version and putting
out a new version.

This was all solved with GLIBC a very long time ago, but it's a heck of
a lot of work to get it right and correct, and that's when you've got a
bunch of people who work on libraries all the time and carefully control
all of the versions which are installed on the OS in coordination.
Trying to do so when you can't control what's happing with the other
library strikes me as highly likely to result in a whole lot of
difficulties.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] ArrayLists instead of List (for some things)

2017-11-02 Thread Stephen Frost
David,

* David Rowley (david.row...@2ndquadrant.com) wrote:
> Our List implementation internally uses linked lists which are
> certainly good for some things, but pretty bad at other things. Linked
> lists are pretty bad when you want to fetch the Nth element, as it
> means looping ever each previous element until you get to the Nth.
> They're also pretty bad for a modern CPU due to their memory access
> pattern.

I can certainly understand this complaint.

> We could get around a few of these problems if Lists were implemented
> internally as arrays, however, arrays are pretty bad if we want to
> delete an item out the middle as we'd have to shuffle everything over
> one spot. Linked lists are much better here... at least they are when
> you already have the cell you want to remove... so we can't expect
> that we can just rewrite List to use arrays instead.

This actually depends on just how you delete the item out of the array,
though there's a trade-off there.  If you "delete" the item by marking
it as "dead" then you don't need to re-shuffle everything, but, of
course, then you have to make sure to 'skip' over those by running down
the array for each list_nth() call- but here's the thing, with a small
list that's all in a tighly packed array, that might not be too bad.
Certainly far better than having to grab random memory on each step and
I'm guessing you'd only actually store the "data" item for a given list
member in the array if it's a integer/oid/etc list, not when it's
actually a pointer being stored in the list, so you're always going to
be working with pretty tightly packed arrays where scanning the list on
the list_nth call really might be darn close to as fast as using an
offset to the actual entry in the array, unless it's a pretty big list,
but I don't think we've actually got lots of multi-hundred-deep lists.

Of course, you'd have to watch out for cases where there's a lot of
deletes happening, since that would effectively make this list longer.

Anyway, just something to consider.  Having to actually store the
"delete" flag would have some cost to it also, unless you happen to
already be storing something per entry and have an extra bit to spare.

Others who are more versed in CPU magics than I am are certainly welcome
to comment on if they think this makes sense to consider, I'm no CPU
architecture guru.

> Anyway, please don't debate the usages of the new type here. As for
> all the above plans, I admit to not having a full handle on them yet.

+1.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-11-02 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Andres Freund  writes:
> > Do we care about people upgrading to unreleased versions? We could do
> > nothing, document it in the release notes, or ???
> 
> Do nothing.

Agreed.  Not much we can do there.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Remove inbound links to sql-createuser

2017-10-31 Thread Stephen Frost
David,

* Stephen Frost (sfr...@snowman.net) wrote:
> * David G. Johnston (david.g.johns...@gmail.com) wrote:
> > Since CREATE USER is officially an alias for CREATE ROLE other parts of the
> > documentation should point to CREATE ROLE, not CREATE USER.  Most do but I
> > noticed when looking at CREATE DATABASE that it did not.  Further searching
> > turned up the usage in client-auth.sgml.  That one is questionable since we
> > are indeed talking about LOGIN roles there but we are already pointing to
> > sql-alterrole instead of sql-alteruser and so changing the create variation
> > to point to sql-createrole seems warranted.
> 
> +1.
> 
> Barring objections, I'll commit this in a bit.

Pushed to master, with a small bit of word-smithing in the CREATE ROLE
docs also.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Query regarding permission on table_column%type access

2017-10-31 Thread Stephen Frost
Greetings,

* Neha Sharma (neha.sha...@enterprisedb.com) wrote:
> I have observed that even if the user does not have permission on a
> table(created in by some other user),the function parameter still can have
> a parameter of that table_column%type.

This is because the creation of the table also creates a type of the
same name and the type's permissions are independent of the table's.  I
imagine that you could REVOKE USAGE ON TYPE from the type and deny
access to that type if you wanted to.

I'm not sure that we should change the REVOKE on the table-level to also
mean to REVOKE access to the type automatically (and what happens if you
GRANT the access back for the table..?  Would we need to track that
dependency?) considering that's been the behavior for a very long time.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Adding column_constraint description in ALTER TABLE synopsis

2017-10-31 Thread Stephen Frost
Greetings,

* Lætitia Avrot (laetitia.av...@gmail.com) wrote:
> As Amit Langot pointed out, the column_constraint definition is missing
> whereas it is used in ALTER TABLE synopsis. It can be easily found in the
> CREATE TABLE synopsis, but it's not very user friendly.

Agreed.

> You will find enclosed my patch.

Thanks, this looks pretty reasonable, but did you happen to look for any
other keywords in the ALTER TABLE that should really be in ALTER TABLE
also?

I'm specifically looking at, at least, partition_bound_spec.  Maybe you
could propose an updated patch which addresses that also, and any other
cases you find?

Thanks again!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Remove inbound links to sql-createuser

2017-10-31 Thread Stephen Frost
David,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> Since CREATE USER is officially an alias for CREATE ROLE other parts of the
> documentation should point to CREATE ROLE, not CREATE USER.  Most do but I
> noticed when looking at CREATE DATABASE that it did not.  Further searching
> turned up the usage in client-auth.sgml.  That one is questionable since we
> are indeed talking about LOGIN roles there but we are already pointing to
> sql-alterrole instead of sql-alteruser and so changing the create variation
> to point to sql-createrole seems warranted.

+1.

Barring objections, I'll commit this in a bit.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-31 Thread Stephen Frost
Simon,

* Simon Riggs (si...@2ndquadrant.com) wrote:
> On 30 October 2017 at 19:55, Stephen Frost  wrote:
> > I don't think MERGE should be radically different from other database
> > systems and just syntax sugar over a capability we have.
> 
> I've proposed a SQL Standard compliant implementation that would do
> much more than be new syntax over what we already have.
> 
> So these two claims aren't accurate: "radical difference" and "syntax
> sugar over a capability we have".

Based on the discussion so far, those are the conclusions I've come to.
Saying they're isn't accurate without providing anything further isn't
likely to be helpful.

> > Time changes
> > many things, but I don't think anything's changed in this from the prior
> > discussions about it.
> 
> My proposal is new, that is what has changed.

I'm happy to admit that I've missed something in the discussion, but
from what I read the difference appeared to be primairly that you're
proposing it, and doing so a couple years later.

> At this stage, general opinions can be misleading.

I'd certainly love to see a MERGE capability that meets the standard and
works in much the same way from a user's perspective as the other RDBMS'
which already implement it.  From prior discussions with Peter on
exactly that subject, I'm also convinced that having that would be a
largely independent piece of work from the INSERT .. ON CONFLICT
capability that we have (just as MERGE is distinct from similar UPSERT
capabilities in other RDBMSs).

The goal here is really just to avoid time wasted developing MERGE based
on top of the INSERT .. ON CONFLICT system only to have it be rejected
later because multiple other committers and major contributors have said
that they don't agree with that approach.  If, given all of this
discussion, you don't feel that's a high risk with your approach then by
all means continue on with what you're thinking and we can review the
patch once it's posted.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-30 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Sun, Oct 29, 2017 at 1:19 AM, Simon Riggs  wrote:
> > Nothing I am proposing blocks later work.
> 
> That's not really true.  Nobody's going to be happy if MERGE has one
> behavior in one set of cases and an astonishingly different behavior
> in another set of cases.  If you adopt a behavior for certain cases
> that can't be extended to other cases, then you're blocking a
> general-purpose MERGE.
> 
> And, indeed, it seems that you're proposing an implementation that
> adds no new functionality, just syntax compatibility.  Do we really
> want or need two syntaxes  for the same thing in core?  I kinda think
> Peter might have the right idea here.  Under his proposal, we'd be
> getting something that is, in a way, new.

+1.

I don't think MERGE should be radically different from other database
systems and just syntax sugar over a capability we have.  The
downthread comparison to partitioning isn't accurate either.

There's a reason that we have INSERT .. ON CONFLICT and not MERGE and
it's because they aren't the same thing, as Peter's already explained,
both now and when he and I had exactly this same discussion years ago
when he was working on implementing INSERT .. ON CONFLICT.  Time changes
many things, but I don't think anything's changed in this from the prior
discussions about it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Linking libpq statically to libssl

2017-10-30 Thread Stephen Frost
Daniele,

* Daniele Varrazzo (daniele.varra...@gmail.com) wrote:
> On Fri, Oct 27, 2017 at 2:37 PM, Tom Lane  wrote:
> > Daniele Varrazzo  writes:
> >> I have a problem building binary packages for psycopg2. Binary
> >> packages ship with their own copies of libpq and libssl; however if
> >> another python package links to libssl the library will be imported
> >> twice with conflicting symbols, likely resulting in a segfault (see
> >> https://github.com/psycopg/psycopg2/issues/543). This happens e.g. if
> >> a python script both connects to postgres and opens an https resource.
> >
> > Basically, you're doing it wrong.  Shipping your own copy of libssl,
> > rather than depending on whatever packaging the platform provides,
> > is just asking for pain --- and not only of this sort.  You're also
> > now on the hook to update your package whenever libssl fixes a bug
> > or a security vulnerability, which happens depressingly often.
> >
> > The same applies to libpq, really.  You don't want to be in the
> > business of shipping bits that you are not the originator of.
> >
> > When I worked at Red Hat, there was an ironclad policy against
> > building packages that incorporated other packages statically.
> > I would imagine that other distros have similar policies for
> > similar reasons.  Just because you *can* ignore those policies
> > doesn't mean you *should*.
> 
> Distros do compile the library from source and against the system
> package, and everyone using the package directly can still do so; the
> binary packages are only installed by the Python package manager.

Which, frankly, is why everyone having their own package manager that
ignores the OS package manager is actually rather horrible.  Obviously,
it's done extensively, but it's outright horrible.

> Psycopg is more complex to install than the average Python package (it
> needs python and postgres dev files, pg_config available somewhere
> etc) and a no-dependencies package provides a much smoother
> experience. It also happens that the libpq and libssl versions used
> tend to be more up-to-date than the system one (people can already use
> the new libpq 10 features without waiting for debian packages).

Having randomly different versions of libraries installed through two
different package managers on the same system is not an improvement for
users.  Further, it's not like there aren't already properly built PG10
packages- they were there from the day PG10 was released, and users can
install them using the OS package manager and against the OS provided
version of all the various libraries.

> I am confronted with the reality of Python developers as of 201x's,
> and shipping binary packages has proven generally a positive feature,
> even factoring in the need of shipping updated binary packages when
> the need arises. The benefit of a simple-to-use library is for the
> Postgres project at large, it is not for my personal gain. So while I
> know the shortcomings of binary packages and static libraries, I would
> still be interested in knowing the best way to fix the problem in the
> subject. Feel free to write in private if you want to avoid the public
> shaming.

The way to fix the problem is to not use two different versions of the
same library in the same binary, which you aren't going to be able to
accomplish when you're forcibly pulling in a different version through
your own binary package.  At best you could try to use symbol versioning
to try and differentiate your symbols from those of the system one, but
that would depend on if the system level library is doing symbol
versioning or not and then you'd still have to figure out what to do
when the OS level package updates and possibly ends up with symbols
*and* versions conflicting.  Of course, these issues are generally (or
should be) handled already by the OS level library package managers and
it's no simple thing to do, but it's how things like multiple libdb
versions can be available and even linked into the same running binaries
and things still work there.  Basically, if you can't control both
versions you're just setting yourself (and our users) up for failure, if
not now then in the future.

In short, figure out how to have a completely different OS that's only
provided through your package manager, or get along with the packages
and versions as provided through the OS system (or provide your own
updated versions of the OS packages and get them installed that matches
what your packages are built against).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] replace GrantObjectType with ObjectType

2017-10-12 Thread Stephen Frost
Peter,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> It seems to me that having ACL_OBJECT_* symbols alongside OBJECT_*
> symbols is not useful and leads to duplication.  Digging around in the
> past suggests that we used to have a lot of these command-specific
> symbols but got rid of them over time, except that the ACL stuff was
> never touched.  The attached patch accomplishes that.

I'm generally supportive of this, but I'm not entirely thrilled with how
this ends up conflating TABLEs and RELATIONs.  From the GRANT
perspective, there's no distinction, and that was clear from the
language used and how things were handled, but the OBJECT enum has that
distinction.  This change just makes VIEWs be OBJECT_TABLE even though
they actually aren't tables and there even is an OBJECT_VIEW value.
This commit may be able to grok that and manage it properly, but later
hackers might miss that.

I would also suggest that the naming be consistent with the other bits
of the GRANT system (eg: ACL_ALL_RIGHTS_NAMESPACE would be changed to
ACL_ALL_RIGHTS_SCHEMA, to match OBJECT_SCHEMA).

I also echo the concern raised by Alvaro.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Discussion on missing optimizations

2017-10-12 Thread Stephen Frost
Laurenz,

* Laurenz Albe (laurenz.a...@cybertec.at) wrote:
> Robert Haas wrote:
> > One trick that some system use is avoid replanning as much as we do
> > by, for example, saving plans in a shared cache and reusing them even
> > in other sessions.  That's hard to do in our architecture because the
> > controlling GUCs can be different in every session and there's not
> > even any explicit labeling of which GUCs control planner behavior. But
> > if you had it, then extra planning cycles would be, perhaps, more
> > tolerable.
> 
> >From my experience with Oracle I would say that that is a can of worms.
> 
> Perhaps it really brings the performance benefits they claim, but
> a) there have been a number of bugs where the wrong plan got used
>(you have to keep several plans for the same statement around,
>since - as you say - different sessions have different environments)

I'm not sure this is really a fair strike against this concept- bugs
happen, even bugs in planning, and what we need is more testing, imv, to
reduce the number and minimize the risk as much as we can.

> b) it is a frequent problem that this shared memory area grows
>too large if the application does not use prepared statements
>but dynamic SQL with varying constants.

This just requires that the memory area be managed somehow, not unlike
how our shared buffers have to deal with evicting out old pages.
There's a challenge there around making sure that it doesn't make the
performance of the system be much worse than not having a cache at all,
naturally, but given that a lot of people use pg_stat_statements to good
effect and without much in the way of complaints, it seems like it might
be possible make it work reasonably (just imagining a generic plan being
attached to pg_stat_statements with some information about if the
generic plan works well or not, blah blah, hand waving goes here).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] postgres_fdw super user checks

2017-10-12 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Oct 5, 2017 at 1:02 PM, Jeff Janes  wrote:
> > I don't see a reason to block a directly-logged-in superuser from using a
> > mapping.  I asked in the closed list whether the current (released)
> > behavior was a security bug, and the answer was no.  And I don't know why
> > else to block superusers from doing something other than as a security bug.
> > Also it would create a backwards compatibility hazard to revoke the ability
> > now.
> 
> Well, my thought was that we ought to be consistent about whose
> authorization matters.  If we're using the view owner's credentials in
> general, then we also (defensibly, anyway) ought to use the view
> owner's superuser-ness to decide whether to enforce this restriction.
> Using either the view owner's superuser-ness or the session user's
> superuser-ness kind of puts you halfway in the middle.  The view
> owner's rights are what matters mostly, but your own rights also
> matter a little bit around the edges.  That's a little strange.
> 
> I don't have violently strong opinions about this - does anyone else
> have a view?

I haven't been following this closely, but I tend to agree with you- if
we're using the view owner's privileges then that's what everything
should be based on, not whether the caller is a superuser or not.

Consider a security-definer function.  Clearly, such a function should
always run as the owner of the function, even if the caller is a
superuser.  Running as the caller instead of the owner of the function
when the caller is a superuser because that would allow the function to
access more clearly isn't a good idea, imv.

Yes, that means that sometimes when superusers run things they get
permission denied errors.  That's always been the case, and is correct.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] On markers of changed data

2017-10-10 Thread Stephen Frost
Alvaro,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Greg Stark wrote:
> 
> > The general shape of what I would like to see is some log which lists
> > where each checkpoint starts and ends and what blocks are modified
> > since the previous checkpoint. Then to generate an incremental backup
> > from any point in time to the current you union all the block lists
> > between them and fetch those blocks. There are other ways of using
> > this aside from incremental backups on disk too -- you could imagine a
> > replica that has fallen behind requesting the block lists and then
> > fetching just those blocks instead of needing to receive and apply all
> > the wal.
> 
> Hmm, this sounds pretty clever.  And we already have the blocks touched
> by each record thanks to the work for pg_rewind (so we don't have to do
> any nasty tricks like the stuff Suzuki-san did for pg_lesslog, where
> each WAL record had to be processed individually to know what blocks it
> referenced), so it shouldn't be *too* difficult ...

Yeah, it sounds interesting, but I was just chatting w/ David about it
and we were thinking about how checkpoints are really rather often done,
so you end up with quite a few of these lists being out there.

Now, if the lists were always kept in a sorted fashion, then perhaps we
would be able to essentially merge-sort them all back together and
de-dup that way, but even then, you're talking about an awful lot if
you're looking at daily incrementals- that's 288 standard 5-minute
checkpoints, each with some 128k pages changed, assuming max_wal_size of
1GB, and I think we can all agree that the default max_wal_size is for
rather small systems.  That ends up being something around 2MB per
checkpoint to store the pages in or half a gig per day just to keep
track of the pages which changed in each checkpoint across that day.

There's a bit of hand-waving in there, but I don't think it's all that
much to reach a conclusion that this might not be the best approach.
David and I were kicking around the notion of a 'last LSN' which is kept
on a per-relation basis, but, of course, that ends up not really being
granular enough, and would likely be a source of contention unless we
could work out a way to make it "lazy" updated somehow, or similar.

> > It would also be useful for going in the reverse direction: look up
> > all the records (or just the last record) that modified a given block.
> 
> Well, a LSN map is what I was suggesting.

Not sure I entirely followed what you were getting at here..?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] On markers of changed data

2017-10-08 Thread Stephen Frost
Andrey,

* Andrey Borodin (x4...@yandex-team.ru) wrote:
> But my other question still seems unanswered: can I use LSN logic for 
> incrementing FSM and VM? Seems like most of the time there is valid LSN

I haven't gone and audited it myself, but I would certainly expect you
to be able to use the LSN for everything which is WAL'd.  If you have
cases where that's not the case, it'd be useful to see them.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] On markers of changed data

2017-10-07 Thread Stephen Frost
Alvaro, Michael,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Michael Paquier wrote:
> > That’s actually what pg_rman is doing for what it calls incremental
> > backups (perhaps that would be differential backup in PG
> > terminology?), and the performance is bad as you can imagine. We could
> > have a dedicated LSN map to do such things with 4 bytes per page. I am
> > still not convinced that this much facility and the potential bug
> > risks are worth it though, Postgres already knows about differential
> > backups if you shape it as a delta of WAL segments. I think that, in
> > order to find a LSN map more convincing, we should find first other
> > use cases where it could become useful. Some use cases may pop up with
> > VACUUM, but I have not studied the question hard enough...
> 
> The case I've discussed with barman developers is a large database
> (couple dozen of TBs should be enough) where a large fraction (say 95%)
> is read-only but there are many changes to the active part of the data,
> so that WAL is more massive than size of active data.

Yes, we've seen environments like that also.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] On markers of changed data

2017-10-06 Thread Stephen Frost
Tom, Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Oct 6, 2017 at 11:22 PM, Tom Lane  wrote:
> > Andrey Borodin  writes:
> >> Is it safe to use file modification time to track that file were changes
> >> since previous backup?
> >
> > I'd say no:
> >
> > 1. You don't know the granularity of the filesystem's timestamps, at least
> > not without making unportable assumptions.
> >
> > 2. There's no guarantee that the system clock can't be set backwards.
> >
> > 3. It's not uncommon for filesystems to have optimizations whereby they
> > skip or delay some updates of file mtimes.  (I think this is usually
> > optional, but you couldn't know whether it's turned on.)
> >
> > #2 is probably the worst of these problems.
> 
> Or upwards. A simple example of things depending on clock changes is
> for example VM snapshotting. Any logic not depending on monotonic
> timestamps, with things like clock_gettime(CLOCK_MONOTONIC) is a lot
> of fun to investigate until you know that they are not using any
> monotonic logic... So the answer is *no*, do not depend on FS-level
> timestamps. The only sane method for Postgres is really to scan the
> page header LSNs, and of course you already know that.

Really, these comments appear, at least to me, to be based on an
incorrect assumption that's only considering how tools like rsync use
mtime.

No, you can't trust rsync-based backups that look at mtime and only copy
if the mtime of the source file is currently 'more recent' than the
mtime of the destination file.

That doesn't mean that mtime can't be used to perform incremental
backups using PG, but care has to be taken when doing so to minimize the
risk of a file getting skipped that should have been copied.

There's a few things to do to minimize that risk:

Use mtime only as an indication of if the file changed from the last
time you looked at it- doesn't matter if the mtime on the file is newer
or older.  If the mtime is *different*, then you can't trust that the
contents are the same and you need to include it in the backup.  Of
course, combine this with checking the file size has changed, but in PG
there's lots of files of the same size, so that's not a terribly good
indicator.

Further, you have to get the mtime of all the files *before* you start
backing them up.  If you take the mtime of the file at the time you
start actually copying it, then it could possibly be modified while you
copy it but without the mtime being updated from when you initially
pulled it (and that's not even talking about the concerns around the
clock time moving back and forth).  To address the granularity concern,
you should also be sure to wait after you collect all the mtimes but
before actually starting the backup to the level of granularity.  Any
optimization which delays setting the mtime would, certainly, still get
around to updating the mtime before the next backup runs and therefore
that file might get copied even though it hadn't changed, but that's
still technically correct, just slightly more work.  Lastly, don't trust
any times which are from after the time that you collected the mtimes-
either during the initial backup or when you are doing the subsequent
incremental.  Any file whose mtime is different *or* is from after the
time the mtimes were collected should be copied.

This isn't to say that there isn't some risk to using mtime, there still
is- if a backup is made of a file and its mtime collected, and then time
moves backwards, and the file is modified again at the *exact* same
time, leading the 'new' mtime to be identical to the 'old' mtime while
the file's contents are different, and that file is not subsequently
modified before the next backup happens, then the file might not be
included in the backup even though it should be.  

Other risks are just blatent corruption happening in the mtime field, or
a kernel-level bug that doesn't update mtime when it should, or the
kernel somehow resetting the mtime back after the file has been changed,
or someone explicitly setting the mtime back after changing a file, or
perhaps other such attacks, though eliminating all of those risks isn't
possible (regardless of solution- someone could go change the LSN on a
page too, for example, and foil a tool which was based on that).

These are risks which I'd love to remove, but they also strike me as
quite small and ones which practical users are willing to accept for
their incremental and differential backups, though it's a reason to also
take full backups regularly.

As Alvaro notes downthread, it's also the only reasonable option
available today.  It'd be great to have a better solution, and perhaps
one which summarizes the LSNs in each file would work and be better, but
that would also only be available for PG11, at the earliest.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [COMMITTERS] pgsql: Fix freezing of a dead HOT-updated tuple

2017-10-06 Thread Stephen Frost
Robert,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Michael Paquier wrote:
> > On Fri, Oct 6, 2017 at 10:18 PM, Alvaro Herrera  
> > wrote:
> > > Wood, Dan wrote:
> > >> Yes, I’ve been testing 9.6.  I’ll try Alvaro’s patch today.
> > >>
> > >> I would prefer to focus on either latest 9X or 11dev.
> > >
> > > I tested my patch on 9.4 and 9.5 today and it seems to close the problem
> > > (with the patch, I waited 10x as many iterations as it took for the
> > > problem to occur ~10 times without the patch), but I can reproduce a
> > > problem in 9.6 with my patch installed.  There must be something new in
> > > 9.6 that is causing the problem to reappear.
> > 
> > The freeze visibility map has been introduced in 9.6... There could be
> > interactions on this side.
> 
> Ah, thanks for the tip.  I hope the authors of that can do the gruntwork
> of researching this problem there, then.  I'll go commit what I have
> now.

I don't doubt you're watching this thread too, but just to be 110% sure
that we don't end up with the November releases still having this issue,
I'm adding you to the CC on this thread as the one who did the freeze
visibility map work.  Depending on hope here is a bit too squishy for me
when we're talking about corruption issues.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Possible SSL improvements for a newcomer to tackle

2017-10-03 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Magnus Hagander  writes:
> > On Tue, Oct 3, 2017 at 6:33 AM, Tom Lane  wrote:
> >> I'm not an SSL expert, so insert appropriate grain of salt, but AIUI the
> >> question is what are you going to verify against?
> 
> > One way to do it would be to default to the "system global certificate
> > store", which is what most other SSL apps do. For example on a typical
> > debian/ubuntu, that'd be the store in /etc/ssl/certs/ca-certificates.crt.
> > Exactly where to find them would be distribution-specific though, and we
> > would need to actually add support for a second certificate store. But that
> > would probably be a useful feature in itself.
> 
> Maybe.  The impression I have is that it's very common for installations
> to use a locally-run CA to generate server and client certs.  I would not
> expect them to put such certs into /etc/ssl/certs.  But I suppose there
> might be cases where you would actually pay for a universally-valid cert
> for a DB server ...

In many larger enterprises, they actually deploy systems with their own
CA installed into the system global certificate store (possibly removing
certain other CAs from that set too, and distributing their own version
of the relevant package that maintains the CA set).

I agree with Magnus that most other SSL apps do default to the system
global cert store and it's generally what's expected.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Logging idle checkpoints

2017-10-03 Thread Stephen Frost
Greetings,

* Kyotaro HORIGUCHI (horiguchi.kyot...@lab.ntt.co.jp) wrote:
> At Tue, 3 Oct 2017 10:23:08 +0900, Michael Paquier 
>  wrote in 
> 
> > On Tue, Oct 3, 2017 at 12:01 AM, Stephen Frost  wrote:
> > > I certainly don't care for the idea of adding log messages saying we
> > > aren't doing anything just to match a count that's incorrectly claiming
> > > that checkpoints are happening when they aren't.
> > >
> > > The down-thread suggestion of keeping track of skipped checkpoints might
> > > be interesting, but I'm not entirely convinced it really is.  We have
> > > time to debate that, of course, but I don't really see how that's
> > > helpful.  At the moment, it seems like the suggestion to add that column
> > > is based on the assumption that we're going to start logging skipped
> > > checkpoints and having that column would allow us to match up the count
> > > between the new column and the "skipped checkpoint" messages in the logs
> > > and I can not help but feel that this is a ridiculous amount of effort
> > > being put into the analysis of something that *didn't* happen.
> > 
> > Being able to look at how many checkpoints are skipped can be used as
> > a tuning indicator of max_wal_size and checkpoint_timeout, or in short
> > increase them if those remain idle.
> 
> We ususally adjust the GUCs based on how often checkpoint is
> *executed* and how many of the executed checkpoints have been
> triggered by xlog progress (or with shorter interval than
> timeout). It seems enough. Counting skipped checkpoints gives
> just a rough estimate of how long the system was getting no
> substantial updates. I doubt that users get something valuable by
> counting skipped checkpoints.

Yeah, I tend to agree.  I don't really see how counting skipped
checkpoints helps to size max_wal_size or even checkpoint_timeout.  The
whole point here is that nothing is happening and if nothing is
happening then there's no real need to adjust max_wal_size or
checkpoint_timeout or, well, much of anything really..

> > Since their introduction in
> > 335feca4, m_timed_checkpoints and m_requested_checkpoints track the
> > number of checkpoint requests, not if a checkpoint has been actually
> > executed or not, I am not sure that this should be changed after 10
> > years. So, to put it in other words, wouldn't we want a way to track
> > checkpoints that are *executed*, meaning that we could increment a
> > counter after doing the skip checks in CreateRestartPoint() and
> > CreateCheckPoint()?
> 
> This sounds reasonable to me.

I agree that tracking executed checkpoints is valuable, but, and perhaps
I'm missing something, isn't that the same as tracking non-skipped
checkpoints?  I suppose we could have both, if we really feel the need,
provided that doesn't result in more work or effort being done than
simply keeping the count.  I'd hate to end up in a situation where we're
writing things out unnecessairly just to keep track of checkpoints that
were requested but ultimately skipped because there wasn't anything to
do.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] list of credits for release notes

2017-10-02 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Fri, Sep 29, 2017 at 12:00:05PM -0400, Peter Eisentraut wrote:
> > On 9/29/17 11:35, Robert Haas wrote:
> > > On Wed, Sep 27, 2017 at 8:29 PM, Michael Paquier
> > >  wrote:
> > >> Looking at this list, the first name is followed by the family name,
> > >> so there are inconsistencies with some Japanese names:
> > >> - Fujii Masao should be Masao Fujii.
> > >> - KaiGai Kohei should be Kohei Kaigai.
> > > 
> > > But his emails say Fujii Masao, not Masao Fujii.
> > > 
> > > KaiGai's case is a bit trickier, as his email name has changed over time.
> > 
> > Yes, I used the form that the person used in their emails.
> 
> How should this be handled for the Postgres 11 release notes?

Ideally, we would let the individuals choose how to be recognized in
release notes, and anywhere else we recognize them.  We have the start
of that in https://postgresql.org/account/profile but that isn't very
easily tied to things in the commit history or elsewhere, yet.  I'd
suggest that we try to improve on that by:

- Allowing anyone to include contributor information in their .Org
  account, even if they aren't listed on the Contributors page.

- Add in a field along the lines of "Name to be used publicly" and let
  them decide what they'd like, possibly even with the option to not be
  publicly listed at all.

- Add tracking of multiple email addresses into the .Org profile
  (somehow sync'd with the pglister system)

- Start including contributor email addresses in commit messages along
  with names.

I don't think we're that far off from this being possible to do in a
more formal way that minimizes the risk of getting things incorrect,
missing someone, or mis-attributing something.  This all involves mostly
work on the .Org system, which we do have some folks working on now but
is also open source and it certainly wouldn't hurt to have more people
involved, if there are others who are interested.  The place to actually
start the discussion of such changes would be -www though.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Logging idle checkpoints

2017-10-02 Thread Stephen Frost
Vik, all,

* Vik Fearing (vik.fear...@2ndquadrant.com) wrote:
> I recently had a sad because I noticed that checkpoint counts were
> increasing in pg_stat_bgwriter, but weren't accounted for in my logs
> with log_checkpoints enabled.

> After some searching, I found that it was the idle checkpoints that
> weren't being logged.  I think this is a missed trick in 6ef2eba3f57.

> Attached is a one-liner fix.  I realize how imminent we are from
> releasing v10 but I hope there is still time for such a minor issue as this.

Idle checkpoints aren't, well, really checkpoints though.  If anything,
seems like we shouldn't be including skipped checkpoints in the
pg_stat_bgwriter count because we aren't actually doing a checkpoint.

I certainly don't care for the idea of adding log messages saying we
aren't doing anything just to match a count that's incorrectly claiming
that checkpoints are happening when they aren't.

The down-thread suggestion of keeping track of skipped checkpoints might
be interesting, but I'm not entirely convinced it really is.  We have
time to debate that, of course, but I don't really see how that's
helpful.  At the moment, it seems like the suggestion to add that column
is based on the assumption that we're going to start logging skipped
checkpoints and having that column would allow us to match up the count
between the new column and the "skipped checkpoint" messages in the logs
and I can not help but feel that this is a ridiculous amount of effort
being put into the analysis of something that *didn't* happen.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] [BUGS] BUG #14825: enum type: unsafe use?

2017-09-26 Thread Stephen Frost
Tom, all,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Bruce Momjian  writes:
> > On Tue, Sep 26, 2017 at 04:07:02PM -0400, Tom Lane wrote:
> >> Any other votes out there?
> 
> > Well, I was concerned yesterday that we had a broken build farm so close
> > to release. (I got consistent regression failures.)  I think PG 11 would
> > be better for this feature change, so I support reverting this.
> 
> I'll take the blame for (most of) yesterday's failures in the v10
> branch, but they were unrelated to this patch --- they were because
> of that SIGBUS patch I messed up.  So that doesn't seem like a very
> applicable argument.  Still, it's true that this seems like the most
> consequential patch that's gone into v10 post-RC1, certainly so if
> you discount stuff that was back-patched further than v10.

I've not been following along very closely- are we sure that ripping
this out won't be worse than dealing with it in-place?  Will pulling it
out also require a post-RC1 catversion bump?

If we can pull it out without bumping catversion and with confidence
that it won't cause more problems then, as much as I hate it, I'm
inclined to say we pull it out and come back to it in v11.  I really
don't like the idea of a post-rc1 catversion bump and it doesn't seem
like there's a good solution here that doesn't involve more changes and
most likely a catversion bump.  If it was reasonably fixable with only
small/local changes and without a catversion bump then I'd be more
inclined to keep it, but I gather from the discussion that's not the
case.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Row Level Security Documentation

2017-09-26 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 26 September 2017 at 00:42, Stephen Frost  wrote:
> > That's a relatively minor point, however, and I do feel that this patch
> > is a definite improvement.  Were you thinking of just applying this for
> > v10, or back-patching all or part of it..?
> 
> I was planning on back-patching it to 9.5, taking out the parts
> relating the restrictive policies as appropriate. Currently the 9.5
> and 9.6 docs are identical, as are 10 and HEAD, and 9.5/9.6 only
> differs from 10/HEAD in a couple of places where they mention
> restrictive policies. IMO we should stick to that, making any
> improvements available in the back-branches. I was also thinking the
> same about the new summary table, although I haven't properly reviewed
> that yet.

Makes sense to me.

+1

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Row Level Security Documentation

2017-09-25 Thread Stephen Frost
Dean,

* Dean Rasheed (dean.a.rash...@gmail.com) wrote:
> On 5 August 2017 at 10:03, Fabien COELHO  wrote:
> > Patch applies cleanly, make html ok, new table looks good to me.
> 
> So I started looking at this patch, but before even considering the
> new table proposed, I think there are multiple issues that need to be
> resolved with the current docs:
> 
> Firstly, there are 4 separate places in the current CREATE POLICY docs
> that say that multiple policies are combined using OR, which is of
> course no longer correct, since PG10 added RESTRICTIVE policy support.

Ah, indeed, you're right there, those should have been updated to
indicate that it was the default or perhaps clarify the difference, but
I agree that doing so all over the place isn't ideal.

> In fact, it wasn't even strictly correct in 9.5/9.6, because if
> multiple different types of policy apply to a single command (e.g.
> SELECT and UPDATE policies, being applied to an UPDATE command), then
> they are combined using AND. Rather than trying to make this correct
> in 4 different places, I think there should be a single new section of
> the docs that explains how multiple policies are combined. This will
> also reduce the number of places where the pre- and post-v10 docs
> differ, making them easier to maintain.

Agreed, that makes a lot of sense to me.

> The proposed patch distinguishes between UPDATE/DELETE commands that
> have WHERE or RETURNING clauses, and those that don't, claiming that
> the former require SELECT permissions and the latter don't. That's
> based on a couple of statements from the current docs, but it's not
> entirely accurate. For example, "UPDATE foo SET a=NULL WHERE true
> RETURNING now()" does not require SELECT permissions despite having
> both WHERE and RETURNING clauses (OK, I admit that's a rather
> contrived example, but still...), and conversely (a more realistic
> example) "UPDATE foo SET a=b+c" does require SELECT permissions
> despite not having WHERE or RETURNING clauses. I think we need to try
> to be more precise about when SELECT permissions are required.

Agreed.

> In the notes section, there is a note about there needing to be at
> least one permissive policy for restrictive policies to take effect.
> That's well worth pointing out, however, I fear that this note is
> buried so far down the page (amongst some other very wordy notes) that
> readers will miss it. I suggest moving it to the parameters section,
> where permissive and restrictive policies are first introduced,
> because it's a pretty crucial fact to be aware of when using these new
> types of policy.

Ok.

> Attached is a proposed patch to address these issues, along with a few
> other minor tidy-ups.

I've taken a look through this and generally agree with it.  I'll note
that the bits inside  ...  tags should be
consistently capitalized (you had one case of 'All' that I noticed) and
I wonder if it'd be better to have the "simple" description *first*
instead of later on, eg, where you have "Thus the overall result is
that" we might want to try and reword things to decribe it as "Overall,
it works thusly, ..., and the specifics are ...".

That's a relatively minor point, however, and I do feel that this patch
is a definite improvement.  Were you thinking of just applying this for
v10, or back-patching all or part of it..?  For my 2c, at least, I'm
pretty open to clarifying things in the back-branches (and we have
technically had restrictive policies for a while, they just required
using an extension, so even those pieces are relevant for older
versions, but might need additional caveats...).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reading backup label file for checkpoint and redo location during crash recovery

2017-09-25 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Mon, Sep 25, 2017 at 7:43 PM, Stephen Frost  wrote:
> > * Satyanarayana Narlapuram (satyanarayana.narlapu...@microsoft.com) wrote:
> > > During crash recovery, last checkpoint record information is obtained
> > from the backup label if present, instead of getting it from the control
> > file. This behavior is causing PostgreSQL database cluster not to come up
> > until the backup label file is deleted (as the error message says).
> > >
> > > if (checkPoint.redo < checkPointLoc)
> > >   {
> > >  if (!ReadRecord(xlogreader,
> > checkPoint.redo, LOG, false))
> > > ereport(FATAL,
> > >   (errmsg("could not
> > find redo location referenced by checkpoint record"),
> > >   errhint("If you are
> > not restoring from a backup, try removing the file \"%s/backup_label\".",
> > DataDir)));
> > >   }
> > >
> > > If we are recovering from a dump file, reading from the backup label
> > files makes sense as the control file could be archived after a few
> > checkpoints. But this is not the case for crash recovery, and is always
> > safe to read the checkpoint record information from the control file.
> > > Is this behavior kept this way as there is no clear way to distinguish
> > between the recovery from the dump and the regular crash recovery?
> >
> > This is why the exclusive backup method has been deprecated in PG10 in
> > favor of the non-exclusive backup method, which avoids this by not
> > creating a backup label file (it's up to the backup software to store
> > the necessary information and create the file for use during recovery).
> 
> Actally, it was deprecated already in 9.6, not just 10.

Whoops, right.  Thanks for the clarification. :)

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Built-in plugin for logical decoding output

2017-09-25 Thread Stephen Frost
Andres, all,

* Andres Freund (and...@anarazel.de) wrote:
> On 2017-09-25 19:32:29 +0200, Petr Jelinek wrote:
> > On 25/09/17 19:26, Tom Lane wrote:
> > > Alvaro Hernandez  writes:
> > >> In my opinion, logical decoding plugins that don't come with core 
> > >> are close to worthless (don't get me wrong):
> > > 
> > >> - They very unlikely will be installed in managed environments (an area 
> > >> growing significantly).
> > >> - As anything that is not in core, raises concerns by users.
> > >> - Distribution and testing are non-trivial: many OS/archs combinations.
> > > 
> > > The problem with this type of argument is that it leads directly to the
> > > conclusion that every feature users want must be in core.  We can't
> > > accept that conclusion, because we simply do not have the manpower or
> > > infrastructure to maintain a kitchen-sink, Oracle-sized code base.
> > > I think we're already more or less at the limit of the feature set we can
> > > deal with :-(.  The entire point of the output-plugin design was to allow
> > > useful replication stuff to be done outside of core; we need to make use
> > > of that.  (If that means we need better docs, then yeah, we'd better get
> > > that part done.)
> 
> I obvoiusly agree that not every possible output plugin should be put
> into core. A number of them have significant external dependencies
> and/or are specialized for a certain environment.
> 
> However I don't think that should mean that there's no possible output
> plugin that could/should be integrated into core. And I think Petr's
> right here:
> 
> > There is already about 3 million output plugins out there so I think we
> > did reasonable job there. The fact that vast majority of that are
> > various json ones gives reasonable hint that we should have that one in
> > core though.

I tend to agree with Andres & Petr here as well.  No, we certainly don't
want to add features that aren't going to be used much to core or which
stretch the resources we have beyond what we're able to handle, but
having a good, solid, output plugin that works well and can be built
upon should be included into core.  PG is often deployed in complex
ecosystems where we need to work with other systems and this is an
important part of that.

Also, to some extent, I'm hopeful that being both open to new features,
when they make sense, and providing more ways for other systems to
work with PG, will lead to more contributions and hopefully regular
contributors who can help us maintain the code base as it continues to
grow.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Reading backup label file for checkpoint and redo location during crash recovery

2017-09-25 Thread Stephen Frost
Greetings Satya,

* Satyanarayana Narlapuram (satyanarayana.narlapu...@microsoft.com) wrote:
> During crash recovery, last checkpoint record information is obtained from 
> the backup label if present, instead of getting it from the control file. 
> This behavior is causing PostgreSQL database cluster not to come up until the 
> backup label file is deleted (as the error message says).
> 
> if (checkPoint.redo < checkPointLoc)
>   {
>  if (!ReadRecord(xlogreader, checkPoint.redo, 
> LOG, false))
> ereport(FATAL,
>   (errmsg("could not find 
> redo location referenced by checkpoint record"),
>   errhint("If you are not 
> restoring from a backup, try removing the file \"%s/backup_label\".", 
> DataDir)));
>   }
> 
> If we are recovering from a dump file, reading from the backup label files 
> makes sense as the control file could be archived after a few checkpoints. 
> But this is not the case for crash recovery, and is always safe to read the 
> checkpoint record information from the control file.
> Is this behavior kept this way as there is no clear way to distinguish 
> between the recovery from the dump and the regular crash recovery?

This is why the exclusive backup method has been deprecated in PG10 in
favor of the non-exclusive backup method, which avoids this by not
creating a backup label file (it's up to the backup software to store
the necessary information and create the file for use during recovery).

Please see:

https://www.postgresql.org/docs/10/static/continuous-archiving.html

In particular, section 25.3.3.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [GENERAL] [HACKERS] USER Profiles for PostgreSQL

2017-09-20 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Tue, Sep 19, 2017 at 01:28:11PM -0400, Stephen Frost wrote:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > > chiru r  writes:
> > > > We are looking  for User profiles in ope source PostgreSQL.
> > > > For example, If a  user password failed n+ times while login ,the user
> > > > access has to be blocked few seconds.
> > > > Please let us know, is there any plan to implement user profiles in 
> > > > feature
> > > > releases?.
> > > 
> > > Not particularly.  You can do that sort of thing already via PAM,
> > > for example.
> > 
> > Ugh, hardly and it's hokey and a huge pain to do, and only works on
> > platforms that have PAM.
> > 
> > Better is to use an external authentication system (Kerberos, for
> > example) which can deal with this, but I do think this is also something
> > we should be considering for core, especially now that we've got a
> > reasonable password-based authentication method with SCRAM.
> 
> Does LDAP do this too?

Active Directory does this, with Kerberos as the authentication
mechanism.  Straight LDAP might also support it, but I wouldn't
recommend it because it's really insecure as the PG server will see the
user's password in the cleartext (and it may be sent in cleartext across
the network too unless careful steps are taken to make sure that the
client only ever connects over SSL to a known trusted and verified
server).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [GENERAL] [HACKERS] USER Profiles for PostgreSQL

2017-09-19 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> chiru r  writes:
> > We are looking  for User profiles in ope source PostgreSQL.
> > For example, If a  user password failed n+ times while login ,the user
> > access has to be blocked few seconds.
> > Please let us know, is there any plan to implement user profiles in feature
> > releases?.
> 
> Not particularly.  You can do that sort of thing already via PAM,
> for example.

Ugh, hardly and it's hokey and a huge pain to do, and only works on
platforms that have PAM.

Better is to use an external authentication system (Kerberos, for
example) which can deal with this, but I do think this is also something
we should be considering for core, especially now that we've got a
reasonable password-based authentication method with SCRAM.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Trouble with amcheck

2017-09-14 Thread Stephen Frost
Andres,

* Andres Freund (and...@anarazel.de) wrote:
> On 2017-09-15 02:22:49 +, Douglas Doole wrote:
> > Thanks all. Making and installing the contribs got me rolling again. (I
> > tried "make world" but ran into trouble with the XML docs. But that's pain
> > and suffering for another day.)
> > 
> > I'd agree that "make installcheck-world" should imply that all prereqs are
> > met - that's certainsly the normal behaviour for make.
> 
> I'm very unconvinced by this, given that one use of installcheck is to
> run against an existing server. For which one might not even have access
> to the relevant directories to install extensions into.

Sure, but if the extensions aren't in place and you're trying to run
make installcheck-world, it's not like it's somehow going to succeed.

Failing earlier on the install seems like a reasonable thing to do
rather than failing later halfway through the check process.

Now, that said, perhaps a bit more smarts would be in order here to,
instead, check that the extensions are available before trying to run
the checks for them.  I'm thinking about something like this: check if
the extension is available and, if not, skip the check of that module,
with a warning or notification that it was skipped because it wasn't
available.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-14 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Sep 15, 2017 at 10:21 AM, Stephen Frost  wrote:
> > No, one of the baseline requirements of pg_upgrade is to *not* screw
> > with the existing cluster.  Removing its WAL or "cleaning it up"
> > definitely seems like it's violating that principle.
> 
> Not necessarily. Using --link it is game over for rollback once the
> new cluster has started. 

Yes, but not *before* the new cluster has started.

> My reference to clean up the contents of
> pg_xlog refers to the moment the new cluster has been started. 

Alright, that's technically beyond the scope of pg_upgrade then...

> This
> could be achieved with a pre-upgrade flag present on-disk that the
> startup process looks at to perform actions needed. This flag of
> course needs to depend on the version of the binary used to start
> Postgres, and is written by pg_upgrade itself which links the new
> cluster's pg_wal into the location of the old cluster. In short, if an
> upgrade to PG version N is done, and that the flag related to the
> cleanup of N is found, then actions happen. If Postgres is started
> with a binary version N-1, nothing happens, and existing flags are
> cleaned up. What I am not sure is if such extra machinery is worth
> doing as things can be saved by just moving the soft link position of
> the cluster after running pg_upgrade and before starting the new
> cluster.

Ugh.  That strikes me as far more complication than would be good for
this..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Trouble with amcheck

2017-09-14 Thread Stephen Frost
Peter, Douglas,

* Peter Geoghegan (p...@bowt.ie) wrote:
> On Thu, Sep 14, 2017 at 5:03 PM, Douglas Doole  wrote:
> > I just cloned PostgreSQL to a new machine today (Ubuntu 17.04). "make
> > install" and "make check-world" run fine but "make installcheck-world" is
> > having trouble with amcheck:
> >
> > In contrib/amcheck/results:
> >
> > CREATE EXTENSION amcheck;
> > ERROR:  could not open extension control file
> > "/home/doole/pgCommunity/install/share/postgresql/extension/amcheck.control":
> > No such file or directory
> >
> > I expect I'm missing something in the machine set up, but I'm stumped as to
> > what.
> 
> I think you need to build and install contrib, so that it is available
> to the server that you're running an installcheck against. amcheck is
> alphabetically first among contrib modules that have tests, IIRC.

Yes, I was working with someone earlier today who ran into exactly the
same issue.  If you don't 'make world' or make the individual contrib
modules, then 'make installcheck-world' isn't going to work.

I do think it'd be nice if we could provide a better error message in
such a case..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Process startup infrastructure is a mess

2017-09-14 Thread Stephen Frost
Andres, Simon,

* Andres Freund (and...@anarazel.de) wrote:
> On 2017-09-15 01:06:54 +0100, Simon Riggs wrote:
> > If we add something to an area then its a good time to refactor it
> > since we were going to get bugs anyway.
> 
> We've added something to the area on a regular basis. As in last in
> v10. The fundamental problem with your argument is that that makes it
> impossible for most contributors to get feature in that touch these
> parts of the code - many neither have the time nor the knowledge to do
> such a refactoring. By your argument we should have rejected logical
> replication for v10 because it complicated this further, without
> cleaning it up.
> 
> Besides that, it also makes it harder to develop new features, not just
> to get them in.

I'm definitely in agreement with Andres on this one.  This isn't
refactoring of little-to-never changed code, it's refactoring bits of
the system which are changed with some regularity and looks likely to
continue to need change as we add more features moving forward, and
perhaps add greater controls over process startup.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-14 Thread Stephen Frost
Michael, all,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Fri, Sep 15, 2017 at 8:23 AM, Andreas Joseph Krogh
>  wrote:
> > I tested upgrading from 9.6 to 10 now, using pg_upgrade, and pg_upgrade
> > creates the new data-dir with pg_wal "in it" (just like regular initdb), so
> > pg_upgrade seems not to care about where the old version's pg_xlog was. You
> > have to move (by symlinking) pg_wal to a separate location manually *after*
> > running pg_upgrade on the master.
> 
> That's true, this should definitely be mentioned in the documentation.

Uh, this seems like something that should be *fixed*, not documented.
That initdb accepts an alternative location for pg_xlog/pg_wal now
raises that to a first-class feature, in my view, and other tools should
recognize the case and deal with it correctly.

Of course, that having been said, there's some technical challenges
there.  One being that we don't really want to mess with the old
cluster's WAL during pg_upgrade.  Frustratingly, we have a way to deal
with that case today for tablespaces, it was just never applied to WAL
when we started allowing WAL to be stored elsewhere (formally).  That
seems like it was a mistake to me.

Then again, the more I think about this, the more I wonder about putting
more-or-less everything in PGDATA into per-catalog-version directories
and making everything self-contained.  Part of the ugly bit with the
rsync is exactly that we have to go up an extra level for the data
directories themselves, and users can actually put them anywhere so
there might not even *be* a common parent directory to use.

> An improvement could be done as well here for pg_upgrade: when using
> --link, the new PGDATA created could consider as well the source
> pg_wal and create a link to it, and then clean up its contents. I am
> not completely sure if this would be worth doing as people are likely
> used to the current flow though. The documentation needs to outline
> the matter at least.

No, one of the baseline requirements of pg_upgrade is to *not* screw
with the existing cluster.  Removing its WAL or "cleaning it up"
definitely seems like it's violating that principle.

I tend to agree that it'd be good for the documentation to address this,
but this is all really getting to be a bit much for a manpage to be able
to handle, I think..

Thaks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Re: [COMMITTERS] pgsql: Use MINVALUE/MAXVALUE instead of UNBOUNDED for range partition b

2017-09-14 Thread Stephen Frost
Robert, all,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Wed, Sep 13, 2017 at 10:49 PM, Noah Misch  wrote:
> >> > Both Oracle and MySQL allow finite values after MAXVALUE (usually
> >> > listed as "0" in code examples, e.g. see [1]). Oracle explicitly
> >> > documents the fact that values after MAXVALUE are irrelevant in [1].
> >> > I'm not sure if MySQL explicitly documents that, but it does behave
> >> > the same.
> >> >
> >> > Also, both Oracle and MySQL store what the user entered (they do not
> >> > canonicalise), as can be seen by looking at ALL_TAB_PARTITIONS in
> >> > Oracle, or "show create table" in MySQL.
> >>
> >> OK, thanks.  I still don't really like allowing this, but I can see
> >> that compatibility with other systems has some value here, and if
> >> nobody else is rejecting these cases, maybe we shouldn't either.  So
> >> I'll hold my nose and change my vote to canonicalizing rather than
> >> rejecting outright.
> >
> > I vote for rejecting it.  DDL compatibility is less valuable than other
> > compatibility.  The hypothetical affected application can change its DDL to
> > placate PostgreSQL and use that modified DDL for all other databases, too.
> 
> OK.  Any other votes?

I think I side with Noah on this one.  I agree that DDL compatibility is
less valuable than other compatibility.  I had also been thinking that
perhaps it would be good to still be compatible for the sake of DBAs not
being confused if they got an error, but this seems straight-forward
enough that it wouldn't take much for a DBA who is building such
partitions to understand their mistake and to fix it.

I haven't been as close to this as others, so perhaps my vote is only
0.5 on this specific case, but that's my feeling on it.

Also, I don't think we should be adding compatibility for the sake of
compatibility alone.  If there's more than one way to do something and
they're all correct and reasonable, then I could see us choosing the
route that matches what others in the industry do, but I don't see
simply ignoring user input in this specific case as really correct and
therefore it's better to reject it.

Basically, for my 2c, the reason Oracle does this is something more of a
historical artifact than because it was deemed sensible, and everyone
else just followed suit, but I don't believe we need to or should.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-13 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> I have applied the attached patch to show examples of using rsync on
> PGDATA and tablespaces, documented that rsync is only useful when in
> link mode, and explained more clearly how rsync handles links.  You can
> see the results here:
> 
>   http://momjian.us/pgsql_docs/pgupgrade.html
> 
> Any more improvements?

First off, I'd strongly suggest that we make "Step 1" in the pg_upgrade
process be "take a full backup and verify that you're able to restore it
successfully and without corruption."

I don't particularly care for how this seems to imply that the Rsync
method is "the" method to use when --link mode is used with pg_upgrade.

I'd reword the section title to be along these lines:

If you have streaming replicas or log-shipping standby servers then they
will also need to be updated.  The simplest way to accomplish this is to
simply rebuild the replicas from scratch once the primary is back
online.  Unfortunately, that can take a long time for larger systems as
the data has to be copied from the primary to each replica in the
environment.  If --link mode was used with pg_upgrade, the Latest
checkpoint location matches between the primary and the replica(s) (as
discussed in Step 8), the rsync utility is available, and the existing
data directory and new data directory on the replica are able to be in a
common directory on the same filesystem (as is required on the primary
for --link mode to be used), then an alternative method may be used to
update the replicas using rsync which will generally require much less
time.

Note that this method will end up needlessly copying across temporary
files and unlogged tables.  If these make up a large portion of your
database, then rebuilding the replicas from scratch may be a better
option.

With this method, you will not be running pg_upgrade on the standby
servers, but rather rsync on the primary to sync the replicas to match
the results of the pg_upgrade on the primary.  Do not start any servers
yet.  If you did not use link mode, skip the instructions in this
section and simply recreate the standby servers.

This method requires that the *old* data directory on the replica be in
place as rsync will be creating a hard-link tree between the old data
files on the replica and the new data directory on the replica (as was
done by pg_upgrade on the primary).

a. Install the new PostgreSQL binaries on standby servers.

...

b. Make sure the new standby data directories do not exist

If initdb was run on the replica to create a new data directory, remove
that new data directory (the rsync will recreate it).  Do *not* remove
the existing old data directory.

c. Install custom shared object files

 ** I would probably move this up to be step 'b' instead, and make step
 'b' be step 'c' instead.

d. Stop standby servers

...

*new*
e. Verify/re-verify that Latest checkpoint location in pg_controldata
   on the replica matches that of the primary (from before the primary
   was upgraded with pg_upgrade).

f. Save configuration files

  ** this should have a caveat that it's only necessary if the config
  files are in the data directory.

g. Run rsync

  ** I am having a hard time figuring out why --delete makes sense here.
  There shouldn't be anything in the new data directory, and we don't
  actually need to delete anything in the old data directory on the
  replica, so what are we doing suggesting --delete be used?  Strikes me
  as unnecessairly adding risk, should someone end up doing the wrong
  command.  Also, again, if I was doing this, I'd absolutely run rsync
  with --dry-run for starters and review what it is going to do and make
  sure that's consistent with what I'd expect.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-09-13 Thread Stephen Frost
Tom, all,

* Stephen Frost (sfr...@snowman.net) wrote:
> Alright, here's an updated patch which cleans things up a bit and adds
> comments to explain what's going on.  I also updated the comments in
> acl.h to explain that ordering actually does matter.

Getting back to this, here's rebased patches for master/v10 and 9.6
(which only had whitespace differences, probably pgindent to blame
there..).

I'm going to push these later today unless there's other comments on it.

Thanks!

Stephen
From ac86eb5451492bcb72f25cceab4ae467716ea073 Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 3 Aug 2017 21:23:37 -0400
Subject: [PATCH] Fix ordering in pg_dump of GRANTs

The order in which GRANTs are output is important as GRANTs which have
been GRANT'd by individuals via WITH GRANT OPTION GRANTs have to come
after the GRANT which included the WITH GRANT OPTION.  This happens
naturally in the backend during normal operation as we only change
existing ACLs in-place, only add new ACLs to the end, and when removing
an ACL we remove any which depend on it also.

Also, adjust the comments in acl.h to make this clear.

Unfortunately, the updates to pg_dump to handle initial privileges
involved pulling apart ACLs and then combining them back together and
could end up putting them back together in an invalid order, leading to
dumps which wouldn't restore.

Fix this by adjusting the queries used by pg_dump to ensure that the
ACLs are rebuilt in the same order in which they were originally.

Back-patch to 9.6 where the changes for initial privileges were done.
---
 src/bin/pg_dump/dumputils.c | 51 -
 src/include/utils/acl.h | 14 ++---
 2 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index dfc611848b..e4c95feb63 100644
--- a/src/bin/pg_dump/dumputils.c
+++ b/src/bin/pg_dump/dumputils.c
@@ -722,21 +722,36 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	 * We always perform this delta on all ACLs and expect that by the time
 	 * these are run the initial privileges will be in place, even in a binary
 	 * upgrade situation (see below).
+	 *
+	 * Finally, the order in which privileges are in the ACL string (the order
+	 * they been GRANT'd in, which the backend maintains) must be preserved to
+	 * ensure that GRANTs WITH GRANT OPTION and subsequent GRANTs based on
+	 * those are dumped in the correct order.
 	 */
-	printfPQExpBuffer(acl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(acl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+	  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS perm(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+	  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "AS init(init_acl) WHERE acl = init_acl)) as foo)",
 	  acl_column,
 	  obj_kind,
 	  acl_owner,
 	  obj_kind,
 	  acl_owner);
 
-	printfPQExpBuffer(racl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(racl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+	  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS initp(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+	  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "AS permp(orig_acl) WHERE acl = orig_acl)) as foo)",
 	  obj_kind,
 	  acl_owner,
 	  acl_column,
@@ -761,19 +776,25 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	{
 		printfPQExpBuffer(init_acl_subquery,
 		  "CASE WHEN privtype = 'e' THEN "
-		  "(SELECT pg_catalog.array_agg(acl) FROM "
-		  "(SELECT pg_catalog.unnest(pip.initprivs) AS acl "
-		  "EXCEPT "
-		  "SELECT pg_catalog.unnest(pg_catalog.acldefault(%s,%s))) as foo) END",
+		  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+		  "(SELECT acl, row_n FROM pg_catalog.unnest(pip.in

Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-12 Thread Stephen Frost
Andreas,

* Andreas Joseph Krogh (andr...@visena.com) wrote:
> I have to ask; Why not run pg_upgrade on standby, after verifying that it's 
> in 
> sync with primary and promoting it to primary if necessary and then making it 
> standby again after pg_upgrade is finished?

I don't think that we could be guaranteed that the catalog tables would
be the same on the replica as on the primary if they were actually
created by pg_upgrade.

The catalog tables *must* be identical between the primary and the
replica because they are updated subsequently through WAL replay, not
through SQL commands (which is how pg_upgrade creates them in the first
place).

Perhaps we could have some mode for pg_upgrade where it handles the
update to replicas (with the additional checks that I outlined and using
the methodology discussed for rsync --hard-links), but that would still
require solving the communicate-over-the-network problem between the
primary and the replicas, which is the hard part.  Whether it's an
independent utility or something built into pg_upgrade isn't really that
big of a distinction, though it doesn't seem to me like there'd be much
code reuse there.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Clarification in pg10's pgupgrade.html step 10 (upgrading standby servers)

2017-09-12 Thread Stephen Frost
ull rsync command, e.g.:
> > 
> >   rsync --archive --delete --hard-links --size-only \
> >       /opt/PostgreSQL/9.5 /opt/PostgreSQL/9.6 standby:/opt/PostgreSQL
> > 
> > Does that help?
> > 
> >  
> >  
> > It seems some non-obvious assumptions (to me at least) are made here.
> > This example seems only valid when using pg_upgrade --link, correct? If so 
> > it
> > would be clearer to the reader if explicitly stated.
> 
> Well, as I stated above, --hard-links is only going to recreate hard
> links on the standby that exist on the primary, and if you didn't use
> pg_upgrade's --link mode, there will be none, so it is harmless if
> pg_upgrade --link mode was not used.

The rsync will recreate the hard links *and* copy the new catalog data
files over from the upgraded primary.  It will specifically *not* copy
over into the new cluster anything from the old data dir, and that's
important.

I agree that --hard-links should be harmless if you're not using --link,
but as I say above, this approach doesn't really make sense if you're
not using --link and it can clearly be confusing to people to not have
this method caveated in that way.

> > 1. Why do you have to rsync both /opt/PostgreSQL/9.5 AND 
> > /opt/PostgreSQL/9.6,
> > wouldn't /opt/PostgreSQL/9.6 suffice? Or does this assume "pg_upgrade 
> > --link"
> > AND "rsync --hard-links" and therefore it somewhat needs to transfer less 
> > data?
> 
> As I stated above, rsync has to see _both_ hard links on the primary to
> recreate them on the standby.  I thought the doc patch was clear on
> that, but obviously not.  :-(  Suggestions?  (Yes, I admit that using
> rsync in this way is super-crafty, and I would _love_ to take credit for
> the idea, but I think the award goes to Stephen Frost.)

Indeed, this is a method I've used previously, with good success, to
speed up getting a replica back online following a pg_upgrade.  There's
some additional caveats on it that we haven't even discussed yet here:

Unlogged tables will end up getting copied by this rsync.  That's not
the end of the world and won't harm anything, afaik, but having all the
unlogged data copied to the replicas ends up using space on the replicas
unnecessairly and will make the transfer of data take longer as well.

> > 2. What would the rsync command look like if pg_upgrade wasn't issued with
> > --link?
> 
> It would look like:
> 
>   rsync --archive /opt/PostgreSQL/9.6 standby:/opt/PostgreSQL/9.6

Right.

> but effectively there isn't anything _in_ standby:/opt/PostgreSQL/9.6,
> so you are really just using rsync as cp, and frankly I have found 'cp'
> is faster than rsync when nothing exists on the other side so it really
> becomes "just copy the cluster when the server is down", but I don't
> think people even need instructions for that.

Well, the above rsync would go over the network whereas a traditional
'cp' won't.

I tend to agree that we don't really need a lot of documentation around
"copy the resulting cluster to the replica while the server is down",
but that then goes against your argument above that this approach is
good for both --link and without --link.

> Maybe we should recommend rsync only for pg_upgrade --link mode?

Yes, I think we should.

Further, really, I think we should provide a utility to do all of the
above instead of using rsync- and that utility should do some additional
things, such as:

- Check that the control file on the primary and replica show that they
  reached the same point prior to the pg_upgrade.  If they didn't, then
  things could go badly as there's unplayed WAL that the primary got
  through and the replica didn't.

- Not copy over unlogged data, or any other information that shouldn't
  be copied across.

- Allow the directory structures to be more different between the
  primary and the replica than rsync allows (wouldn't have to have a
  common subdirectory on the replica).

- Perhaps other validation checks or similar.

Unfortunately, this is a bit annoying as it necessairly involves running
things on both the primary and the replica from the same tool, without
access to PG, meaning we'd have to work through something else (such as
SSH, like rsync does, but then what would we do for Windows...?).

> > 3. What if the directory-layout isn't the same on primary and standby, ie.
> > tablespaces are located differently?
> 
> The way we reconfigured the location of tablespaces in PG 9.0 is that
> each major version of Postgres places its tablespace in a subdirectory
> of the tablespace directory, so there is tbldir/9.5 and tbldir/9.6.  If
> your tbldir is diffe

Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump

2017-09-10 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> As there begins to be many switches of this kind and much code
> duplication, I think that some refactoring into a more generic switch
> infrastructure would be nicer.

I have been thinking about this also and agree that it would be nicer,
but then these options would just become "shorthand" for the generic
switch.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Not listed as committer

2017-08-25 Thread Stephen Frost
* Magnus Hagander (mag...@hagander.net) wrote:
> On Fri, Aug 25, 2017 at 3:47 PM, Michael Meskes 
> wrote:
> > I just committed a patch that is listed in the CF and had to put
> > somebody (I chose you Bruce :)) in as a committer because I was not
> > listed. Do I have to register somewhere?
> >
> 
> Ha, that's interesting.
> 
> Should be fixed now, please try again.

Almost certainly because he hadn't logged into the commitfest app at the
time that the initial set of committers were selected, so he didn't have
an account on the CF system yet.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-24 Thread Stephen Frost
* Noah Misch (n...@leadboat.com) wrote:
> On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> > I've tested the new \gx against 10beta and current git HEAD. Actually one 
> > of my favourite features of PostgreSQL 10! However in my environment it was 
> > behaving strangely. After some debugging I found that \gx does not work if 
> > you have \set FETCH_COUNT n before. Please find attached a patch that fixes 
> > this incl. new regression test.

Fixed in 0cdc3e4.

Thanks for the report!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-23 Thread Stephen Frost
* Noah Misch (n...@leadboat.com) wrote:
> On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> > I've tested the new \gx against 10beta and current git HEAD. Actually one 
> > of my favourite features of PostgreSQL 10! However in my environment it was 
> > behaving strangely. After some debugging I found that \gx does not work if 
> > you have \set FETCH_COUNT n before. Please find attached a patch that fixes 
> > this incl. new regression test.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Stephen,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.

Just to preempt the coming nagging email, yes, I'm aware I didn't get to
finish this today (though at least today, unlike yesterday, I looked at
it and started to play with it..  I blame the celestial bodies that
caused an impressive traffic jam on Monday night causing me to lose
basically all of Tuesday too..).

I'll provide an update tomorrow.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Updating line length guidelines

2017-08-22 Thread Stephen Frost
Craig, all,

* Craig Ringer (cr...@2ndquadrant.com) wrote:
> On 21 August 2017 at 10:36, Michael Paquier 
> wrote:
> > On Mon, Aug 21, 2017 at 11:30 AM, Robert Haas 
> > wrote:
> > > On Sun, Aug 20, 2017 at 10:49 AM, Andres Freund 
> > wrote:
> > >> We currently still have the guideline that code should fit into an 80
> > >> character window. But an increasing amount of the code, and code
> > >> submissions, don't adhere to that (e.g. copy.c, which triggered me to
> > >> write this email). And I mean outside of accepted "exceptions" like
> > >> error messages.  And there's less need for such a relatively tight limit
> > >> these days.  Perhaps we should up the guideline to 90 or 100 chars?
> > >
> > > Or maybe we should go the other way and get a little more rigorous
> > > about enforcing that limit.  I realize 80 has nothing on its side but
> > > tradition, but I'm a traditionalist -- and I still do use 80 character
> > > windows a lot of the time.
> >
> > +1. FWIW, I find the non-truncation of some error messages a bit
> > annoying when reading them. And having a 80-character makes things
> > readable. For long URLs this enforcement makes little sense as those
> > strings cannot be split, but more effort could be done.
> > 
> 
> The main argument for not wrapping error messages is grep-ableness.

One option for this would be to move the long error messages into one
place and have a placeholder instead that's actually in-line with the
code, allowing one to grep for the message to pull the placeholder and
then it's a pretty quick cscope to find where it's used (or another
grep, I suppose).

> Personally I'd be fine with 100 or so, but when I'm using buffers side by
> side, or when I'm working in poor conditions where I've set my terminal to
> "giant old people text" sizes, I remember the advantages of a width limit.

I wouldn't be against 100 either really, but I don't really feel all
that strongly either way.  Then again, there is the back-patching pain
which would ensue to consider..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-19 Thread Stephen Frost
Noah, all,

* Noah Misch (n...@leadboat.com) wrote:
> On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> > I've tested the new \gx against 10beta and current git HEAD. Actually one 
> > of my favourite features of PostgreSQL 10! However in my environment it was 
> > behaving strangely. After some debugging I found that \gx does not work if 
> > you have \set FETCH_COUNT n before. Please find attached a patch that fixes 
> > this incl. new regression test.
> 
> [Action required within three days.  This is a generic notification.]

I'll address this on Tuesday, 8/22.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] One-shot expanded output in psql using \gx

2017-08-16 Thread Stephen Frost
Noah, all,

On Wed, Aug 16, 2017 at 22:24 Noah Misch  wrote:

> On Tue, Aug 15, 2017 at 10:24:34PM +0200, Tobias Bussmann wrote:
> > I've tested the new \gx against 10beta and current git HEAD. Actually
> one of my favourite features of PostgreSQL 10! However in my environment it
> was behaving strangely. After some debugging I found that \gx does not work
> if you have \set FETCH_COUNT n before. Please find attached a patch that
> fixes this incl. new regression test.
>
> [Action required within three days.  This is a generic notification.]
>
> The above-described topic is currently a PostgreSQL 10 open item.  Stephen,
> since you committed the patch believed to have created it, you own this
> open
> item.  If some other commit is more relevant or if this does not belong as
> a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days
> of
> this message.  Include a date for your subsequent status update.  Testers
> may
> discover new open items at any time, and I want to plan to get them all
> fixed
> well in advance of shipping v10.  Consequently, I will appreciate your
> efforts
> toward speedy resolution.  Thanks.


I'm in Boston this week but should be able to address this no later than
Monday, August 21st,  and I will provide an update then.

Thanks!

Stephen


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-14 Thread Stephen Frost
Robert, Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Sun, Aug 6, 2017 at 3:22 AM, Robert Haas  wrote:
> > All of (1)-(3) are legitimate user choices, although not everyone will
> > make them.  (4) is unfortunately the procedure recommended by our
> > documentation, which is where the problem comes in.  I think it's
> > pretty lame that the documentation recommends ignoring the return
> > value of pg_stop_backup(); ISTM that it would be very wise to
> > recommend cross-checking the return value against the WAL files you're
> > keeping for the backup.  Even if we thought the waiting logic was
> > working perfectly in every case, having a cross-check on something as
> > important as backup integrity seems like an awfully good idea.
> 
> I would got a little bit more far and say that this is mandatory as
> the minimum recovery point that needs to be reached is the LSN
> returned by pg_stop_backup(). For backups taken from primaries, this
> is a larger deal because XLOG_BACKUP_END may not be present if the
> last WAL segment switched is not in the archive. For backups taken
> from standbys, the thing is more tricky as the control file should be
> backed up last. I would think that the best thing we could do is to
> extend pg_stop_backup a bit more so as it returns the control file to
> write in the backup using a bytea to hold the data for example.

Indeed, getting this all correct isn't trivial and it's really
unfortunate that our documentation in this area is really lacking.
Further, having things not actually work as the documentation claims
isn't exactly helping.

> > I think the root of this problem is that commit
> > 7117685461af50f50c03f43e6a622284c8d54694 did only a pretty perfunctory
> > update of the documentation, as the commit message itself admits:
> >
> > Only reference documentation is included. The main section on
> > backup still needs
> > to be rewritten to cover this, but since that is already scheduled
> > for a separate
> > large rewrite, it's not included in this patch.
> >
> > But it doesn't look like that ever really got done.
> > https://www.postgresql.org/docs/10/static/continuous-archiving.html#backup-lowlevel-base-backup
> > really doesn't talk at all about standbys or differences in required
> > procedures between masters and standbys.  It makes statements that are
> > flagrantly riduculous in the case of a standby, like claiming that
> > pg_start_backup() always performs a checkpoint and that pg_stop_backup
> > "terminates the backup mode and performs an automatic switch to the
> > next WAL segment."  Well, obviously not.
> 
> Yep.

Indeed, that was something that I had also heard was being worked on,
but it's really unfortunate that it didn't happen.

> > And at least to me, that's the real bug here.  Somebody needs to go
> > through and fix this documentation properly.
> 
> So, Magnus? :)

I continue to be off the opinion that rewriting the documentation in
this case to match what we actually do is pretty unfriendly to users who
have built tools using the documentation at the time.  Evidently, that
ship has sailed, however, but we didn't help things here by only
changing how PG10 works and not also at least updating the documentation
to be accurate.

I've poked Magnus... more directly regarding this and offered to have
someone help with the development of better documentation in this area.
Hopefully we can get the docs in the back-branches fixed for the next
round of minor releases, and call out those updates in the release
notes, at least.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-04 Thread Stephen Frost
Robert,

On Fri, Aug 4, 2017 at 23:17 Robert Haas  wrote:

> On Thu, Aug 3, 2017 at 9:49 PM, Stephen Frost  wrote:
> > Thanks for the patches.  I'm planning to push them tomorrow morning
> > after a bit more review and testing.  I'll provide an update tomorrow.
>
> Obviously, the part about pushing them Friday morning didn't happen,
> and you're running out of time for providing an update on Friday, too.
> The release is due to wrap in less than 72 hours.


Unfortunately the day got away from me due to some personal... adventures
(having to do with lack of air conditioning first and then lack of gas,
amongst a lot of other things going on right now...). I just got things
back online but, well, my day tomorrow is pretty well packed solid.  I
wouldn't complain if someone has a few cycles to push these, they look
pretty good to me but I won't be able to work on PG stuff again until
tomorrow night at the earliest.

Thanks!

Stephen


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Aug 3, 2017 at 4:29 AM, Stephen Frost  wrote:
> > I'll provide another update tomorrow.  Hopefully Michael is able to produce
> > a 9.6 patch, otherwise I'll do it.
> 
> I have sent an updated version of the patch, with something that can
> be used for 9.6 as well. It would be nice to get something into the
> next set of minor releases.

Thanks for the patches.  I'm planning to push them tomorrow morning
after a bit more review and testing.  I'll provide an update tomorrow.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-03 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Wed, Aug 2, 2017 at 7:58 PM, Stephen Frost  wrote:
> > * Michael Paquier (michael.paqu...@gmail.com) wrote:
> >> Do you need a back-patchable version for 9.6? I could get one out of
> >> my pocket if necessary.
> >
> > I was just trying to find a bit of time to generate exactly that- if
> > you have a couple spare cycles, it would certainly help.
> 
> OK, here you go. Even if archive_mode = always has been introduced in
> 9.5, but non-exclusive mode is a 9.6 feature, so here are patches down
> to this version. I am pretty satisfied by this, and I included all the
> comments and error message corrections reviewed up to now. I noticed
> some incorrect comments, doc typos and an incorrect indentation as
> well for the WARNING reported to the user when waiting for the
> archiving.

Thanks much for this.  I've looked over it some and it looks pretty good
to me.  I'm going to play with it a bit more and, barring issues, will
push them tomorrow morning.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-08-03 Thread Stephen Frost
Tom, all,

* Stephen Frost (sfr...@snowman.net) wrote:
> This needs more cleanup, testing, and comments explaining why we're
> doing this (and then perhaps comments, somewhere.. in the backend ACL
> code that explains that the ordering needs to be preserved), but the
> basic idea seems sound to me and the case you presented does work with
> this patch (for me, at least) whereas what's in master didn't.

Alright, here's an updated patch which cleans things up a bit and adds
comments to explain what's going on.  I also updated the comments in
acl.h to explain that ordering actually does matter.

I've tried a bit to break the ordering in the backend a bit but there
could probably be more effort put into that, if I'm being honest.
Still, this definitely fixes the case which was being complained about
and therefore is a step in the right direction.

It's a bit late here, so I'll push this in the morning and watch the
buildfarm.

Thanks!

Stephen
From cbca78a387ecfed9570bd079541ec7906c990f0f Mon Sep 17 00:00:00 2001
From: Stephen Frost 
Date: Thu, 3 Aug 2017 21:23:37 -0400
Subject: [PATCH] Fix ordering in pg_dump of GRANTs

The order in which GRANTs are output is important as GRANTs which have
been GRANT'd by individuals via WITH GRANT OPTION GRANTs have to come
after the GRANT which included the WITH GRANT OPTION.  This happens
naturally in the backend during normal operation as we only change
existing ACLs in-place, only add new ACLs to the end, and when removing
an ACL we remove any which depend on it also.

Also, adjust the comments in acl.h to make this clear.

Unfortunately, the updates to pg_dump to handle initial privileges
involved pulling apart ACLs and then combining them back together and
could end up putting them back together in an invalid order, leading to
dumps which wouldn't restore.

Fix this by adjusting the queries used by pg_dump to ensure that the
ACLs are rebuilt in the same order in which they were originally.

Back-patch to 9.6 where the changes for initial privileges were done.
---
 src/bin/pg_dump/dumputils.c | 51 -
 src/include/utils/acl.h | 14 ++---
 2 files changed, 47 insertions(+), 18 deletions(-)

diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
index dfc6118..67049a6 100644
--- a/src/bin/pg_dump/dumputils.c
+++ b/src/bin/pg_dump/dumputils.c
@@ -722,21 +722,36 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	 * We always perform this delta on all ACLs and expect that by the time
 	 * these are run the initial privileges will be in place, even in a binary
 	 * upgrade situation (see below).
+	 *
+	 * Finally, the order in which privileges are in the ACL string (the order
+	 * they been GRANT'd in, which the backend maintains) must be preserved to
+	 * ensure that GRANTs WITH GRANT OPTION and subsequent GRANTs based on
+	 * those are dumped in the correct order.
 	 */
-	printfPQExpBuffer(acl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(acl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+			  "pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS perm(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+	  "pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "AS init(init_acl) WHERE acl = init_acl)) as foo)",
 	  acl_column,
 	  obj_kind,
 	  acl_owner,
 	  obj_kind,
 	  acl_owner);
 
-	printfPQExpBuffer(racl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
-	  "(SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) AS acl "
-	  "EXCEPT "
-	  "SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s as foo)",
+	printfPQExpBuffer(racl_subquery,
+	  "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
+	  "(SELECT acl, row_n FROM "
+	"pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) "
+	  "WITH ORDINALITY AS initp(acl,row_n) "
+	  "WHERE NOT EXISTS ( "
+	  "SELECT 1 FROM "
+			"pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) "
+	  "AS permp(orig_acl) WHERE acl = orig_acl)) as foo)",
 	  obj_kind,
 	  acl_owner,
 	  acl_column,
@@ -761,19 +776,25 @@ buildACLQueries(PQExpBuffer acl_subquery, PQExpBuffer racl_subquery,
 	{
 		printfPQExpBuffer(init_acl_subquery,

Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-02 Thread Stephen Frost
Noah,

On Tue, Aug 1, 2017 at 20:52 Noah Misch  wrote:

> On Thu, Jul 27, 2017 at 10:27:36AM -0400, Stephen Frost wrote:
> > Noah, all,
> >
> > * Noah Misch (n...@leadboat.com) wrote:
> > > This PostgreSQL 10 open item is past due for your status update.
> Kindly send
> > > a status update within 24 hours, and include a date for your
> subsequent status
> > > update.  Refer to the policy on open item ownership:
> >
> > Based on the ongoing discussion, this is really looking like it's
> > actually a fix that needs to be back-patched to 9.6 rather than a PG10
> > open item.  I don't have any issue with keeping it as an open item
> > though, just mentioning it.  I'll provide another status update on or
> > before Monday, July 31st.
>
> This PostgreSQL 10 open item is past due for your status update.  Kindly
> send
> a status update within 24 hours, and include a date for your subsequent
> status
> update.


I'll provide another update tomorrow.  Hopefully Michael is able to produce
a 9.6 patch, otherwise I'll do it.

Thanks!

Stephen

>


Re: [HACKERS] Subscription code improvements

2017-08-02 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Alvaro Herrera  writes:
> > I wish you'd stop splitting error message strings across multiple lines.
> > I've been trapped by a faulty grep not matching a split error message a
> > number of times :-(  I know by now to remove words until I get a match,
> > but it seems an unnecessary trap for the unwary.
> 
> Yeah, that's my number one reason for not splitting error messages, too.
> It's particularly nasty if similar strings appear in multiple places and
> they're not all split alike, as you can get misled into thinking that a
> reported error must have occurred in a place you found, rather than
> someplace you didn't.

+1.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-08-02 Thread Stephen Frost
* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Mon, Jul 31, 2017 at 9:13 PM, Stephen Frost  wrote:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> >> On Thu, Jul 27, 2017 at 10:27 AM, Stephen Frost  wrote:
> >> > * Noah Misch (n...@leadboat.com) wrote:
> >> >> This PostgreSQL 10 open item is past due for your status update.  
> >> >> Kindly send
> >> >> a status update within 24 hours, and include a date for your subsequent 
> >> >> status
> >> >> update.  Refer to the policy on open item ownership:
> >> >
> >> > Based on the ongoing discussion, this is really looking like it's
> >> > actually a fix that needs to be back-patched to 9.6 rather than a PG10
> >> > open item.  I don't have any issue with keeping it as an open item
> >> > though, just mentioning it.  I'll provide another status update on or
> >> > before Monday, July 31st.
> >> >
> >> > I'll get to work on the back-patch and try to draft up something to go
> >> > into the release notes for 9.6.4.
> >>
> >> Whether this is going to be back-patched or not, you should do
> >> something about it quickly, because we're wrapping a new beta and a
> >> full set of back-branch releases next week.  I'm personally hoping
> >> that what follows beta3 will be rc1, but if we have too much churn
> >> after beta3 we'll end up with a beta4, which could end up slipping the
> >> whole release cycle.
> >
> > Yes, I've been working on this and the other issues with pg_dump today.
> 
> Do you need a back-patchable version for 9.6? I could get one out of
> my pocket if necessary.

I was just trying to find a bit of time to generate exactly that- if
you have a couple spare cycles, it would certainly help.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-07-31 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> AFAICT, pg_dump has no notion that it needs to be careful about the order
> >> in which permissions are granted.  I did
> 
> > I'm afraid that's correct, though I believe that's always been the case.
> > I spent some time looking into this today and from what I've gathered so
> > far, there's essentially an implicit (or at least, I couldn't find any
> > explicit reference to it) ordering in ACLs whereby a role which was
> > given a GRANT OPTION always appears in the ACL list before an ACL entry
> > where that role is GRANT'ing that option to another role, and this is
> > what pg_dump was (again, implicitly, it seems) depending on to get this
> > correct in prior versions.
> 
> Yeah, I suspected that was what made it work before.  I think the ordering
> just comes from the fact that new ACLs are added to the end, and you can't
> add an entry as a non-owner unless there's a grant WGO there already.

Right.

> I suspect it would be easier to modify the backend code that munges ACL
> lists so that it takes care to preserve that property, if we ever find
> there are cases where it does not.  It seems plausible to me that
> pg_dump is not the only code that depends on that ordering.

Hm, well, if we're alright with that then I think the attached would
address the pg_dump issue, and I believe this would work as this code is
only run for 9.6+ servers anyway.

This needs more cleanup, testing, and comments explaining why we're
doing this (and then perhaps comments, somewhere.. in the backend ACL
code that explains that the ordering needs to be preserved), but the
basic idea seems sound to me and the case you presented does work with
this patch (for me, at least) whereas what's in master didn't.

Thoughts?

Thanks!

Stephen
diff --git a/src/bin/pg_dump/dumputils.c b/src/bin/pg_dump/dumputils.c
new file mode 100644
index dfc6118..8686ed0
*** a/src/bin/pg_dump/dumputils.c
--- b/src/bin/pg_dump/dumputils.c
*** buildACLQueries(PQExpBuffer acl_subquery
*** 723,742 
  	 * these are run the initial privileges will be in place, even in a binary
  	 * upgrade situation (see below).
  	 */
! 	printfPQExpBuffer(acl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
! 	  "(SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) AS acl "
! 	  "EXCEPT "
! 	  "SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s as foo)",
  	  acl_column,
  	  obj_kind,
  	  acl_owner,
  	  obj_kind,
  	  acl_owner);
  
! 	printfPQExpBuffer(racl_subquery, "(SELECT pg_catalog.array_agg(acl) FROM "
! 	  "(SELECT pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) AS acl "
! 	  "EXCEPT "
! 	  "SELECT pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s as foo)",
  	  obj_kind,
  	  acl_owner,
  	  acl_column,
--- 723,742 
  	 * these are run the initial privileges will be in place, even in a binary
  	 * upgrade situation (see below).
  	 */
! 	printfPQExpBuffer(acl_subquery, "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
! 	  "(SELECT acl, row_n FROM pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) WITH ORDINALITY AS perm(acl,row_n) "
! 	  "WHERE NOT EXISTS ( "
! 	  "SELECT 1 FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) AS init(init_acl) WHERE acl = init_acl)) as foo)",
  	  acl_column,
  	  obj_kind,
  	  acl_owner,
  	  obj_kind,
  	  acl_owner);
  
! 	printfPQExpBuffer(racl_subquery, "(SELECT pg_catalog.array_agg(acl ORDER BY row_n) FROM "
! 	  "(SELECT acl, row_n FROM pg_catalog.unnest(coalesce(pip.initprivs,pg_catalog.acldefault(%s,%s))) WITH ORDINALITY AS initp(acl,row_n) "
! 	  "WHERE NOT EXISTS ( "
! 	  "SELECT 1 FROM pg_catalog.unnest(coalesce(%s,pg_catalog.acldefault(%s,%s))) AS permp(orig_acl) WHERE acl = orig_acl)) as foo)",
  	  obj_kind,
  	  acl_owner,
  	  acl_column,
*** buildACLQueries(PQExpBuffer acl_subquery
*** 761,779 
  	{
  		printfPQExpBuffer(init_acl_subquery,
  		  "CASE WHEN privtype = 'e' THEN "
! 		  "(SELECT pg_catalog.array_agg(acl) FROM "
! 		  "(SELECT pg_catalog.unnest(pip.initprivs) AS acl "
! 		  "EXCEPT "
! 		  "SELECT pg_catalog.unnest(pg_catalog.acldefault(%s,%s))) as foo) END",
  		  obj_kind,
  		  acl_owner);
  
  		printfPQExpBuffer(init_racl_subquery,
  		  "CASE WHEN privtype = 'e' THEN "

Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-31 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Jul 27, 2017 at 10:27 AM, Stephen Frost  wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> >> This PostgreSQL 10 open item is past due for your status update.  Kindly 
> >> send
> >> a status update within 24 hours, and include a date for your subsequent 
> >> status
> >> update.  Refer to the policy on open item ownership:
> >
> > Based on the ongoing discussion, this is really looking like it's
> > actually a fix that needs to be back-patched to 9.6 rather than a PG10
> > open item.  I don't have any issue with keeping it as an open item
> > though, just mentioning it.  I'll provide another status update on or
> > before Monday, July 31st.
> >
> > I'll get to work on the back-patch and try to draft up something to go
> > into the release notes for 9.6.4.
> 
> Whether this is going to be back-patched or not, you should do
> something about it quickly, because we're wrapping a new beta and a
> full set of back-branch releases next week.  I'm personally hoping
> that what follows beta3 will be rc1, but if we have too much churn
> after beta3 we'll end up with a beta4, which could end up slipping the
> whole release cycle.

Yes, I've been working on this and the other issues with pg_dump today.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-27 Thread Stephen Frost
Noah, all,

* Noah Misch (n...@leadboat.com) wrote:
> This PostgreSQL 10 open item is past due for your status update.  Kindly send
> a status update within 24 hours, and include a date for your subsequent status
> update.  Refer to the policy on open item ownership:

Based on the ongoing discussion, this is really looking like it's
actually a fix that needs to be back-patched to 9.6 rather than a PG10
open item.  I don't have any issue with keeping it as an open item
though, just mentioning it.  I'll provide another status update on or
before Monday, July 31st.

I'll get to work on the back-patch and try to draft up something to go
into the release notes for 9.6.4.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-07-26 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> AFAICT, pg_dump has no notion that it needs to be careful about the order
> in which permissions are granted.  I did

I'm afraid that's correct, though I believe that's always been the case.
I spent some time looking into this today and from what I've gathered so
far, there's essentially an implicit (or at least, I couldn't find any
explicit reference to it) ordering in ACLs whereby a role which was
given a GRANT OPTION always appears in the ACL list before an ACL entry
where that role is GRANT'ing that option to another role, and this is
what pg_dump was (again, implicitly, it seems) depending on to get this
correct in prior versions.

Pulling apart the ACL list and rebuilding it to handle adding/revoking
of default options on objects ends up, in some cases, changing the
ordering around for the ACLs and that's how pg_dump comes to emit the
GRANT commands in the wrong order.

Looks like what is needed is an explicit ordering to the ACLs in
pg_dump to ensure that it emits the GRANTs in the correct order, which
should be that a given GRANTOR's rights must be before any ACL which
that GRATOR granted.  Ideally, that could be crafted into the SQL query
which is sent to the server, but I haven't quite figured out if that'll
be possible or not.  If not, it shouldn't be too hard to implement in
pg_dump directly.

I don't, at the moment, think we actually need to do any checks in the
backend code to make sure that the implicit ordering is always held,
assuming we make this change in pg_dump.  I do wonder if it might be
possible, with the correct set of GRANTs (perhaps having role
memberships coming into play also, as discussed in the header of
select_best_grantor()) to generate an ACL list with an "incorrect"
ordering which would end up causing issues in older releases with
pg_dump.  We've had precious little complaints from the field about this
and so, even if we were to generate such a case, I'm not sure that we'd
want to add all the code necessary to avoid it and then back-patch it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-26 Thread Stephen Frost
All,

* Masahiko Sawada (sawada.m...@gmail.com) wrote:
> On Tue, Jul 25, 2017 at 4:43 AM, Michael Paquier
>  wrote:
> > On Mon, Jul 24, 2017 at 6:45 PM, Stephen Frost  wrote:
> >> What the change would do is make the pg_stop_backup() caller block until
> >> the last WAL is archvied, and perhaps that ends up taking hours, and
> >> then the connection is dropped for whatever reason and the backup fails
> >> where it otherwise what?  wouldn't have been valid anyway at that
> >> point, since it's not valid until the last WAL is actually archived.
> >> Perhaps eventually it would be archived and the caller was planning for
> >> that and everything is fine, but, well, that feels like an awful lot of
> >> wishful thinking.
> >
> > Letting users taking unconsciously inconsistent backups is worse than
> > potentially breaking scripts that were actually not working as
> > Postgres would expect. So I am +1 for back-patching a lighter version
> > of the proposed patch that makes the wait happen on purpose.
> >
> >>> > I'd hate to have to do it, but we could technically add a GUC to address
> >>> > this in the back-branches, no?  I'm not sure that's really worthwhile
> >>> > though..
> >>>
> >>> That would be mighty ugly.
> >>
> >> Oh, absolutely agreed.
> >
> > Yes, let's avoid that. We are talking about a switch aimed at making
> > backups potentially inconsistent.
> 
> Thank you for the review comments!
> Attached updated the patch. The noting in pg_baseback doc will be
> necessary for back branches if we decided to not back-patch it to back
> branches. So it's not contained in this patch for now.
> 
> Regarding back-patching this to back branches, I also vote for
> back-patching to back branches. Or we can fix the docs of back
> branches and fix the code only in PG10. I expect that the user who
> wrote a backup script has done enough functional test and dealt with
> this issue somehow, but since the current doc clearly says that
> pg_stop_backup() waits for all WAL to be archived we have to make a
> consideration about there are users who wrote a wrong backup script.
> So I think we should at least notify it in the minor release.

That sounds like we have at least three folks thinking that this should
be back-patched, and one who doesn't.  Does anyone else wish to voice an
opinion regarding back-patching this..?

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-07-25 Thread Stephen Frost
Thom,

On Tue, Jul 25, 2017 at 20:29 Thom Brown  wrote:

> On 26 July 2017 at 00:52, Stephen Frost  wrote:
> > Thom,
> >
> > * Thom Brown (t...@linux.com) wrote:
> >> This is the culprit:
> >>
> >> commit 23f34fa4ba358671adab16773e79c17c92cbc870
> >> Author: Stephen Frost 
> >> Date:   Wed Apr 6 21:45:32 2016 -0400
> >
> > Thanks!  I'll take a look tomorrow.
>
> I should point out that this commit was made during the 9.6 cycle, and
> I get the same issue with 9.6.


Interesting that Tom didn't. Still, that does make more sense to me.

Thanks!

Stephen

>
>


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-07-25 Thread Stephen Frost
Thom,

* Thom Brown (t...@linux.com) wrote:
> This is the culprit:
> 
> commit 23f34fa4ba358671adab16773e79c17c92cbc870
> Author: Stephen Frost 
> Date:   Wed Apr 6 21:45:32 2016 -0400

Thanks!  I'll take a look tomorrow.

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_dump does not handle indirectly-granted permissions properly

2017-07-25 Thread Stephen Frost
Tom,

On Tue, Jul 25, 2017 at 16:43 Tom Lane  wrote:

> AFAICT, pg_dump has no notion that it needs to be careful about the order
> in which permissions are granted.  I did
>
> regression=# create user joe;
> CREATE ROLE
> regression=# create user bob;
> CREATE ROLE
> regression=# create user alice;
> CREATE ROLE
> regression=# \c - joe
> You are now connected to database "regression" as user "joe".
> regression=> create table joestable(f1 int);
> CREATE TABLE
> regression=> grant select on joestable to alice with grant option;
> GRANT
> regression=> \c - alice
> You are now connected to database "regression" as user "alice".
> regression=> grant select on joestable to bob;
> GRANT
>
> and then pg_dump'd that.  The ACL entry for joestable looks like this:
>
> --
> -- TOC entry 5642 (class 0 OID 0)
> -- Dependencies: 606
> -- Name: joestable; Type: ACL; Schema: public; Owner: joe
> --
>
> SET SESSION AUTHORIZATION alice;
> GRANT SELECT ON TABLE joestable TO bob;
> RESET SESSION AUTHORIZATION;
> GRANT SELECT ON TABLE joestable TO alice WITH GRANT OPTION;
>
> Unsurprisingly, that fails to restore:
>
> db2=# SET SESSION AUTHORIZATION alice;
> SET
> db2=> GRANT SELECT ON TABLE joestable TO bob;
> ERROR:  permission denied for relation joestable
>
>
> I am not certain whether this is a newly introduced bug or not.
> However, I tried it in 9.2-9.6, and all of them produce the
> GRANTs in the right order:
>
> GRANT SELECT ON TABLE joestable TO alice WITH GRANT OPTION;
> SET SESSION AUTHORIZATION alice;
> GRANT SELECT ON TABLE joestable TO bob;
> RESET SESSION AUTHORIZATION;
>
> That might be just chance, but my current bet is that Stephen
> broke it sometime in the v10 cycle --- especially since we
> haven't heard any complaints like this from the field.


Hmmm. I'll certainly take a look when I get back to a laptop, but I can't
recall offhand any specific code for handling that, nor what change I might
have made in the v10 cycle which would have broken it (if anything, I would
have expected an issue from the rework in 9.6...).

I should be able to look at this tomorrow though.

Thanks!

Stephen


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-24 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Jul 24, 2017 at 12:31 PM, Stephen Frost  wrote:
> > Those backup scripts might very well be, today, producing invalid
> > backups though, which would be much less good..
> 
> True.  However, I suspect that depends on what procedure is actually
> being followed.  If *everyone* who is using this is getting corrupt
> backups, then of course changing the behavior is a no-brainer.  But if
> *some* people are getting correct backups and *some* people are
> getting incorrect backups, depending on procedure, then I think
> changing it is unwise.  We should optimize for the case of a user who
> is currently doing something smart, not one who is doing something
> dumb.

I'm not sure that we can call "writing code that depends on what the
docs say" dumb, unfortunately, and if even one person is getting an
invalid backup while following what the docs say then that's a strong
case to do *something*.  Of course, we also don't want to break the
scripts of people who are doing things correctly, but I'm trying to
figure out exactly how this change would break such scripts.

What the change would do is make the pg_stop_backup() caller block until
the last WAL is archvied, and perhaps that ends up taking hours, and
then the connection is dropped for whatever reason and the backup fails
where it otherwise what?  wouldn't have been valid anyway at that
point, since it's not valid until the last WAL is actually archived.
Perhaps eventually it would be archived and the caller was planning for
that and everything is fine, but, well, that feels like an awful lot of
wishful thinking.

And that's assuming that the script author made sure to write additional
code that didn't mark the backup as valid until the ending WAL segment
from pg_stop_backup() was confirmed to have been archived.

Last, but perhaps not least, this is at least just an issue going back
to where pg_start/stop_backup was allowed on replicas, which is only 9.6
and therefore it's been out less than a year and anyone's script which
might break due to this would at least be relatively new code.

> > I'd hate to have to do it, but we could technically add a GUC to address
> > this in the back-branches, no?  I'm not sure that's really worthwhile
> > though..
> 
> That would be mighty ugly.

Oh, absolutely agreed.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-24 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Jul 24, 2017 at 11:40 AM, David Steele  wrote:
> > Before reviewing the patch, I would note that it looks like this issue was
> > introduced in b6a323a8c before the 9.6 release.  The documentation states
> > that the pg_stop_backup() function will wait for all required WAL to be
> > archived, which corresponds to the default in the new patch of
> > waitforarchive = true.  The commit notes that the documentation needs
> > updating, but since that didn't happen I think we should consider this a bug
> > in 9.6 and back patch.
> >
> > While this patch brings pg_stop_backup() in line with the documentation, it
> > also introduces a behavioral change compared to 9.6.  Currently, the default
> > 9.6 behavior on a standby is to return immediately from pg_stop_backup(),
> > but this patch will cause it to block by default instead.  Since action on
> > the master may be required to unblock the process, I see this as a pretty
> > significant change.  I still think we should fix and backpatch, but I wanted
> > to point this out.
> 
> Hmm.  That seems to me like it could break backup scripts that are
> currently working, which seems like a *very* unfriendly thing to do in
> a minor release, especially since 9.6 has no option to override the
> default behavior (nor can we add one, since it would require a change
> to catalog contents).

Those backup scripts might very well be, today, producing invalid
backups though, which would be much less good..

I'd hate to have to do it, but we could technically add a GUC to address
this in the back-branches, no?  I'm not sure that's really worthwhile
though..

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-24 Thread Stephen Frost
David,

* David Steele (da...@pgmasters.net) wrote:
> On 7/23/17 11:48 PM, Masahiko Sawada wrote:
> >On Sat, Jul 22, 2017 at 8:04 AM, Stephen Frost  wrote:
> >>
> >>I started discussing this with David off-list and he'd like a chance to
> >>review it in a bit more detail (he's just returning from being gone for
> >>a few weeks).  That review will be posted to this thread on Monday, and
> >>I'll wait until then to move forward with the patch.
> 
> Before reviewing the patch, I would note that it looks like this
> issue was introduced in b6a323a8c before the 9.6 release.  The
> documentation states that the pg_stop_backup() function will wait
> for all required WAL to be archived, which corresponds to the
> default in the new patch of waitforarchive = true.  The commit notes
> that the documentation needs updating, but since that didn't happen
> I think we should consider this a bug in 9.6 and back patch.

I tend to agree with this.  The documentation clearly says that
pg_stop_backup() waits for all WAL to be archived, but, today, if it's
run on a standby then it doesn't wait, which might lead to invalid
backups if the backup software is depending on that.

> While this patch brings pg_stop_backup() in line with the
> documentation, it also introduces a behavioral change compared to
> 9.6.  Currently, the default 9.6 behavior on a standby is to return
> immediately from pg_stop_backup(), but this patch will cause it to
> block by default instead.  Since action on the master may be
> required to unblock the process, I see this as a pretty significant
> change.  I still think we should fix and backpatch, but I wanted to
> point this out.

This will need to be highlighted in the release notes for 9.6.4 also,
assuming there is agreement to back-patch this, and we'll need to update
the documentation in 9.6 also to be clearer about what happens on a
standby.

> The patch looks sensible to me.  A few comments:
> 
> 1) I would change:
> 
> "Check if the WAL segment needed for this backup have been
> completed, in which case a forced segment switch may be needed on
> the primary."
> 
> To something like:
> 
> "If the WAL segments required for this backup have not been archived
> then it might be necessary to force a segment switch on the
> primary."

I'm a bit on the fence regarding the distinction here for the
backup-from-standby and this errhint().  The issue is that the WAL for
the backup hasn't been archived yet and that may be because of either:

archive_command is failing
OR
No WAL is getting generated

In either case, both of these apply to the primary and the standby.  As
such, I'm inclined to try to mention both in the original errhint()
instead of making two different ones.  I'm open to suggestions on this,
of course, but my thinking would be more like:

-
Either archive_command is failing or not enough WAL has been generated
to require a segment switch.  Run pg_switch_wal() to request a WAL
switch and monitor your logs to check that your archive_command is
executing properly.
-

And then change pg_switch_wal()'s errhint for when it's run on a replica
to be:


HINT: WAL control functions cannont be executed during recovery; they
should be executed on the primary instead.


(or similar, again, open to suggestions here).

> 2) At backup.sgml L1015 it says that pg_stop_backup() will
> automatically switch the WAL segment.  There should be a caveat here
> for standby backups, like:
> 
> When called on a master it terminates the backup mode and performs
> an automatic switch to the next WAL segment.  The reason for the
> switch is to arrange for the last WAL segment written during the
> backup interval to be ready to archive.  When called on a standby
> the WAL segment switch must be performed manually on the master if
> it does not happen due to normal write activity.

s/master/primary/g

Otherwise it looks alright..  Might try to reword to use similar
language as to what we have today in 25.3.3.1.

> 3) The fact that this fix requires "archive_mode = always" seems
> like a pretty big caveat, thought I don't have any ideas on how to
> make it better without big code changes.  Maybe it would be help to
> change:
> 
> + the backup is taken on a standby, pg_stop_backup waits
> + for WAL to be archived when archive_mode
> 
> To:
> 
> + the backup is taken on a standby, pg_stop_backup waits
> + for WAL to be archived *only* when archive_mode

I'm thinking of rewording that a bit to say "When archive_mode = always"
instead, as I think that might be a little clearer.

> Perhaps this should be noted in the pg_basebackup docs as well?

That seems like it's probably a good idea too, as people might wonder
why pg_basebackup hasn't finished yet if it's waiting for WAL to be
archived.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-21 Thread Stephen Frost
Masahiko, all,

* Masahiko Sawada (sawada.m...@gmail.com) wrote:
> On Tue, Jul 18, 2017 at 1:28 PM, Stephen Frost  wrote:
> > Masahiko, Michael,
> >
> > * Masahiko Sawada (sawada.m...@gmail.com) wrote:
> >> > This is beginning to shape.
> >>
> >> Sorry, I missed lots of typo in the last patch. All comments from you
> >> are incorporated into the attached latest patch and I've checked it
> >> whether there is other typos. Please review it.
> >
> > I've taken an initial look through the patch and it looks pretty
> > reasonable.  I need to go over it in more detail and work through
> > testing it myself next.
> >
> > I expect to commit this (with perhaps some minor tweaks primairly around
> > punctuation/wording), barring any issues found, on Wednesday or Thursday
> > of this week.
> 
> I understood. Thank you for looking at this!

I started discussing this with David off-list and he'd like a chance to
review it in a bit more detail (he's just returning from being gone for
a few weeks).  That review will be posted to this thread on Monday, and
I'll wait until then to move forward with the patch.

Next update will be before Tuesday, July 25th, COB.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Increase Vacuum ring buffer.

2017-07-20 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Jul 20, 2017 at 3:04 PM, Stephen Frost  wrote:
> > I agree that it's a common problem for VACUUM to go too fast, or for
> > VACUUM to go too slow, but that's really what the vacuum_cost_limit
> > mechanism is for.
> 
> I think that's a valid point.  There are also other concerns here -
> e.g. whether instead of adopting the patch as proposed we ought to (a)
> use some smaller size, or (b) keep the size as-is but reduce the
> maximum fraction of shared_buffers that can be consumed, or (c) divide
> the ring buffer size through by autovacuum_max_workers.  Personally,
> of those approaches, I favor (b).  I think a 16MB ring buffer is
> probably just fine if you've got 8GB of shared_buffers but I'm
> skeptical about it when you've got 128MB of shared_buffers.

Right, agreed on that and that (b) looks to be a good option there.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Increase Vacuum ring buffer.

2017-07-20 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Thu, Jul 20, 2017 at 12:16 PM, Sokolov Yura
>  wrote:
> > But in fact, vacuum process performs FSYNC! It happens, cause vacuum
> > evicts dirty pages from its ring buffer. And to evict dirty page, it
> > has to be sure WAL record about its modification is FSYNC-ed to WAL.
> > Because ring buffer is damn small, vacuum almost always have to perform
> > FSYNC by itself and have to do it very frequently (cause ring is damn
> > small).
> >
> > With greater ring buffer, there is greater chance that fsync were
> > performed by wal_writer process, or other backend. And even if
> > vacuum does fsync by itself, it syncs records about more modified
> > pages from its ring, so evicting next page is free.
> 
> OK, but I have helped *many* customers whose problem was that vacuum
> ran too fast and blew data out of the OS cache causing query response
> times to go through the roof.  That's a common problem.  Making VACUUM
> run faster will presumably make it more common.  I've also run into
> many customers whose problem that vacuum ran too slowly, and generally
> raising vacuum_cost_limit fixes that problem just fine.  So I don't
> think it's nearly as clear as you do that making VACUUM run faster is
> desirable.

I agree that it's a common problem for VACUUM to go too fast, or for
VACUUM to go too slow, but that's really what the vacuum_cost_limit
mechanism is for.

I can see an argument that existing tuned systems which have been
expecting the small ring-buffer to help slow down VACUUM may have to be
adjusted to handle a change, though I would think that other changes
we've made might also require changes to vacuum costing, so I'm not sure
that this is particularly different in that regard.

What I don't agree with is holding off on improving VACUUM in the case
where cost delay is set to zero because we think people might be
depending on it only going so fast in that case.  If the cost delay is
set to zero, then VACUUM really should be going as fast as it can and we
should welcome improvments in that area in much the same way that we
welcome performance improvements in sorting and other backend
algorithms.

> > If some fears that increasing vacuum ring buffer will lead to
> > decreasing transaction performance, then why it were not exhaustively
> > tested?
> 
> If you want something changed, it's your job to do that testing.
> Asking why nobody else tested the effects of changing the thing you
> want changed is like asking why nobody else wrote the patch you want
> written.

I do agree with this.  Asking for others to also test is fine, but it's
the patch submitter who needs to ensure that said testing actually
happens and that results are provided to -hackers to support the change.

In particular, multiple different scenarios (DB all in shared_buffers,
DB all in OS cache, DB not able to fit in memory at all, etc) should be
tested.

> > And possible community will decide, that it should be GUC variable:
> > - if one prefers to keep its database unbloated, he could increase
> > vacuum ring buffer,
> > - otherwise just left it in "safe-but-slow" default.
> 
> That's a possible outcome, but I don't think this discussion is really
> going anywhere unless you are willing to admit that increasing VACUUM
> performance could have some downsides.  If you're not willing to admit
> that, there's not a lot to talk about here.

I'd rather we encourage people to use the existing knobs for tuning
VACUUM speed rather than adding another one that ends up being actually
only a proxy for speed.  If there's a memory utilization concern here,
then having a knob for that might make sense, but it sounds like the
concern here is more about the speed and less about coming up with a
reasonable way to scale the size of the ring buffer.

Of course, I'm all for coming up with a good way to size the ring
buffer, and providing a knob if we aren't able to do so, I just don't
want to add unnecessary knobs if we don't need them.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise

2017-07-20 Thread Stephen Frost
Greetings,

* Sokolov Yura (y.soko...@postgrespro.ru) wrote:
> I wrote two days ago about vacuum ring buffer:
> https://www.postgresql.org/message-id/8737e9bddb82501da1134f021bf4929a%40postgrespro.ru
> 
> Increasing Vacuum's ring buffer to size of Bulk Writer's one reduces
> autovacuum time in 3-10 times.
> (for both patched and unpatched version I used single non-default
> setting
> 'autovacuum_cost_delay=2ms').
> 
> This is single line change, and it improves things a lot.

Right- when the database fits in the OS cache but not in shared_buffers.

I do agree that's a useful improvement to make based on your testing.

It's not clear off-hand how much that would improve this case, as
the database size appears to pretty quickly get beyond the OS memory
size (and only in the first test is the DB starting size less than
system memory to begin with).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] autovacuum can't keep up, bloat just continues to rise

2017-07-19 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> "Joshua D. Drake"  writes:
> > At PGConf US Philly last week I was talking with Jim and Jan about 
> > performance. One of the items that came up is that PostgreSQL can't run 
> > full throttle for long periods of time. The long and short is that no 
> > matter what, autovacuum can't keep up. This is what I have done:
> 
> Try reducing autovacuum_vacuum_cost_delay more, and/or increasing
> autovacuum_vacuum_cost_limit.

Or get rid of the cost delay entirely and let autovacuum actually go as
fast as it can when it's run.  The assertion that it can't keep up is
still plausible, but configuring autovacuum to sleep regularly and then
complaining that it's not able to keep up doesn't make sense.

Reducing the nap time might also be helpful if autovacuum is going as
fast as it can and it's able to clear a table in less than a minute.

There have been discussions on this list about parallel vacuum of a
particular table as well; to address this issue I'd encourage reviewing
those discussions and looking at writing a patch to implement that
feature as that would address the case where the table is large enough
that autovacuum simply can't get through all of it before the other
backends have used all space available and then substantially increased
the size of the relation (leading to vacuum on the table running for
longer).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-17 Thread Stephen Frost
Masahiko, Michael,

* Masahiko Sawada (sawada.m...@gmail.com) wrote:
> > This is beginning to shape.
> 
> Sorry, I missed lots of typo in the last patch. All comments from you
> are incorporated into the attached latest patch and I've checked it
> whether there is other typos. Please review it.

I've taken an initial look through the patch and it looks pretty
reasonable.  I need to go over it in more detail and work through
testing it myself next.

I expect to commit this (with perhaps some minor tweaks primairly around
punctuation/wording), barring any issues found, on Wednesday or Thursday
of this week.

Noah,

I'll provide an update regarding this open item by COB, Thursday July
20th.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Why have we got both largeobject and large_object test files?

2017-07-17 Thread Stephen Frost
Tom,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> I happened to notice that the regression tests contain both
> largeobject.sql and large_object.sql.  This seems at best confusing and at
> worst a source of mistakes.  The second file was added in March by commit
> ff992c074, has never been touched by any other commit, and is only 8 lines
> long.  Was there a really good reason not to incorporate that test into
> largeobject.sql?

Just to be clear that we're talking about the same thing- there is no
'largeobject.sql' in a clean source tree.  There is a 'largeobject.source'
in src/test/regress/input which is converted to largeobject.sql.

As for the general question of if the two could be merged, I can't think
of any reason off-hand why that wouldn't work, nor do I have any
particular recollection as to why I created a new file instead of using
the existing.  My shell history tells me that I found largeobject.source
while crafting the test case but not why I didn't use it.

The main thing is that the large_object.sql was specifically added to
test pg_upgrade/pg_dump, so the created object needs to be kept around
in the regression database with the comment after the tests run for that
to happen.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] More flexible LDAP auth search filters?

2017-07-16 Thread Stephen Frost
Magnus,

* Magnus Hagander (mag...@hagander.net) wrote:
> On Sun, Jul 16, 2017 at 11:05 PM, Stephen Frost  wrote:
> > I'd suggest that we try to understand why Kerberos couldn't be used in
> > that environment.  I suspect in at least some cases what users would
> > like is the ability to do Kerberos auth but then have LDAP checked to
> > see if a given user (who has now auth'd through Kerberos) is allowed to
> > connect.  We don't currently have any way to do that, but if we were
> > looking for things to do, that's what I'd suggest working on rather than
> > adding more to our LDAP auth system and implying by doing so that it's
> > reasonable to use.
> >
> > I find it particularly disappointing to see recommendations for using
> > LDAP auth, particularly in AD environments, that don't even mention
> > Kerberos or bother to explain how using LDAP sends the user's PW to the
> > server in cleartext.
> 
> You do realize, I'm sure, that there are many LDAP servers out there that
> are not AD, and that do not come with a Kerberos server attached to them...

I am aware that some exist, I've even contributed to their development
and packaging, but that doesn't make it a good idea to use them for
authentication.

> I agree that Kerberos is usually the better choice *if it's available*.

Which is the case in an AD environment..

> It's several orders of magnitude more complicated to set up though, and
> there are many environments that have ldap but don't have Kerberos.

Frankly, I simply don't agree with this.

> Refusing to improve LDAP for the users who have no choice seems like a very
> unfriendly thing to do.

I'm fine with improving LDAP in general, but, as I tried to point out,
having a way to make it easier to integrate PG into an AD environment
would be better.  It's not my intent to stop this patch but rather to
point out the issues with LDAP auth that far too frequently are not
properly understood.

> (And you can actually reasonably solve the case of
> kerberos-for-auth-ldap-for-priv by syncing the groups into postgres roles)

Yes, but sync'ing roles is a bit of a pain and it'd be nice if we could
avoid it, or perhaps make it easier.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] More flexible LDAP auth search filters?

2017-07-16 Thread Stephen Frost
Mark,

* Mark Cave-Ayland (mark.cave-ayl...@ilande.co.uk) wrote:
> On 16/07/17 23:26, Thomas Munro wrote:
> > Thank you very much for this feedback and example, which I used in the
> > documentation in the patch.  I see similar examples in the
> > documentation for other things on the web.
> > 
> > I'll leave it up to Magnus and Stephen to duke it out over whether we
> > want to encourage LDAP usage, extend documentation to warn about
> > cleartext passwords with certain LDAP implementations or
> > configurations, etc etc.  I'll add this patch to the commitfest and
> > get some popcorn.
> 
> If it helps, we normally recommend that clients use ldaps for both AD
> and UNIX environments, although this can be trickier from an
> administrative perspective in AD environments because it can require
> changes to the Windows firewall and certificate installation.

LDAPS is better than straight LDAP, of course, but it still doesn't
address the issue that the password is sent to the server, which both
SCRAM and Kerberos do and is why AD environments use Kerberos for
authentication, and why everything in an AD environment also should use
Kerberos.

Using Kerberos should also avoid the need to hack the Windows firewall
or deal with certificate installation.  In an AD environment, it's
actually pretty straight-forward to add a PG server too.  Further, in my
experience at least, there's been other changes recommended by Microsoft
that prevent using LDAP for auth because it's insecure.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] More flexible LDAP auth search filters?

2017-07-16 Thread Stephen Frost
Magnus, all,

* Magnus Hagander (mag...@hagander.net) wrote:
> (FWIW, a workaround I've applied more than once to this in AD environments
> (where kerberos for one reason or other can't be done, sorry Stephen) is to
> set up a RADIUS server and use that one as a "middle man". But it would be
> much better if we could do it natively)

I'd suggest that we try to understand why Kerberos couldn't be used in
that environment.  I suspect in at least some cases what users would
like is the ability to do Kerberos auth but then have LDAP checked to
see if a given user (who has now auth'd through Kerberos) is allowed to
connect.  We don't currently have any way to do that, but if we were
looking for things to do, that's what I'd suggest working on rather than
adding more to our LDAP auth system and implying by doing so that it's
reasonable to use.

I find it particularly disappointing to see recommendations for using
LDAP auth, particularly in AD environments, that don't even mention
Kerberos or bother to explain how using LDAP sends the user's PW to the
server in cleartext.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-14 Thread Stephen Frost
Greetings,

* Vladimir Borodin (r...@simply.name) wrote:
> > 14 июля 2017 г., в 1:33, Stephen Frost  написал(а):
> > What would be really nice for such cases is support for Kerberos and
> > delegated Kerberos credentials.  Having pgpool support that would remove
> > the need to deal with passwords at all.
> 
> Since nearly all systems with some kind of load nowadays use connection 
> poolers (pgpool-II or pgbouncer) between applications and postgres, it is a 
> pretty big pain to re-implement all authentication methods supported by 
> postgres in such poolers. Kerberos is cool but not the only thing that should 
> be supported by FDWs or connection poolers. I.e. many users would want to 
> have support for LDAP and SCRAM. And every time when there would be some 
> changes in postgres auth methods, exactly the same work (or even worse) 
> should be done in many (at least two) other products widely used by people.

Honestly, I disagree about the notion that LDAP support should be added
to anything or encouraged.  There's a reason that AD uses Kerberos and
not LDAP and Microsoft continues to (quite reasonably) make it more
difficult for individuals to perform LDAP auth in AD.

The auth methods in PG do not see a huge amount of churn over the years
and so I don't agree with the argument that it would be a problem for
other systems to support Kerberos or similar strong auth methods.

> It seems that postgres either should provide connection pooling feature in 
> core or give external poolers a kind of generic mechanism to transparently 
> proxy auth requests/responses, so that authentication would be fully managed 
> by postgres and that would be the only place where changes in auth methods 
> should be done. Yes, in this case connection pooler actually behaves like man 
> in the middle so it should be done very carefully but it seems that there is 
> no other way.

While this might be possible by having some kind of special trusted
connection between the pooler and PG, I actually don't particularly like
the notion of inventing a bunch of complicated logic and pain so that a
connection pooler can avoid implementing a proper authentication system.

Having PG support connection pooling itself, of course, would be nice
but I'm not aware of anyone currently working on it.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-13 Thread Stephen Frost
Michael, all,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Thu, Jul 13, 2017 at 7:13 AM, Masahiko Sawada  
> wrote:
> > Sorry, I missed lots of typo in the last patch. All comments from you
> > are incorporated into the attached latest patch and I've checked it
> > whether there is other typos. Please review it.
> 
> Thanks for providing a new version of the patch very quickly. I have
> spent some time looking at it again and testing it, and this version
> looks correct to me. Stephen, as a potential committer, would you look
> at it to address this open item?

Yes, this weekend.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-13 Thread Stephen Frost
Greetings Tatsuo,

* Tatsuo Ishii (is...@sraoss.co.jp) wrote:
> > What I am suggesting here is that in order to handle properly SCRAM
> > with channel binding, pgpool has to provide a different handling for
> > client <-> pgpool and pgpool <-> Postgres. In short, I don't have a
> > better answer than having pgpool impersonate the server and request
> > for a password in cleartext through an encrypted connection between
> > pgpool and the client if pgpool does not know about it, and then let
> > pgpool do by itself the SCRAM authentication on a per-connection basis
> > with each Postgres instances. When using channel binding, what would
> > matter is the TLS finish (for tls-unique) or the hash server
> > certificate between Postgres and pgpool, not between the client and
> > pgpool. But that's actually the point you are raising here:
> 
> Using a clear text password would not be acceptable for users even
> through an encrypted connection, I think.

Really, I don't think users who are concerned with security should be
using the md5 method either.

What would be really nice for such cases is support for Kerberos and
delegated Kerberos credentials.  Having pgpool support that would remove
the need to deal with passwords at all.

Ditto for having postgres_fdw support same.  More often than not,
Kerberos deployments (via AD, generally) is what I find in the
enterprises that I work with and they're happy to see we have Kerberos
but it's disappointing when they can't use Kerberos with either
connection poolers or with FDWs.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-09 Thread Stephen Frost
All,

* Noah Misch (n...@leadboat.com) wrote:
> On Fri, Jun 30, 2017 at 02:59:11PM -0400, Stephen Frost wrote:
> > * Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> > > On 6/30/17 04:08, Masahiko Sawada wrote:
> > > >> I'm not sure. I think this can be considered a bug in the 
> > > >> implementation for
> > > >> 10, and as such is "open for fixing". However, it's not a very 
> > > >> critical bug
> > > >> so I doubt it should be a release blocker, but if someone wants to 
> > > >> work on a
> > > >> fix I think we should commit it.
> > > > 
> > > > I agree with you. I'd like to hear opinions from other hackers as well.
> > > 
> > > It's preferable to make it work.  If it's not easily possible, then we
> > > should prohibit it.
> > > 
> > > Comments from Stephen (original committer)?
> > 
> > I agree that it'd be preferable to make it work, but I'm not sure I can
> > commit to having it done in short order.  I'm happy to work to prohibit
> > it, but if someone has a few spare cycles to make it actually work,
> > that'd be great.
> 
> [Action required within three days.  This is a generic notification.]
> 
> The above-described topic is currently a PostgreSQL 10 open item.  Stephen,
> since you committed the patch believed to have created it, you own this open
> item.  If some other commit is more relevant or if this does not belong as a
> v10 open item, please let us know.  Otherwise, please observe the policy on
> open item ownership[1] and send a status update within three calendar days of
> this message.  Include a date for your subsequent status update.  Testers may
> discover new open items at any time, and I want to plan to get them all fixed
> well in advance of shipping v10.  Consequently, I will appreciate your efforts
> toward speedy resolution.  Thanks.

I'm out of town this week but will review the patch from Masahiko and
provide a status update by July 17th.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-07 Thread Stephen Frost
Tatsuo,

* Tatsuo Ishii (is...@sraoss.co.jp) wrote:
> > I recall vaguely Ishii-san mentioning me at PGcon that pgpool was
> > actually cheating, but my memories are a bit fuzzy for this part.
> 
> What I meant by "cheating" was, Pgpool-II behaves as if PostgreSQL
> server in md5 auth.
> 
> For more details what Pgpool-II actually does in md5 auth, please see:
> 
> https://pgpool.net/mediawiki/index.php/FAQ#How_does_pgpool-II_handle_md5_authentication.3F

I see.  In short, that's not going to be able to work with SCRAM.

We also certainly can't weaken or break SCRAM to address Pgpool's needs.
I'm afraid an alternative approach will need to be considered for Pgpool
to be able to do what it does today, as I outlined in my other email.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-07 Thread Stephen Frost
Michael,

* Michael Paquier (michael.paqu...@gmail.com) wrote:
> On Sat, Jul 8, 2017 at 1:24 AM, Robert Haas  wrote:
> > IIUC, things will get even worse once channel binding is committed,
> > presumably for PostgreSQL 11.  The point of channel binding is to
> > guarantee that you are conducting the authentication exchange with the
> > target server, not some intermediate proxy that might be conducting a
> > hostile MITM attack.  pgpool may not be a hostile attack, but it is
> > acting as a MITM, and channel binding is going to figure that out and
> > fail the authentication.  So unless I'm misunderstanding, the solution
> > you are proposing figures to have a very limited shelf life.
> 
> We may be misunderstanding each other then. The solution I am
> proposing, or at least the solution that I imagine should be done but
> my pgpool-fu is limited, is to keep around connection context and SSL
> context to handle properly connections messages, and switch between
> them. When using channel binding, the client needs the server
> certificate for end-point and the TLS finish message which are both
> stored in the SSL context after the handshake. So you need to cache
> that for each server.

I'm not sure what you mean here- the whole point of channel binding is
to prevent exactly the kind of MITM that pgPool would essentially be in
this case.  If you're suggesting that things be changed such that a
server provides information necessary to pgPool to pass along to the
client to make the client believe that it's talking to the server and
that there's no man-in-the-middle then the whole point of channel
binding goes out the window.  If what you're suggesting doesn't require
any help from either the client or the server to work then I fear we've
done something wrong in the implementation of channel binding (I've not
looked at it very closely).

I don't have any perfect answers for pgPool here, unfortunately.  One
approach would be to give it a list of usernames/passwords to use,
though that's hardly ideal.  Perhaps it would be possible for pgPool to
impersonate the server to the client and the client to the server if it
was able to query the database as a superuser on some other connection,
but I don't *think* so because of the way SCRAM works and because pgPool
wouldn't have access to the client's cleartext password.

Of course, if pgPool could convince the client to use 'password' auth
and get the cleartext password from the client then that would work to
authenticate against the server and there wouldn't be any channel
binding involved since the client is talking the basic cleartext
password protocol and not SCRAM.

> One problem is that pgpool does not use directly libpq but tries to
> handle things by itself at protocol level, so it needs to replicate a
> lot of existing APIs. Postgres-XC/XL use a connection pooler,
> presumably XL uses even a background worker for this stuff since it
> has been integrated with Postgres 9.3. And this stuff use libpq. This
> makes life way easier to handle context data on a connection basis.
> Those don't need to handle protocol-level messages either, which is
> perhaps different from what pgpool needs.

I don't really think that pgpool using or not using libpq makes any
real difference here.  What'll be an issue for pgpool is when clients
get updated libpq libraries that don't support password auth...

> In short I would think that pgpool needs first to un-cheat its
> handling with MD5, which is likely here to simplify its code.

This also doesn't seem particularly relevant to me, but then again, I've
never actually looked at the pgPool code myself.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-06-30 Thread Stephen Frost
Peter, all,

* Peter Eisentraut (peter.eisentr...@2ndquadrant.com) wrote:
> On 6/30/17 04:08, Masahiko Sawada wrote:
> >> I'm not sure. I think this can be considered a bug in the implementation 
> >> for
> >> 10, and as such is "open for fixing". However, it's not a very critical bug
> >> so I doubt it should be a release blocker, but if someone wants to work on 
> >> a
> >> fix I think we should commit it.
> > 
> > I agree with you. I'd like to hear opinions from other hackers as well.
> 
> It's preferable to make it work.  If it's not easily possible, then we
> should prohibit it.
> 
> Comments from Stephen (original committer)?

I agree that it'd be preferable to make it work, but I'm not sure I can
commit to having it done in short order.  I'm happy to work to prohibit
it, but if someone has a few spare cycles to make it actually work,
that'd be great.

In short, I agree with Magnus and feel like I'm more-or-less in the same
boat as he is (though slightly jealous as that's not actually physically
the case, for I hear he has a rather nice boat...).

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] pg_terminate_backend can terminate background workers and autovacuum launchers

2017-06-23 Thread Stephen Frost
Alvaro, all,

* Alvaro Herrera (alvhe...@2ndquadrant.com) wrote:
> Yugo Nagata wrote:
> 
> > I tried to make it. Patch attached. 
> > 
> > It is easy and simple. Although I haven't tried for autovacuum worker,
> > I confirmed I can change other process' parameters without affecting others.
> > Do you want this in PG?
> 
> One advantage of this gadget is that you can signal backends that you
> own without being superuser, so +1 for the general idea.  (Please do
> create a fresh thread so that this can be appropriately linked to in
> commitfest app, though.)

Well, that wouldn't quite work with the proposed patch because the
proposed patch REVOKE's execute from public...

I'm trying to figure out how it's actually useful to be able to signal a
backend that you own about a config change that you *aren't* able to
make without being a superuser..  Now, if you could tell the other
backend to use a new value for some USERSET GUC, then *that* would be
useful and interesting.

I haven't got any particularly great ideas for how to make that happen
though.

> You need a much bigger patch for the autovacuum worker.  The use case I
> had in mind is that you have a maintenance window and can run fast
> vacuuming during it, but if the table is large and vacuum can't finish
> within that window then you want vacuum to slow down, without stopping
> it completely.  But implementing this requires juggling the
> stdVacuumCostDelay/Limit values during the reload, which are currently
> read at start of vacuuming only; the working values are overwritten from
> those during a rebalance.

Being able to change those values during a vacuuming run would certainly
be useful, I've had cases where I would have liked to have been able to
do just exactly that.  That's largely independent of this though.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Preliminary results for proposed new pgindent implementation

2017-06-19 Thread Stephen Frost
* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Stephen Frost  writes:
> > * Robert Haas (robertmh...@gmail.com) wrote:
> >> On Sat, Jun 17, 2017 at 5:41 PM, Peter Eisentraut
> >>  wrote:
> >>> On 6/16/17 10:51, Tom Lane wrote:
> >>>> So I'm back to the position that we ought to stick the indent
> >>>> code under src/tools/ in our main repo.  Is anyone really
> >>>> seriously against that?
> 
> >>> I think it would be better to have it separate.
> 
> >> +1.
> 
> > +1.
> 
> Given the license issues raised downthread, we have no choice in
> the short term.  So I have a request in to create a separate repo
> on git.postgresql.org (whose chain do I need to pull to get that
> approved, btw?)

u, that would probably be pginfra in some capacity, but I don't
recall seeing any notification of such a request.

I will follow up with those responsible,  #blamemagnus

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] Preliminary results for proposed new pgindent implementation

2017-06-19 Thread Stephen Frost
* Robert Haas (robertmh...@gmail.com) wrote:
> On Sat, Jun 17, 2017 at 5:41 PM, Peter Eisentraut
>  wrote:
> > On 6/16/17 10:51, Tom Lane wrote:
> >> So I'm back to the position that we ought to stick the indent
> >> code under src/tools/ in our main repo.  Is anyone really
> >> seriously against that?
> >
> > I think it would be better to have it separate.
> 
> +1.

+1.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] RLS policy not getting honer while pg_dump on declarative partition

2017-06-19 Thread Stephen Frost
Amit,

* Amit Langote (langote_amit...@lab.ntt.co.jp) wrote:
> On 2017/06/17 9:20, Stephen Frost wrote:
> > I think we could certainly consider if this behavior is desirable in a
> > system which includes partitioning instead of inheritance,
> 
> Do we want CREATE POLICY foo ON parent creating the policy objects for all
> the partitions?  That is, cascade the policy object definition?

That seems to be what the OP is suggesting we should have.  I'm
certainly not convinced that we actually do want that though.  There are
clear reasons why it doesn't make sense for inheritance that aren't an
issue for partitions (for one thing, we don't have to worry about the
visible columns being different between the partitioned table and the
partitions), but that doesn't necessairly mean we want to make this
different for partitions vs. inheritance.

In any case though, I do tend to feel that it's rather too late to
consider changing things for PG10 in this area, even if we all felt that
it was the right/correct thing to do, which isn't clear.

> > but if we
> > wish to do so then I think we should be considering if the GRANT system
> > should also be changed as I do feel the two should be consistent.
> 
> IIUC, you are saying here that GRANT should be taught to cascade the
> permission grant/revokes to partitions.

What I'm saying here is that the way GRANT works and the way policies
are applied to partitioned tables should be consistent with each other.

> Also, the somewhat related nearby discussion about dumping the partition
> data through the root parent will perhaps have to think about some of
> these things.  Dumping data through the root table will take care of the
> problem that Rushabh is complaining about, because only rows visible per
> the parent's policies would be dumped.  Of course then the the set of rows
> dumped will be different from what it is today, because one would expect
> that a different set of policies would get applied - the root table's
> policies when dumping data through it vs. the individual partitions'
> policies when dumping data per partition.

While somewhat related, I don't think we should allow concerns about
pg_dump to drive what we're doing in the backend.  If pg_dump isn't
smart enough to allow different ways to dump the data out given the
considerations for what the backend supports/does, then we should add
new features to pg_dump, not reconsider what we're doing in the backend.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] RLS policy not getting honer while pg_dump on declarative partition

2017-06-16 Thread Stephen Frost
Greetings,

* Rushabh Lathia (rushabh.lat...@gmail.com) wrote:
> While doing some testing I noticed that RLS policy not getting honer
> while pg_dump on declarative partition.
> 
> I can understand that while doing SELECT on individual child
> table, policy of parent is not getting applied. But is this desirable
> behaviour? I think for partitions, any policy on the root table should
> get redirect to the child, thoughts?
> 
> If current behaviour is desirable then atleast we should document this.

The current behaviour matches how the GRANT system works, unless it's
been changed as part of the partitioning patches, we don't check the
privileges on tthe parent to see if an individual has access to the
child.

I think we could certainly consider if this behavior is desirable in a
system which includes partitioning instead of inheritance, but if we
wish to do so then I think we should be considering if the GRANT system
should also be changed as I do feel the two should be consistent.

Thinking it through a bit though, I would imagine someone certainly
might want to GRANT access to a given partition and not others, though
that could actually be done with an appropriate RLS policy on the
parent, but until we improve the performance of constraint exclusion (or
change entirely how all of that works with partitions...), I'm not sure
that's a practical answer in all cases.  It might also be the case that
one would wish for different policies to be used when a user is
accessing a table directly vs. going through the parent.

Thanks!

Stephen


signature.asc
Description: Digital signature


Re: [HACKERS] WIP: Data at rest encryption

2017-06-15 Thread Stephen Frost
Bruce,

* Bruce Momjian (br...@momjian.us) wrote:
> On Thu, Jun 15, 2017 at 07:27:55PM -0400, Stephen Frost wrote:
> > I expect the same would happen with the shell-command approach suggested
> > up-thread and the prompt-on-stdin approach too, they aren't great but I
> > expect users would still use the feature.  As Robert and I have
> > mentioned, there is a good bit of value to having this feature simply
> > because it avoids the need to get someone with root privileges to set up
> > an encrypted volume and I don't think having to use a shell command or
> > providing the password on stdin at startup really changes that very
> > much.
> 
> Understood, but now you are promoting a feature with an admittedly-poor
> API, duplication of an OS feature, and perhaps an invasive change to the
> code.  Those are high hurdles.

Of those, the only one that worries me, at least, is that it might be an
invasive and difficult to maintain code change.  As Robert said, and I
agree with, "duplication of an OS feature" is something we pretty
routinly, and justifiably, do.  The poor interface is unfortunate, but
if it's consistent with what we have today for a similar feature then
I'm really not too upset with it.  If we can do better, great, I'm all
for that, but if not, then I'd rather have the feature with the poor
interface than not have it at all.

If it's an invasive code change or one which ends up being difficult to
maintain, then that's a problem.  Getting some focus on that aspect
would be great and I certainly appreciate Robert's initial review and
commentary on it.

Thanks!

Stephen


signature.asc
Description: Digital signature


  1   2   3   4   5   6   7   8   9   10   >