Re: [RFC] [PATCH] Flexible "partition pruning" hook

2019-02-25 Thread Michael Paquier
On Tue, Feb 26, 2019 at 06:55:30AM +, Tsunakawa, Takayuki wrote:
> What concrete problems would you expect this patch to solve?  What
> kind of extensions do you imagine?  I'd like to hear about the
> examples.  For example, "PostgreSQL 12 will not be able to filter
> out enough partitions when planning/executing SELECT ... WHERE
> ... statement.  But an extension like this can extract just one
> partition early."

Indeed.  Hooks should be defined so that their use is as generic as
possible in their context, particularly since there is already a
planner hook.  It is also important to consider first whether the
existing core code can be made better to meet the requirements,
removing the need for a hook in the end.
--
Michael


signature.asc
Description: PGP signature


Re: Unused parameters & co in code

2019-02-25 Thread Michael Paquier
On Thu, Jan 31, 2019 at 03:47:59PM +0900, Michael Paquier wrote:
> - 0001 cleans up port in SendAuthRequest.  This one is disappointing,
> so I am fine with discarding it.
> - 0002 works on _bt_relbuf, whose callers don't actually benefit from
> the cleanup as the relation worked on is always used for different
> reasons, so it can be discarded.
> - 0003 works on the code of GIN, which at least simplifies the code,
> so it could be applied.  It removes more lines than it changes.
> - 0004 also cleans some code for clause parsing, with a net reduction
> in lines.
> - 0005 is for pg_rewind, which is some stuff I introduced, so I'd like
> to clean up my mess and the change is recent :)
> - 0006 is for tablecmds.c, and something I would like to apply because
> it reduces some confusion with some recursion arguments which are not
> needed for constraint handling and inheritance.  Most of the complaints
> come from lockmode not being used, but all the AtPrep and AtExec
> routines are rather symmetric, so I am not bothering about that.

A note for the archives: I have committed 0005 as 6e52209e because it
was directly something I worked on.  I have dropped the rest, as I am
not sure whether those arguments could be useful in the future.
--
Michael




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Laurenz Albe
Fujii Masao wrote:
> Another idea is to improve the exclusive backup method so that it never
> causes such an issue.  What about changing the exclusive backup mode of
> pg_start_backup() so that it creates something like a backup_label.pending file
> instead of backup_label?  Then if the database cluster has a backup_label.pending
> file but no recovery.signal (this is the case where the database is recovered
> just after the server crashes while an exclusive backup is in progress),
> the recovery using that database cluster always ignores
> (or removes) the backup_label.pending file and starts replaying WAL from
> the REDO location that the pg_control file indicates.  So this idea enables
> us to work around the issue that an exclusive backup can cause.

Then if you restore a backup, but forget to add the recovery.signal file,
PostgreSQL will happily recover from a wrong checkpoint and you end up with
a corrupted database.

I think the fundamental problem with all these approaches is that there is
no safe way to distinguish a server crashed in backup mode from a restored
backup.  This is what makes the problem so hard.

The existing exclusive backup is in my opinion the safest variant: it refuses
to create a corrupted cluster without manual intervention and gives you a dire
warning to consider whether you are doing the right thing.

Yours,
Laurenz Albe




RE: [RFC] [PATCH] Flexible "partition pruning" hook

2019-02-25 Thread Tsunakawa, Takayuki
From: Mike Palmiotto [mailto:mike.palmio...@crunchydata.com]
> Attached is a patch which attempts to solve a few problems:
> 
> 1) Filtering out partitions flexibly based on the results of an external
> function call (supplied by an extension).
> 2) Filtering out partitions from pg_inherits based on the same function
> call.
> 3) Filtering out partitions from a partitioned table BEFORE the partition
> is actually opened on-disk.

What concrete problems would you expect this patch to solve?  What kind of 
extensions do you imagine?  I'd like to hear about examples.  For example, 
"PostgreSQL 12 will not be able to filter out enough partitions when 
planning/executing a SELECT ... WHERE ... statement.  But an extension like 
this can extract just one partition early."

Would this help the following issues with PostgreSQL 12?

* UPDATE/DELETE planning takes time in proportion to the number of partitions, 
even when only one partition is actually accessed during query execution.

* Making a generic plan takes a prohibitively long time (IIRC, about 12 seconds 
when the number of partitions is 1,000 or 8,000).


Regards
Takayuki Tsunakawa







Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

On 2/26/19 6:51 AM, Michael Paquier wrote:
> On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
>> Here's the really obvious bad thing: if users do not update their procedures
>> and we ignore backup_label.pending on startup then they will end up with a
>> corrupt database because it will not replay from the correct checkpoint.  If
>> we error on the presence of backup_label.pending then we are right back to
>> where we started.
>
> Not really.  If we error on backup_label.pending, we can tell the
> difference between a backend which has crashed in the middle of an
> exclusive backup without replaying anything and a backend which is
> started based on a base backup, so an operator can take some action to
> see what's wrong with the server.  If you issue an error, users can
> also see that their custom backup script is wrong because they forgot
> to rename the flag after taking a backup of the data folder(s).


The operator still has a decision to make, manually, just as they do 
now.  The wrong decision may mean a corrupt database.


Here's the scenario:

1) They do a restore, forget to rename backup_label.pending.
2) Postgres won't start, which is the same action we take now.
3) The user is not sure what to do, rename or delete?  They delete, and 
the cluster is corrupted.


Worse, they have scripted the deletion of backup_label so that the 
cluster will restart on crash.  This is the recommendation from our 
documentation after all.  If that script runs after a restore instead of 
a crash, then the cluster will be corrupt -- silently.


--
-David
da...@pgmasters.net



Re: Reaping Temp tables to avoid XID wraparound

2019-02-25 Thread Michael Paquier
On Fri, Feb 22, 2019 at 04:01:02PM +0100, Magnus Hagander wrote:
> I did the "insert column in the middle of pg_stat_get_activity", I'm not
> sure that is right -- how do we treat that one? Do we just append at the
> end because people are expected to use the pg_stat_activity view? It's a
> nontrivial part of the patch.

I think that it would be more confusing to add the new column at the
tail, after all the SSL fields.

> That one aside, does the general way to track it appear reasonable? (docs
> excluded until we have agreement on that)

It does.  A temp table is associated with a session, so it's not like
autovacuum can work on it.  With this information it is at least
possible to take action.  We could even get autovacuum to kill such
sessions. /me hides

> And should we also expose the oid in pg_stat_activity in this case, since
> we have it?

For the case reported here, just knowing the XID at which the temporary
namespace was created is enough, as the goal is to kill the
session associated with it.  Still, it seems to me that knowing the
temporary schema name used by a given session is useful, and that's
cheap to get as the information is already there.

One problem that I can see with your patch is that you would set the
XID once any temporary object is created, including when objects other
than tables, such as functions, are created in pg_temp.  Those do not
really matter for wraparound monitoring.  Still, the patch is simple.
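To make the monitoring angle concrete, a script that kills sessions holding an
old temporary namespace could look like the sketch below.  Note that temp_xid
is a hypothetical column name standing in for whatever the patch ends up
exposing in pg_stat_activity; nothing here is part of the current catalogs.

```python
# Sketch only: "temp_xid" is a hypothetical pg_stat_activity column standing
# in for the XID-at-temp-namespace-creation info discussed in this thread.
KILL_OLD_TEMP_SESSIONS = """
    SELECT pid, pg_terminate_backend(pid)
      FROM pg_stat_activity
     WHERE temp_xid IS NOT NULL
       AND age(temp_xid) > %s
"""

def kill_sessions_with_old_temp_namespace(conn, max_xid_age=200000000):
    """Terminate sessions whose temp namespace pins a dangerously old XID.

    conn is any DB-API connection, e.g. one from psycopg2.connect().
    """
    with conn.cursor() as cur:
        cur.execute(KILL_OLD_TEMP_SESSIONS, (max_xid_age,))
        return [row[0] for row in cur.fetchall()]  # pids that were terminated
```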
--
Michael




RE: reloption to prevent VACUUM from truncating empty pages at the end of relation

2019-02-25 Thread Tsunakawa, Takayuki
From: Masahiko Sawada [mailto:sawada.m...@gmail.com]
> This test expects that the inserted tuple is always reclaimed by
> the subsequent vacuum, but that's not always true if there are concurrent
> transactions.  So the size of the reloptions_test table will not be 0 if
> the tuple is not vacuumed.  In my environment this test sometimes
> failed with 'make check -j 4'.

Hmm, "make check -j4" certainly fails on my poor VM, too.

Modifying src/test/regress/parallel_schedule to put the reloptions test on a 
separate line seems to have fixed this issue.  Do you think this is the correct 
remedy?


Regards
Takayuki Tsunakawa






Re: No-rewrite timestamp<->timestamptz conversions

2019-02-25 Thread Noah Misch
On Thu, Feb 05, 2015 at 08:36:18PM -0500, Noah Misch wrote:
> On Tue, Nov 05, 2013 at 05:02:58PM -0800, Josh Berkus wrote:
> > I'd also love some way of doing a no-rewrite conversion between
> > timestamp and timestamptz, based on the assumption that the original
> > values are UTC time.  That's one I encounter a lot.
> 
> It was such a conversion that motivated me to add the no-rewrite ALTER TABLE
> ALTER TYPE support in the first place.  Interesting.  Support for it didn't
> end up in any submitted patch due to a formal problem: a protransform function
> shall only consult IMMUTABLE facts, but we posit that timezone==UTC is a
> STABLE observation.  However, a protransform function can easily simplify the
> immutable expression "tscol AT TIME ZONE 'UTC'", avoiding a rewrite.  See
> attached patch.

This (commit b8a18ad) ended up causing wrong EXPLAIN output and wrong
indxpath.c processing.  Hence, commit c22ecc6 neutralized the optimization;
see that commit's threads for details.  I pondered ways to solve those
problems, but I didn't come up with anything satisfying for EXPLAIN.  (One
dead-end thought was to introduce an ExprShortcut node having "Node
*semantics" and "Node *shortcut" fields, where "semantics" is deparsed for
EXPLAIN and "shortcut" is actually evaluated.  That would require teaching
piles of code about the new node type, which isn't appropriate for the benefit
in question.)

Stepping back a bit, commit b8a18ad didn't provide a great UI.  I doubt folks
write queries this way spontaneously; to do so, they would have needed to
learn that such syntax enables this optimization.  If I'm going to do
something more invasive, it should optimize the idiomatic "alter table t alter
timestamptzcol type timestamp".  One could do that with a facility like
SupportRequestSimplify except permitted to consider STABLE facts.  I suppose I
could add a volatility field to SupportRequestSimplify.  So far, I can't think
of a second use case for such a facility, so instead I think
ATColumnChangeRequiresRewrite() should have a hard-wired call for
F_TIMESTAMPTZ_TIMESTAMP and F_TIMESTAMP_TIMESTAMPTZ.  Patch attached.  If we
find more applications of this concept, it shouldn't be hard to migrate this
logic into SupportRequestSimplify.  Does anyone think that's better to do from
the start?

Thanks,
nm
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35bdb0e..74dd032 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -95,6 +95,7 @@
 #include "utils/ruleutils.h"
 #include "utils/snapmgr.h"
 #include "utils/syscache.h"
+#include "utils/timestamp.h"
 #include "utils/typcache.h"
 
 
@@ -9634,11 +9635,15 @@ ATPrepAlterColumnType(List **wqueue,
  * When the data type of a column is changed, a rewrite might not be required
  * if the new type is sufficiently identical to the old one, and the USING
  * clause isn't trying to insert some other value.  It's safe to skip the
- * rewrite if the old type is binary coercible to the new type, or if the
- * new type is an unconstrained domain over the old type.  In the case of a
- * constrained domain, we could get by with scanning the table and checking
- * the constraint rather than actually rewriting it, but we don't currently
- * try to do that.
+ * rewrite in these cases:
+ *
+ * - the old type is binary coercible to the new type
+ * - the new type is an unconstrained domain over the old type
+ * - {NEW,OLD} or {OLD,NEW} is {timestamptz,timestamp} and the timezone is UTC
+ *
+ * In the case of a constrained domain, we could get by with scanning the
+ * table and checking the constraint rather than actually rewriting it, but we
+ * don't currently try to do that.
  */
 static bool
 ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno)
@@ -9660,6 +9665,23 @@ ATColumnChangeRequiresRewrite(Node *expr, AttrNumber varattno)
return true;
expr = (Node *) d->arg;
}
+   else if (IsA(expr, FuncExpr))
+   {
+   FuncExpr   *f = (FuncExpr *) expr;
+
+   switch (f->funcid)
+   {
+   case F_TIMESTAMPTZ_TIMESTAMP:
+   case F_TIMESTAMP_TIMESTAMPTZ:
if (TimestampTimestampTzRequiresRewrite())
+   return true;
+   else
+   expr = linitial(f->args);
+   break;
+   default:
+   return true;
+   }
+   }
else
return true;
}
diff --git a/src/backend/utils/adt/timestamp.c b/src/backend/utils/adt/timestamp.c
index e0ef2f7..1b0effa 100644
--- a/src/backend/utils/adt/timestamp.c

Re: Segfault when restoring -Fd dump on current HEAD

2019-02-25 Thread Michael Paquier
On Tue, Feb 26, 2019 at 12:16:35AM -0500, Tom Lane wrote:
> Well, if we didn't want to fix this, a reasonable way to go about
> it would be to bump the archive version number in pg_dump output,
> so that old versions would issue a useful complaint instead of crashing.
> However, I repeat that this patch was sold as a notational improvement,
> not something that was going to break format compatibility.  I think if
> anyone had mentioned the latter, there would have been push-back against
> its being committed at all.  I am providing such push-back right now,
> because I don't think we should break file compatibility for this.

While I agree that the patch makes handling of the different fields in
archive entries cleaner, I also agree that this alone is not enough to
justify a dump version bump.

> I think this patch needs to be worked over so that what it writes
> is exactly what was written before.  If the author is unwilling
> to do that PDQ, it should be reverted.

Works for me.  With a quick read of the code, it seems to me that it
is possible to keep compatibility while keeping the simplifications
around ArchiveEntry()'s refactoring.  Alvaro?
--
Michael




Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
David Rowley  writes:
> Using the attached patch (as text file so as not to upset the CFbot),
> which basically just measures and logs the time taken to run
> pg_plan_query. ...
> Surprisingly it took 1.13% longer.  I did these tests on an AWS
> md5.large instance.

Interesting.  Seems to suggest that maybe the cases I discounted
as being infrequent aren't so infrequent?  Another possibility
is that the new coding adds more cycles to foreach() loops than
I'd hoped for.

Anyway, it's just a POC; the main point at this stage is to be
able to make such comparisons at all.  If it turns out that we
*can't* make this into a win, then all that bellyaching about
how inefficient Lists are was misinformed ...

regards, tom lane



Re: [HACKERS] generated columns

2019-02-25 Thread Michael Paquier
On Mon, Feb 25, 2019 at 09:46:35PM +0100, Peter Eisentraut wrote:
> The virtual generated column part is still a bit iffy.  I'm still
> finding places here and there where virtual columns are not being
> expanded correctly.  Maybe it needs more refactoring.  One big unsolved
> issue is how the storage of such columns should work.  Right now, they
> are stored as nulls.  That works fine, but what I suppose we'd really
> want is to not store them at all.  That, however, creates all kinds of
> complications in the planner if target lists have non-matching lengths
> or the resnos don't match up.  I haven't figured out how to do this
> cleanly.

Hmm.  Not storing virtual columns looks like the correct approach to
me, rather than storing them as NULL.

> So I'm thinking if we can get agreement on the stored columns, I can cut
> out the virtual column stuff for PG12.  That should be fairly easy.

The shape of what is used for stored columns looks fine to me.

+   if (attgenerated)
+   {
+   /*
+* Generated column: Dropping anything that the generation expression
+* refers to automatically drops the generated column.
+*/
+   recordDependencyOnSingleRelExpr(, expr, RelationGetRelid(rel),
+   DEPENDENCY_AUTO,
+   DEPENDENCY_AUTO, false);
+   }
I think a CCI is not necessary here; still, the recent thread about
automatic dependencies with identity columns had a very similar
pattern...

+   else if (generated[0] == ATTRIBUTE_GENERATED_VIRTUAL)
+   default_str = psprintf("generated always as (%s)", 
PQgetvalue(res, i, attrdef_col));
Nit: I would add VIRTUAL instead of relying on the default option.

Another thing I was thinking about: could it be possible to add a
sanity check in sanity_check.sql so that a column cannot have more than
one of attidentity, attgenerated and atthasdef set at the same time?
--
Michael




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Mark Kirkwood



On 26/02/19 5:41 PM, Stephen Frost wrote:
> Greetings Mark,
>
> * Mark Kirkwood (mark.kirkw...@catalyst.net.nz) wrote:
>> ISTM that the onus should be on the patch submitter to provide additions to
>> pg_basebackup that make it as painless as possible for those people *not*
>> using pgBackRest to continue making backups. Breaking this is just not
>> right. Submitting patches that mean that people *must* use pgBackRest is
>> also not right IMHO.
>
> I'm sorry that there's some confusion here: to be clear, no one is
> required to use pgBackRest.  pg_basebackup works quite well and wouldn't
> be impacted by the changes proposed on this thread.  The arguments
> against removing the exclusive backup feature don't have anything to do
> with pg_basebackup.

Ah yes (checks the pg_basebackup code), you are correct! Reading this thread 
I thought I saw a comment to the effect that pg_basebackup was being 
broken, hence the less than impressed post.


Your relentless bashing of people doing their own backups and heavy 
marketing of pgBackRest - unfortunately - made it easy for me to believe 
that this was a possibility you might see as ok. So - apologies for 
the misunderstanding; however, less marketing of your own product would 
help me avoid jumping to the wrong conclusion.


regards

Mark




Re: Segfault when restoring -Fd dump on current HEAD

2019-02-25 Thread Tom Lane
Michael Paquier  writes:
> On Mon, Feb 25, 2019 at 11:20:14AM -0500, Tom Lane wrote:
>> It appears to me that f831d4acc required a good deal more adult
>> supervision than it actually got.  That was alleged to be a small
>> notational refactoring, not a redefinition of what gets put into
>> dump files.

> How consistent do we need to be for custom dump archives
> regarding backward and upward compatibility?

Well, if we didn't want to fix this, a reasonable way to go about
it would be to bump the archive version number in pg_dump output,
so that old versions would issue a useful complaint instead of crashing.
However, I repeat that this patch was sold as a notational improvement,
not something that was going to break format compatibility.  I think if
anyone had mentioned the latter, there would have been push-back against
its being committed at all.  I am providing such push-back right now,
because I don't think we should break file compatibility for this.

Also, I'm now realizing that 4dbe19690 was probably not fixing an
aboriginal bug, but something that this patch introduced, because
we'd likely have noticed it before if the owner field could have
been a null pointer all along.  How much do you want to bet on
whether there are other such bugs now, that we haven't yet tripped
over?

I think this patch needs to be worked over so that what it writes
is exactly what was written before.  If the author is unwilling
to do that PDQ, it should be reverted.

regards, tom lane



Re: [HACKERS] generated columns

2019-02-25 Thread Michael Paquier
On Mon, Feb 25, 2019 at 09:51:52PM +0100, Peter Eisentraut wrote:
> On 2019-01-15 08:13, Michael Paquier wrote:
>> +   boolhas_generated_stored;
>> +   boolhas_generated_virtual;
>>  } TupleConstr;
>> Could have been more simple to use a char as representation here.
> 
> Seems confusing if both apply at the same time.

Ouch, I see.  The flags count for all attributes.  I missed that in a
previous read of the patch.  Yeah, two booleans make sense.

>> When testing a bulk INSERT into a table which has a stored generated
>> column, memory keeps growing in size linearly, which does not seem
>> normal to me.
> 
> This was a silly coding error.  It's fixed in v8.

Thanks, this one looks fine.

>> +/*
>> + * Thin wrapper around libpq to obtain server version.
>> + */
>> +static int
>> +libpqrcv_server_version(WalReceiverConn *conn)
>> This should be introduced in separate patch in my opinion (needed
>> afterwards for logirep).
> 
> Yes, it could be committed separately.

I would split that one and I think that it could go in.  If you wish
to keep things grouped that's fine by me as well.

>> What about the catalog representation of attgenerated?  Would it merge
>> with attidentity & co?  Or not?
> 
> I think the way it is now seems best.  The other options that were
> discussed are also plausible, but that the discussions did not reveal
> any overwhelming arguments for a a change.

Hm.  Does the SQL standard mention more features which could be merged
with stored values, virtual values, default expressions and identity
columns?  I don't know the latest trends in this area, but I am wondering
if there are any other column-specific, expression-like features
associated with a column.  That could give more strength to having one
column in pg_attribute govern them all, assuming of course that
something else is implemented.  That's a lot of assumptions, and it's
not like the current implementation is wrong either.
--
Michael




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Michael Paquier
On Mon, Feb 25, 2019 at 08:17:27PM +0200, David Steele wrote:
> Here's the really obvious bad thing: if users do not update their procedures
> and we ignore backup_label.pending on startup then they will end up with a
> corrupt database because it will not replay from the correct checkpoint.  If
> we error on the presence of backup_label.pending then we are right back to
> where we started.

Not really.  If we error on backup_label.pending, we can tell the
difference between a backend which has crashed in the middle of an
exclusive backup without replaying anything and a backend which is
started based on a base backup, so an operator can take some action to
see what's wrong with the server.  If you issue an error, users can
also see that their custom backup script is wrong because they forgot
to rename the flag after taking a backup of the data folder(s).

The real pain point with the non-exclusive API is that we need to keep
the connection open.  That can be done easily using drivers like
psycopg2, but it is far from straightforward for the average enterprise
user, and it can be hard for some companies to complicate a software
stack with more external dependencies.  So yes, while I can see the
long-term benefits of non-exclusive backups, they are much more
complicated to use than exclusive backups.
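For the archives, the connection-holding dance looks roughly like the sketch
below when driven from Python.  The copy step and all names are illustrative,
and the session must survive from start to stop or the backup is aborted.

```python
# Non-exclusive base backup sketch (PostgreSQL 9.6-or-later syntax).
# "conn" is any DB-API connection (e.g. from psycopg2.connect()); the
# caller supplies copy_data_dir(), e.g. an rsync/tar wrapper for $PGDATA.
START_BACKUP = "SELECT pg_start_backup(%s, false, false)"  # third arg: exclusive => false
STOP_BACKUP = "SELECT labelfile, spcmapfile FROM pg_stop_backup(false)"

def non_exclusive_backup(conn, label, copy_data_dir):
    with conn.cursor() as cur:
        cur.execute(START_BACKUP, (label,))
        copy_data_dir()            # copy the cluster while backup mode is active
        cur.execute(STOP_BACKUP)   # must run on the SAME session as the start
        labelfile, spcmapfile = cur.fetchone()
    # Store these files with the backup itself, never in the live data directory.
    return labelfile, spcmapfile
```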
--
Michael




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings Mark,

* Mark Kirkwood (mark.kirkw...@catalyst.net.nz) wrote:
> ISTM that the onus should be on the patch submitter to provide additions to
> pg_basebackup that make it as painless as possible for those people *not*
> using pgBackRest to continue making backups. Breaking this is just not
> right. Submitting patches that mean that people *must* use pgBackRest is
> also not right IMHO.

I'm sorry that there's some confusion here: to be clear, no one is
required to use pgBackRest.  pg_basebackup works quite well and wouldn't
be impacted by the changes proposed on this thread.  The arguments
against removing the exclusive backup feature don't have anything to do
with pg_basebackup.

Thanks!

Stephen




Re: Segfault when restoring -Fd dump on current HEAD

2019-02-25 Thread Michael Paquier
On Mon, Feb 25, 2019 at 11:20:14AM -0500, Tom Lane wrote:
> It appears to me that f831d4acc required a good deal more adult
> supervision than it actually got.  That was alleged to be a small
> notational refactoring, not a redefinition of what gets put into
> dump files.

How consistent do we need to be for custom dump archives
regarding backward and upward compatibility?  For dumps, we give no
guarantee that a dump taken with pg_dump on version N will be
compatible with a backend at version (N+1), so there is an effort
toward backward compatibility, not really upward compatibility.

It seems to me that f831d4acc should have bumped at least
MAKE_ARCHIVE_VERSION, as it changes the dump contents; still, that
seems like a lot for some refactoring.  FWIW, I have gone through the
commit's thread and I actually agree that instead of a mix of empty
strings and NULL, using only NULL is cleaner.
--
Michael




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Robert,

* Robert Haas (robertmh...@gmail.com) wrote:
> Hmm, so what you're saying is that you'd like to disable an API that
> some non-backrest users are relying upon but which no backrest users
> are relying upon.  And you don't understand why some non-backrest
> users are opposed to that plan.  Is that a correct summary of your
> position?

This topic has come up more than once and on multiple threads (entirely
independent of pgbackrest and any of the other individuals on this
thread) and I think it needs to be a discussion held at the PGCon
Developer meeting.

I'll propose it.

If you'd like to propose something to fix the much-discussed dangerous
exclusive backup method, or to further litigate at what point a
dangerous and misleading interface should be removed, I'm sure you'll
find we'd be happy to talk about that.

Thanks,

Stephen




Re: Pluggable Storage - Andres's take

2019-02-25 Thread Amit Khandekar
On Sat, 23 Feb 2019 at 01:22, Robert Haas  wrote:
>
> On Fri, Feb 22, 2019 at 11:19 AM Amit Khandekar  
> wrote:
> > Thanks for the review. Attached v2.
>
> Thanks.  I took this, combined it with Andres's
> v12-0040-WIP-Move-xid-horizon-computation-for-page-level-.patch, did
> some polishing of the code and comments, and pgindented.  Here's what
> I ended up with; see what you think.

Thanks, Robert!  The changes look good.


-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 4:38 PM David Steele  wrote:
> > FWIW, if you weren't selling backrest quite so hard everywhere backups
> > are mentioned, I'd find this thread a lot more convicing.
>
> pgBackRest has not used exclusive backups since the new API was
> introduced in 9.6 so this is not an issue for our users.
>
> Over time we have contributed back to Postgres in areas we thought could
> be improved based on our work on the pgBackRest project: 6ad8ac60,
> 9fe3c644, 017e4f25, 78874531, 449338cc, 98267ee8, 8694cc96, 920a5e50,
> c37b3d08, 5fc1670b, b981df4c.  This does not include the various backup
> related patches that we have reviewed.
>
> If promoting pgBackRest were our primary concern then it would be in our
> interest to allow Postgres exclusive backups to stay broken and
> pg_basebackup to be as primitive as possible.

Hmm, so what you're saying is that you'd like to disable an API that
some non-backrest users are relying upon but which no backrest users
are relying upon.  And you don't understand why some non-backrest
users are opposed to that plan.  Is that a correct summary of your
position?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Tue, Feb 26, 2019 at 8:38 AM John Naylor  wrote:
>
> On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> > I will look into it today and respond back with my findings.  John,
> > see if you also get time to investigate this.
>
> Will do, but it will likely be tomorrow
>

No problem, thanks!

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 11:14 PM Tom Lane  wrote:
>
> I wrote:
> > Amit Kapila  writes:
> >> Avoid creation of the free space map for small heap relations, take 2.
>
> > I think this patch still has some issues.
>
> Just out of curiosity ... how can it possibly be even a little bit sane
> that fsm_local_map is a single static data structure, without even any
> indication of which table it is for?
>

It is just for finding a free block among the few blocks of a relation.
We clear it when we have found a block with enough free space, when
we extend the relation, or on transaction abort.  So I think we don't
need any per-table information.

> If, somehow, there's a rational argument for that design, why is it
> not explained in freespace.c?  The most charitable interpretation of
> what I see there is that it's fatally undercommented.
>

There is some explanation in freespace.c and in the README
(src/backend/storage/freespace/README).  I think we should add some
more information where this data structure (FSMLocalMap) is declared.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread John Naylor
On Tue, Feb 26, 2019 at 8:59 AM Amit Kapila  wrote:
> I will look into it today and respond back with my findings.  John,
> see if you also get time to investigate this.

Will do, but it will likely be tomorrow.

-- 
John Naylorhttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: converting Lists into arrays

2019-02-25 Thread David Rowley
On Sun, 24 Feb 2019 at 15:24, Tom Lane  wrote:
> I haven't attempted any performance measurements on this, but at
> least in principle it should speed things up a bit, especially
> for complex test cases involving longer Lists.  I don't have any
> very suitable test cases at hand, anyway.

I've not yet looked at the code, but I thought I'd give this a quick benchmark.

Using the attached patch (as a text file so as not to upset the CFbot),
which basically just measures and logs the time taken to run
pg_plan_query, I ran make installcheck 3 times unpatched
and the same again with the patch. I pulled the results of each run
into a spreadsheet, took the average of each of the 3 runs, then took
the sum of the total average planning time over the 20334 individual
results.

Results patched atop of 067786cea:

Total average time unpatched:  0.739808667 seconds
Total average time patched:  0.748144333 seconds.

Surprisingly it took 1.13% longer.  I did these tests on an AWS
md5.large instance.

If required, I can send the entire spreadsheet. It's about 750 KB.
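For anyone wanting to repeat the aggregation without a spreadsheet, the
computation described above amounts to the sketch below (the log format is
taken from the patch's elog() call; file handling is omitted).

```python
# Average the per-query "PlannerTime: <seconds>" log lines across runs,
# then sum the averages.  log_texts holds one log's contents per run.
import re

PLANNER_TIME = re.compile(r"PlannerTime: ([0-9.]+)")

def total_average_planning_time(log_texts):
    runs = [[float(m.group(1)) for m in PLANNER_TIME.finditer(text)]
            for text in log_texts]
    n = min(len(run) for run in runs)   # align on the common query count
    return sum(sum(run[i] for run in runs) / len(runs) for i in range(n))
```

Running this over the three unpatched logs and the three patched logs yields
the two totals compared above.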

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index 8b4d94c9a1..437256a35f 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -21,6 +21,7 @@
 
 #include 
 #include 
+#include <time.h>
 #include 
 #include 
 #include 
@@ -965,7 +966,15 @@ pg_plan_queries(List *querytrees, int cursorOptions, ParamListInfo boundParams)
}
else
{
+   clock_t start, end;
+
+   start = clock();
+
stmt = pg_plan_query(query, cursorOptions, boundParams);
+
+   end = clock();
+
+   elog(LOG, "PlannerTime: %f", (double)(end - start) / CLOCKS_PER_SEC);
}
 
stmt_list = lappend(stmt_list, stmt);


RE: SQL statement PREPARE does not work in ECPG

2019-02-25 Thread Takahashi, Ryohei
Hi Matsumura-san,


Thank you for your explaining.

> An important point of the route is that it calls PQprepare() and PQprepare()
> needs type-Oid list. (Idea-1) If we fix for Prepare statement with AS clause 
> and
> with parameter list to walk through the route, preprocessor must parse the 
> parameter list and
> preprocessor or ecpglib must make type-Oid list. I think it's difficult.
> Especially, I wonder if it can treat user defined type and complex structure 
> type.

I agree.
In the case of standard types, ECPG can get OIDs from pg_type.h.
However, in the case of user-defined types, ECPG would need to query the
pg_type table, which adds overhead.

Therefore, the second idea seems better.


By the way, should we support prepare statements like the following?
(I think yes.)


EXEC SQL PREPARE test_prep (int) AS SELECT id from test_table where id = :ID or 
id =$1;


Current ECPG produces following code.


ECPGdo(__LINE__, 0, 1, NULL, 0, ECPGst_normal, "prepare \"test_prep\" ( int ) 
as \" select id from test_table where id = $1  or id = $1 \"",
ECPGt_int,&(ID),(long)1,(long)1,sizeof(int),
ECPGt_NO_INDICATOR, NULL , 0L, 0L, 0L, ECPGt_EOIT, ECPGt_EORT);



In this case, both ":ID" and "$1" in the original statement are converted to 
"$1" and ECPGdo() cannot distinguish them.
Therefore, ECPG should produce different code.

For example, 
- ECPG converts ":ID" to "$1" and "$1" in the original statement to "$$1"
- next_insert() does not check "$$1"
- ECPGdo() reconverts "$$1" to "$1"



Regards,
Ryohei Takahashi




Re: JIT on FreeBSD ARMv7

2019-02-25 Thread Andrew Gierth
> "Andrew" == Andrew Gierth  writes:

 Andrew> This seems to be the cleanest way to get a JIT build working on
 Andrew> FreeBSD on the armv7 platform. Without this, it fails because
 Andrew> the ABI functions __aeabi_ldivmod and so on are not found by
 Andrew> the symbol lookup for JITted code.

Turns out it needs more symbols, as per the attached.

-- 
Andrew (irc:RhodiumToad)

diff --git a/src/backend/jit/llvm/abi_funcs_arm.h b/src/backend/jit/llvm/abi_funcs_arm.h
new file mode 100644
index 00..d2eaaa82f0
--- /dev/null
+++ b/src/backend/jit/llvm/abi_funcs_arm.h
@@ -0,0 +1,164 @@
+/*
+ * List of ARM ABI functions to forcibly import.
+ *
+ * There is intentionally no #if guard here
+ */
+
+/* LLVM compiler_rt */
+ABISYM(__absvdi2)
+ABISYM(__absvsi2)
+ABISYM(__addvdi3)
+ABISYM(__addvsi3)
+ABISYM(__ashldi3)
+ABISYM(__ashrdi3)
+ABISYM(__bswapdi2)
+ABISYM(__bswapsi2)
+ABISYM(__clzdi2)
+ABISYM(__clzsi2)
+ABISYM(__cmpdi2)
+ABISYM(__ctzdi2)
+ABISYM(__ctzsi2)
+ABISYM(__divdc3)
+ABISYM(__divdi3)
+ABISYM(__divmoddi4)
+ABISYM(__divmodsi4)
+ABISYM(__divsc3)
+ABISYM(__divsi3)
+ABISYM(__divtc3)
+ABISYM(__divxc3)
+ABISYM(__extendhfsf2)
+ABISYM(__ffsdi2)
+ABISYM(__ffssi2)
+ABISYM(__fixdfdi)
+ABISYM(__fixsfdi)
+ABISYM(__fixunsdfdi)
+ABISYM(__fixunsdfsivfp)
+ABISYM(__fixunssfdi)
+ABISYM(__fixunssfsivfp)
+ABISYM(__fixunsxfdi)
+ABISYM(__fixunsxfsi)
+ABISYM(__fixxfdi)
+ABISYM(__floatdidf)
+ABISYM(__floatdisf)
+ABISYM(__floatdixf)
+ABISYM(__floatundidf)
+ABISYM(__floatundisf)
+ABISYM(__floatundixf)
+ABISYM(__floatunsidf)
+ABISYM(__floatunsisf)
+ABISYM(__gcc_personality_v0)
+ABISYM(__gnu_f2h_ieee)
+ABISYM(__gnu_h2f_ieee)
+ABISYM(__lshrdi3)
+ABISYM(__moddi3)
+ABISYM(__modsi3)
+ABISYM(__muldc3)
+ABISYM(__muldi3)
+ABISYM(__mulodi4)
+ABISYM(__mulosi4)
+ABISYM(__mulsc3)
+ABISYM(__multc3)
+ABISYM(__mulvdi3)
+ABISYM(__mulvsi3)
+ABISYM(__mulxc3)
+ABISYM(__negdf2vfp)
+ABISYM(__negdi2)
+ABISYM(__negsf2vfp)
+ABISYM(__negvdi2)
+ABISYM(__negvsi2)
+ABISYM(__paritydi2)
+ABISYM(__paritysi2)
+ABISYM(__popcountdi2)
+ABISYM(__popcountsi2)
+ABISYM(__powidf2)
+ABISYM(__powisf2)
+ABISYM(__powixf2)
+ABISYM(__subvdi3)
+ABISYM(__subvsi3)
+ABISYM(__truncdfhf2)
+ABISYM(__truncsfhf2)
+ABISYM(__ucmpdi2)
+ABISYM(__udivdi3)
+ABISYM(__udivmoddi4)
+ABISYM(__udivmodsi4)
+ABISYM(__udivsi3)
+ABISYM(__umoddi3)
+ABISYM(__umodsi3)
+
+/* ARM EABI */
+ABISYM(__aeabi_atexit)
+ABISYM(__aeabi_cdcmpeq)
+ABISYM(__aeabi_cdcmple)
+ABISYM(__aeabi_cdrcmple)
+ABISYM(__aeabi_cfcmpeq)
+ABISYM(__aeabi_cfcmple)
+ABISYM(__aeabi_cfrcmple)
+ABISYM(__aeabi_d2f)
+ABISYM(__aeabi_d2h)
+ABISYM(__aeabi_d2iz)
+ABISYM(__aeabi_d2lz)
+ABISYM(__aeabi_d2ulz)
+ABISYM(__aeabi_dadd)
+ABISYM(__aeabi_dcmpeq)
+ABISYM(__aeabi_dcmpge)
+ABISYM(__aeabi_dcmpgt)
+ABISYM(__aeabi_dcmple)
+ABISYM(__aeabi_dcmplt)
+ABISYM(__aeabi_dcmpun)
+ABISYM(__aeabi_ddiv)
+ABISYM(__aeabi_dmul)
+ABISYM(__aeabi_dsub)
+ABISYM(__aeabi_f2d)
+ABISYM(__aeabi_f2h)
+ABISYM(__aeabi_f2iz)
+ABISYM(__aeabi_f2lz)
+ABISYM(__aeabi_f2ulz)
+ABISYM(__aeabi_fadd)
+ABISYM(__aeabi_fcmpeq)
+ABISYM(__aeabi_fcmpge)
+ABISYM(__aeabi_fcmpgt)
+ABISYM(__aeabi_fcmple)
+ABISYM(__aeabi_fcmplt)
+ABISYM(__aeabi_fcmpun)
+ABISYM(__aeabi_fdiv)
+ABISYM(__aeabi_fmul)
+ABISYM(__aeabi_fsub)
+ABISYM(__aeabi_h2f)
+ABISYM(__aeabi_i2d)
+ABISYM(__aeabi_i2f)
+ABISYM(__aeabi_idiv)
+ABISYM(__aeabi_idivmod)
+ABISYM(__aeabi_l2d)
+ABISYM(__aeabi_l2f)
+ABISYM(__aeabi_lasr)
+ABISYM(__aeabi_lcmp)
+ABISYM(__aeabi_ldivmod)
+ABISYM(__aeabi_llsl)
+ABISYM(__aeabi_llsr)
+ABISYM(__aeabi_lmul)
+ABISYM(__aeabi_memclr)
+ABISYM(__aeabi_memclr4)
+ABISYM(__aeabi_memclr8)
+ABISYM(__aeabi_memcmp)
+ABISYM(__aeabi_memcmp4)
+ABISYM(__aeabi_memcmp8)
+ABISYM(__aeabi_memcpy)
+ABISYM(__aeabi_memcpy4)
+ABISYM(__aeabi_memcpy8)
+ABISYM(__aeabi_memmove)
+ABISYM(__aeabi_memmove4)
+ABISYM(__aeabi_memmove8)
+ABISYM(__aeabi_memset)
+ABISYM(__aeabi_memset4)
+ABISYM(__aeabi_memset8)
+ABISYM(__aeabi_read_tp)
+ABISYM(__aeabi_ui2d)
+ABISYM(__aeabi_ui2f)
+ABISYM(__aeabi_uidiv)
+ABISYM(__aeabi_uidivmod)
+ABISYM(__aeabi_ul2d)
+ABISYM(__aeabi_ul2f)
+ABISYM(__aeabi_ulcmp)
+ABISYM(__aeabi_uldivmod)
+
+/* end */
diff --git a/src/backend/jit/llvm/llvmjit.c b/src/backend/jit/llvm/llvmjit.c
index 82c4afb701..b450e70c37 100644
--- a/src/backend/jit/llvm/llvmjit.c
+++ b/src/backend/jit/llvm/llvmjit.c
@@ -606,6 +606,22 @@ llvm_compile_module(LLVMJitContext *context)
 			 errhidecontext(true)));
 }
 
+/*
+ * Handle any ABI issues.
+ *
+ * Add more entries here for other archs as needed.
+ */
+static void
+llvm_load_abi(void)
+{
+#if (defined(__arm__) || defined(__arm)) && \
+!(defined(__aarch64__) || defined(__aarch64))
+#define ABISYM(s) { extern void s(void); LLVMAddSymbol(#s, s); }
+#include "abi_funcs_arm.h"
+#undef ABISYM
+#endif
+}
+
 /*
  * Per session initialization.
  */
@@ -664,6 +680,9 @@ llvm_session_initialize(void)
 	LLVMDisposeMessage(features);
 	features = NULL;
 
+	/* force load any ABI symbols needed by the platform */
+	llvm_load_abi();
+
 	/* force symbols in main binary to be loaded */
 	

Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 17:55:46 -0800, Andres Freund wrote:
> Hm, I wonder if that's necessary / whether we can just work around user
> visible breakage at a small cost. I think I'm mostly concerned with two
> allocations for the very common case of small (1-3 entries) lists. We
> could just allocate the first array together with the header, and not
> free that if the list grows beyond that point. That'd mean we'd only do
> separate allocations once they actually amortize over a number of
> allocations.

Btw, if we actually were going to go for always allocating header + data
together (and thus incuring the problems you mention upthread), we ought
to store the members as a FLEXIBLE_ARRAY_MEMBER together with the
list. Probably not worth it, but reducing the number of pointer
indirections for "list" accesses would be quite neat.

Greetings,

Andres Freund



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Amit Kapila
On Mon, Feb 25, 2019 at 10:32 PM Tom Lane  wrote:
>
> Amit Kapila  writes:
> > Avoid creation of the free space map for small heap relations, take 2.
>
> I think this patch still has some issues.
>

I will look into it today and respond back with my findings.  John,
see if you also get time to investigate this.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 18:41:17 -0500, Tom Lane wrote:
> I wrote:
> > Andres Freund  writes:
> >>> 1. This still involves at least two palloc's for every nonempty List,
> >>> because I kept the header and the data array separate.  Perhaps it's
> >>> worth allocating those in one palloc.
> 
> >> Hm, I think if we force external code to audit their code, we better
> >> also do this. This is a significant number of allocations, and I don't
> >> think it'd be good to spread this out over two releases.
> 
> > If we choose to do it, I'd agree with doing both in the same major release
> > cycle, so that extensions see just one breakage.  But I think it'd still
> > best be developed as a follow-on patch.
> 
> By the by ... this idea actively breaks the mechanism I'd proposed for
> preserving foreach's behavior of evaluating the List reference only once.
> If we save a hidden copy of whatever the user says the List reference
> is, and then he assigns a new value to it mid-loop, we're screwed if
> the list header can move.

Hm, I wonder if that's necessary / whether we can just work around user
visible breakage at a small cost. I think I'm mostly concerned with two
allocations for the very common case of small (1-3 entries) lists. We
could just allocate the first array together with the header, and not
free that if the list grows beyond that point. That'd mean we'd only do
separate allocations once they actually amortize over a number of
allocations.

Greetings,

Andres Freund



RE: Timeout parameters

2019-02-25 Thread Nagaura, Ryohei
Hi, kirk-san.

> From: Jamison, Kirk 
> $ git apply ../socket_timeout_v5.patch
> fatal: corrupt patch at line 24
> --> l24: diff --git a/src/interfaces/libpq/fe-connect.c
> --> b/src/interfaces/libpq/fe-connect.c
How about applying it with "patch -p1"?
I have confirmed that my patch can be applied that way, though I have not
confirmed it with the "git apply" command.

> a. Use parse_int_param instead of atoi
> >+conn->socket_timeout = atoi(conn->pgsocket_timeout);
This is my bad. I'll remake it.
Very sorry for the same mistake.


Best regards,
-
Ryohei Nagaura 




crosstab/repivot...any interest?

2019-02-25 Thread Merlin Moncure
On Fri, Jan 25, 2019 at 9:14 PM Morris de Oryx 
wrote:
>
> Hello, I'm not a C coder and can't help, but I love cross-tab/pivot
> tables. They're the best, and just fantastic for preparing data to feed
> into various analysis tools. The tablefunc module is helpful, but a bit
> awkward to use (as I remember it.)
>
> From a user's point of view, a high-performance cross-tab generator would
> be just fantastic.
>
> As I understand it, this is what's involved in a pivot:
>
> 1. Identify rows that should be grouped (consolidated.)
> 2. Distinguish the value that identifies each derived column.
> 3. Distinguish the value that identifies each row-column value.
> 4. Collapse the rows, build the columns, and populate the 'cells' with
> data.
>
> In an ideal world, you would be able to perform different grouping
> operations, such as count, sum, avg, etc.
>
> If there's a way to do this in a system-wide and standards-pointing way,
> so much the better.
>
> Apologies if I'm violating list etiquette by jumping in here. I've been
> lurking on several Postgres lists for a bit and picking up interesting
> details every day. If I've been Unintentionally and Cluelessly Off, I'm
> fine with being told.

No worries, sir! Apologies for the late reply.  I've made some headway on
this item.  Waiting for postgres to implement the SQL standard pivoting
(regardless of whether it implements the cases I need) is not an option for
my personal case. I can't use the SQL approach either, as it's very slow
and imposes some scaling limits that I need to work around in the short run.

My strategy is to borrow [steal] from crosstab_hash and make a new version
called repivot which takes an already pivoted data set and repivots it
against an identified column.   Most of the code can be shared with
tablefunc so ideally this could live as an amendment to that extension.
That might not happen though, so I'm going to package it as a separate
extension (removing the majority of tablefunc that I don't need) and submit
it to this group for consideration.

If we punt, it'll end up as a private extension or live the life of an
Orphan in pgxn.  If there's some interest here, we can consider a new
contrib extension (which I personally rate very unlikely) or recasting as
an extra routine to tablefunc.  Any way we slice it, huge thanks to Joe
Conway for giving us such an awesome function to work with all these
years (not to mention the strategic plr language).  SRF crosstab() is still
somewhat baroque, but it still fills a niche that nothing else implements.

The interface I'm looking at is:
SELECT repivot(
  query TEXT,
  static_attributes INT,  /* number of static attributes that are unchanged
around key; we need this in our usages */
  attribute_query  TEXT); /* query that draws up the pivoted attribute list
*/

The driving query is expected to return 0+ static attributes which are
essentially simply pasted to the output. The next two columns are the key
column and the pivoting column.   So if you had three attributes, the input
set would be:

a1, a2, a3, k1, p, v1...vn

Where the coordinates v and p would exchange.  I need to get this done
quickly and so am trying to avoid more abstracted designs (maybe multi-part
keys should be supported, though...this is a big limitation of crosstab,
albeit with some obnoxious workarounds).

merlin


Re: current_logfiles not following group access and instead follows log_file_mode permissions

2019-02-25 Thread Haribabu Kommi
On Mon, Feb 4, 2019 at 12:16 PM Haribabu Kommi 
wrote:

>
> And regarding current_logfiles permissions, I feel this file should have
> permissions of data directory files as it is present in the data directory
> whether it stores the information of log file, until this file is
> completely
> removed with another approach to store the log file details.
>
> I am not sure whether this has been already discussed or not? How about
> using shared memory to store the log file names? So that we don't need
> of this file?
>

I checked the code to see why current_logfiles is not implemented in
shared memory, and found that the current syslogger doesn't attach to
the postmaster's shared memory. To support storing current_logfiles in
shared memory, the syslogger process would also need to attach to the
shared memory, which seems to be a new infrastructure change.

In case we are not going to change the permissions of the file to group
access mode, and instead stick with log_file_mode, I just tried the
attached patch of moving the current_logfiles file into the
log_directory. The only drawback of this approach is that if the user
changes the log_directory, the current_logfiles file remains in the old
log_directory.
I don't see that as a problem.

comments?

Regards,
Haribabu Kommi
Fujitsu Australia


0001-Move-the-current_logfiles-file-into-log_directory.patch
Description: Binary data


RE: Timeout parameters

2019-02-25 Thread Jamison, Kirk
On Monday, February 25, 2019 1:49 PM (GMT+9), Nagaura, Ryohei wrote:

> Thank you for discussion.
> I made documentation about socket_timeout and reflected Tsunakawa-san's
> comment in the new patch.
> # Mainly only fixing documentation...
> The current documentation of socket_timeout is as follows:
> socket_timeout
> Controls the number of second of client's waiting time for individual socket
> read/write operations. This can be used both as a force global query timeout
> and network problems detector. A value of zero (the default) turns this off.

Thanks for the update.
However, your socket_timeout patch currently does not apply.
$ git apply ../socket_timeout_v5.patch
fatal: corrupt patch at line 24
--> l24: diff --git a/src/interfaces/libpq/fe-connect.c 
b/src/interfaces/libpq/fe-connect.c

>socket_timeout
>
> Controls the number of second of client's waiting time for individual socket 
> read/write
> operations. This can be used both as a force global query timeout and network 
> problems
> detector. A value of zero (the default) turns this off.

Got the doc fix. I wonder if we need to document what effect the parameter
has: terminating the connection. How about:

Controls the number of seconds of client-server communication inactivity
before forcibly closing the connection in order to prevent client from
infinite waiting for individual socket read/write operations. This can
be used both as a force global query timeout and network problems 
detector, i.e. hardware failure and dead connection. A value of zero 
(the default) turns this off.

Well, you may remove the "i.e. hardware failure and dead connection" if
that's not necessary.

I know you've only added the doc recommendation.
But regarding the code, you may also fix other parts:
a. Use parse_int_param instead of atoi
>+  conn->socket_timeout = atoi(conn->pgsocket_timeout);

b. proper spacing after comma
>+  {"socket_timeout", NULL, NULL, NULL,
>+  "Socket-timeout","",10, /* strlen(INT32_MAX) == 10 */
>+  offsetof(struct pg_conn, pgsocket_timeout)},
... 
>+  result = pqSocketCheck(conn,forRead,forWrite,finish_time);

There are probably other parts I missed.

Regards,
Kirk Jamison



Re: Allowing extensions to supply operator-/function-specific info

2019-02-25 Thread Tom Lane
Paul Ramsey  writes:
> On Mon, Feb 25, 2019 at 3:01 PM Tom Lane  wrote:
>> It's whichever one the index column's opclass belongs to.  Basically what
>> you're trying to do here is verify whether the index will support the
>> optimization you want to perform.

> * If I have tbl1.geom
> * and I have built two indexes on it, a btree_geometry_ops and a
> gist_geometry_ops_2d, and
> * and SupportRequestIndexCondition.opfamily returns me the btree family
> * and I look and see, "damn there is no && operator in there"
> * am I SOL, even though an appropriate index does exist?

No.  If there are two indexes matching your function's argument, you'll
get a separate call for each index.  The support function is only
responsible for thinking about one index at a time and seeing if it
can be used.  If more than one can be used, figuring out which
one is better is done by later cost comparisons.

regards, tom lane



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
I wrote:
> Andres Freund  writes:
>>> 1. This still involves at least two palloc's for every nonempty List,
>>> because I kept the header and the data array separate.  Perhaps it's
>>> worth allocating those in one palloc.

>> Hm, I think if we force external code to audit their code, we better
>> also do this. This is a significant number of allocations, and I don't
>> think it'd be good to spread this out over two releases.

> If we choose to do it, I'd agree with doing both in the same major release
> cycle, so that extensions see just one breakage.  But I think it'd still
> best be developed as a follow-on patch.

By the by ... this idea actively breaks the mechanism I'd proposed for
preserving foreach's behavior of evaluating the List reference only once.
If we save a hidden copy of whatever the user says the List reference
is, and then he assigns a new value to it mid-loop, we're screwed if
the list header can move.

Now do you see why I'm a bit afraid of this?  Perhaps it's worth doing,
but it's going to introduce a whole new set of code breakages that are
going to be just as hard to audit for as anything else discussed in
this thread.  (Example: outer function creates a nonempty list, and
passes it down to some child function that appends to the list, and
there's no provision for returning the possibly-modified list header
pointer back up.)  I'm not really convinced that saving one more palloc
per List is worth it.

regards, tom lane



Re: Allowing extensions to supply operator-/function-specific info

2019-02-25 Thread Paul Ramsey
On Mon, Feb 25, 2019 at 3:01 PM Tom Lane  wrote:

> > Looking at the examples, they are making use of the opfamily that
> > comes in SupportRequestIndexCondition.opfamily.
> > That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
> > Here's where my thread of understanding fails to follow. I have, in
> > PostGIS, actually no operator families defined (CREATE OPERATOR
> > FAMILY). I do, however, have quite a few operator classes defined for
> > geometry: 10, actually!
>
> Yes, you do have operator families: there's no such thing as an operator
> class without a containing operator family, and hasn't been for quite
> a long time.  If you write CREATE OPERATOR CLASS without a FAMILY
> clause, the command silently creates an opfamily with the same name you
> specified for the opclass, and links the two together.

OK, starting to understand...

> > Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
> > && operator to indicate the lossy operation I would like to combine
> > with ST_Intersects.
> > Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
> > &&& operator to indicate the lossy operation I would like to combine
> > with ST_Intersects.
>
> Right.  So the hard part here is to figure out whether the OID you're
> handed matches one of these operator families.  As I mentioned (in
> the other thread [1], maybe you didn't see it?) the best short-term
> idea I've got for that is to look up the opfamily by OID (see the
> OPFAMILYOID syscache) and check to see if its name matches one of
> the above.  You might want to verify that the index AM's OID is what
> you expect, too, just for a little extra safety.

I read it, I just didn't entirely understand it. I think maybe I do
know? I'm reading and re-reading everything and trying to build a
mental model that makes sense :)

Back to SupportRequestIndexCondition.opfamily though:

> It's whichever one the index column's opclass belongs to.  Basically what
> you're trying to do here is verify whether the index will support the
> optimization you want to perform.

* If I have tbl1.geom
* and I have built two indexes on it, a btree_geometry_ops and a
gist_geometry_ops_2d, and
* and SupportRequestIndexCondition.opfamily returns me the btree family
* and I look and see, "damn there is no && operator in there"
* am I SOL, even though an appropriate index does exist?

> Sure, but that was a pretty lame way of getting the optimization,
> as you well know because you've been fighting its deficiencies for
> so long.

Hrm. :) I will agree to disagree. This is an intellectually
interesting journey, but most of its length is quite far removed from
our proximate goal of adding realistic costs to our functions, and the
code added will be quite a bit harder for folks to follow than what it
replaces.

Reading your code is a pleasure and the comments are great, it's just
a hard slog up for someone who is still going "Node*, hm, how does
that work..."

ATB,

P

> Perhaps at some point we'll have some infrastructure that makes this
> less painful, but it's not happening for v12.
>
> regards, tom lane
>
> [1] https://www.postgresql.org/message-id/22876.1550591...@sss.pgh.pa.us



Re: NOT IN subquery optimization

2019-02-25 Thread David Rowley
On Tue, 26 Feb 2019 at 11:51, Li, Zheng  wrote:
> Resend the patch with a whitespace removed so that "git apply patch" works 
> directly.

I had a quick look at this and it seems to be broken for the empty
table case I mentioned up thread.

Quick example:

Setup:

create table t1 (a int);
create table t2 (a int not null);
insert into t1 values(NULL),(1),(2);

select * from t1 where a not in(select a from t2);

Patched:
 a
---
 1
 2
(2 rows)

Master:
 a
---

 1
 2
(3 rows)

This will be due to the fact that you're adding an "a IS NOT NULL" qual
to the scan of t1:

postgres=# explain select * from t1 where a not in(select a from t2);
QUERY PLAN
--
 Hash Anti Join  (cost=67.38..152.18 rows=1268 width=4)
   Hash Cond: (t1.a = t2.a)
   ->  Seq Scan on t1  (cost=0.00..35.50 rows=2537 width=4)
 Filter: (a IS NOT NULL)
   ->  Hash  (cost=35.50..35.50 rows=2550 width=4)
 ->  Seq Scan on t2  (cost=0.00..35.50 rows=2550 width=4)
(6 rows)

but as I mentioned, you can't do that as t2 might be empty and there's
no way to know that during planning.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Andres Freund  writes:
> Btw, should we remove the ENABLE_LIST_COMPAT stuff independent of this
> discussion? Seems like we could even just do that for 12.

+1.  I took it out in the POC patch, but I see no very good reason
not to do it sooner than that.

>> The most critical thing that we lose by doing this is that when a
>> List is modified, all of its cells may need to move, which breaks
>> a lot of code that assumes it can insert or delete a cell while
>> hanging onto a pointer to a nearby cell.

> We could probably "fix" both this, and the cost of making such
> modifications, by having more of an list-of-arrays style
> representation. When adding/removing middle-of-the-"list" elements, we
> could chop that array into two (by modifying metadata, not freeing), and
> shove the new data into a new array inbetween.  But I don't think that'd
> overall be a win, even if it'd get us out of the silent API breakage
> business.

Yeah, I'm afraid that would still leave us with pretty expensive
primitives.

>> 1. This still involves at least two palloc's for every nonempty List,
>> because I kept the header and the data array separate.  Perhaps it's
>> worth allocating those in one palloc.

> Hm, I think if we force external code to audit their code, we better
> also do this. This is a significant number of allocations, and I don't
> think it'd be good to spread this out over two releases.

If we choose to do it, I'd agree with doing both in the same major release
cycle, so that extensions see just one breakage.  But I think it'd still
best be developed as a follow-on patch.


I had an idea that perhaps is worth considering --- upthread I rejected
the idea of deleting lnext() entirely, but what if we did so?  We could
redefine foreach() to not use it:

#define foreach(cell, l) \
for (int cell##__index = 0; \
 (cell = list_nth_cell(l, cell##__index)) != NULL; \
 cell##__index++)

We'd need to fix list_nth_cell() to return NULL not choke on an index
equal to (or past?) the array end, but that's easy.

I think this would go a very long way towards eliminating the hazards
associated with iterating around a list-modification operation.
On the downside, it's hard to see how to merge it with the other idea
for evaluating the List reference only once, so we'd still have the
hazard that the list ref had better be a stable expression.  But that's
likely to be much easier to audit for than what the POC patch asks
people to do (maybe there's a way to check it mechanically, even?).

Also, any code that does contain explicit use of lnext() is likely
in need of rethinking anyhow, so taking it away would help answer
the objection about making problems easy to identify.

regards, tom lane



Re: Allowing extensions to supply operator-/function-specific info

2019-02-25 Thread Tom Lane
Paul Ramsey  writes:
> So... trying to figure out how to use SupportRequestIndexCondition to
> convert a call to Intersects() in to a call that also has the operator
> && as well.

OK.

> Looking at the examples, they are making use of the opfamily that
> comes in SupportRequestIndexCondition.opfamily.
> That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
> Here's where my thread of understanding fails to follow. I have, in
> PostGIS, actually no operator families defined (CREATE OPERATOR
> FAMILY). I do, however, have quite a few operator classes defined for
> geometry: 10, actually!

Yes, you do have operator families: there's no such thing as an operator
class without a containing operator family, and hasn't been for quite
a long time.  If you write CREATE OPERATOR CLASS without a FAMILY
clause, the command silently creates an opfamily with the same name you
specified for the opclass, and links the two together.

> Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
> && operator to indicate the lossy operation I would like to combine
> with ST_Intersects.
> Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
> &&& operator to indicate the lossy operation I would like to combine
> with ST_Intersects.

Right.  So the hard part here is to figure out whether the OID you're
handed matches one of these operator families.  As I mentioned (in
the other thread [1], maybe you didn't see it?) the best short-term
idea I've got for that is to look up the opfamily by OID (see the
OPFAMILYOID syscache) and check to see if its name matches one of
the above.  You might want to verify that the index AM's OID is what
you expect, too, just for a little extra safety.

> A given call to ST_Intersects(tbl1.geom, tbl2.geom) could have two
> indexes to apply the problem, but
> SupportRequestIndexCondition.opfamily will, I assume, only be exposing
> one to me: which one?

It's whichever one the index column's opclass belongs to.  Basically what
you're trying to do here is verify whether the index will support the
optimization you want to perform.

> Anyways, to true up how hard this is, I've been carefully reading the
> implementations for network address types and LIKE, and I'm still
> barely at the WTF stage. The selectivity and the number of rows
> support modes I could do. The SupportRequestIndexCondition is based on
> a detailed knowledge of what an operator family is, an operator class
> is, how those relate to types... I think I can get there, but it's
> going to be far from easy (for me).

You definitely want to read this:

https://www.postgresql.org/docs/devel/xindex.html#XINDEX-OPFAMILY

and maybe some of the surrounding sections.

> And it'll put a pretty high bar in
> front of anyone who previously just whacked an inline SQL function in
> place to get an index assisted function up and running.

Sure, but that was a pretty lame way of getting the optimization,
as you well know because you've been fighting its deficiencies for
so long.

Perhaps at some point we'll have some infrastructure that makes this
less painful, but it's not happening for v12.

regards, tom lane

[1] https://www.postgresql.org/message-id/22876.1550591...@sss.pgh.pa.us



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-23 21:24:40 -0500, Tom Lane wrote:
> For quite some years now there's been dissatisfaction with our List
> data structure implementation.  Because it separately palloc's each
> list cell, it chews up lots of memory, and it's none too cache-friendly
> because the cells aren't necessarily adjacent.

Indeed. Might be worthwhile to note that linked lists, even if stored in
adjacent memory, are *still* not very friendly for out-of-order CPUs, as
they introduce a dependency between fetching the pointer to the next
element and processing the next element. Whereas for arrays etc. CPUs
start executing instructions for the next element before finishing the
last one.
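
A toy C illustration of that dependency chain (simplified structs standing
in for pg_list.h, not PostgreSQL code; the timing gap itself only shows up
under a profiler):

```c
#include <assert.h>
#include <stddef.h>

/* Toy cell: each iteration's load of c->next depends on the previous one. */
typedef struct Cell
{
	int			value;
	struct Cell *next;
} Cell;

static long
sum_list(const Cell *head)
{
	long		sum = 0;

	/* serialized: the address of element i+1 is unknown until element i
	 * has been fetched from memory */
	for (const Cell *c = head; c != NULL; c = c->next)
		sum += c->value;
	return sum;
}

static long
sum_array(const int *vals, int n)
{
	long		sum = 0;

	/* independent: the address of vals[i+1] is computable before vals[i]
	 * arrives, so an out-of-order CPU can overlap the loads */
	for (int i = 0; i < n; i++)
		sum += vals[i];
	return sum;
}
```

Both loops compute the same result; only the list version chains every
load on the one before it.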


> Every time this has come up, I've opined that the right fix is to jack
> up the List API and drive a new implementation underneath, as we did
> once before (cf commit d0b4399d81).

Btw, should we remove the ENABLE_LIST_COMPAT stuff independent of this
discussion? Seems like we could even just do that for 12.


> The big-picture performance change is that this makes list_nth()
> a cheap O(1) operation, while lappend() is still pretty cheap;
> on the downside, lcons() becomes O(N), as does insertion or deletion
> in the middle of a list.  But we don't use lcons() very much
> (and maybe a lot of the existing usages aren't really necessary?),
> while insertion/deletion in long lists is a vanishingly infrequent
> operation.  Meanwhile, making list_nth() cheap is a *huge* win.

Right.
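
That cost profile can be sketched with a toy array-backed list (hypothetical
names, ints instead of cells; the real patch reworks pg_list.h itself):

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy array-backed list with the cost profile described above. */
typedef struct IntList
{
	int			length;
	int			capacity;
	int		   *elements;
} IntList;

static IntList *
il_new(void)
{
	IntList    *l = malloc(sizeof(IntList));

	l->length = 0;
	l->capacity = 4;
	l->elements = malloc(l->capacity * sizeof(int));
	return l;
}

static void
il_grow(IntList *l)
{
	if (l->length == l->capacity)
	{
		l->capacity *= 2;
		l->elements = realloc(l->elements, l->capacity * sizeof(int));
	}
}

/* O(1): just index into the array */
static int
il_nth(const IntList *l, int n)
{
	return l->elements[n];
}

/* amortized O(1): append at the end, doubling the array on overflow */
static void
il_append(IntList *l, int v)
{
	il_grow(l);
	l->elements[l->length++] = v;
}

/* O(N): every existing element must shift right by one */
static void
il_cons(IntList *l, int v)
{
	il_grow(l);
	memmove(&l->elements[1], &l->elements[0], l->length * sizeof(int));
	l->elements[0] = v;
	l->length++;
}
```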


> The most critical thing that we lose by doing this is that when a
> List is modified, all of its cells may need to move, which breaks
> a lot of code that assumes it can insert or delete a cell while
> hanging onto a pointer to a nearby cell.

We could probably "fix" both this, and the cost of making such
modifications, by having more of an list-of-arrays style
representation. When adding/removing middle-of-the-"list" elements, we
could chop that array into two (by modifying metadata, not freeing), and
shove the new data into a new array in between.  But I don't think that'd
overall be a win, even if it'd get us out of the silent API breakage
business.
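
That chunked idea can be sketched like this (toy fixed-size chunks and
assumed names, ignoring the extra metadata bookkeeping a real
implementation would need): a middle insertion that hits a full chunk
splits the chunk rather than shifting the rest of the list, so cells in
untouched chunks never move.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_CAP 4				/* tiny on purpose, to force splits */

/* One chunk of a hypothetical list-of-arrays representation. */
typedef struct Chunk
{
	int			len;
	int			vals[CHUNK_CAP];
	struct Chunk *next;
} Chunk;

static Chunk *
chunk_new(void)
{
	return calloc(1, sizeof(Chunk));
}

/* Insert v before logical position pos (0-based). A full chunk is split
 * in two instead of shifting every later element of the whole list. */
static void
chunked_insert(Chunk *head, int pos, int v)
{
	Chunk	   *c = head;

	while (pos > c->len && c->next != NULL)
	{
		pos -= c->len;
		c = c->next;
	}
	if (pos > c->len)
		pos = c->len;			/* out of range: treat as append */

	if (c->len < CHUNK_CAP)
	{
		/* room here: shift at most CHUNK_CAP-1 cells, not the whole list */
		memmove(&c->vals[pos + 1], &c->vals[pos],
				(c->len - pos) * sizeof(int));
		c->vals[pos] = v;
		c->len++;
	}
	else
	{
		/* full: move the tail cells into a freshly linked chunk */
		Chunk	   *tail = chunk_new();

		tail->len = c->len - pos;
		memcpy(tail->vals, &c->vals[pos], tail->len * sizeof(int));
		tail->next = c->next;
		c->next = tail;
		if (pos < CHUNK_CAP)
		{
			c->vals[pos] = v;
			c->len = pos + 1;
		}
		else
		{
			tail->vals[0] = v;	/* pos == CHUNK_CAP: v starts the tail */
			tail->len = 1;
		}
	}
}

/* O(number of chunks) walk, then O(1) within the chunk. */
static int
chunked_nth(const Chunk *head, int n)
{
	const Chunk *c = head;

	while (n >= c->len)
	{
		n -= c->len;
		c = c->next;
	}
	return c->vals[n];
}
```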


> Another annoying consequence of lnext() needing a List pointer is that
> the List arguments of foreach() and related macros are now evaluated
> each time through the loop.  I could only find half a dozen places
> where that was actually unsafe (all fixed in the draft patch), but
> it's still bothersome.  I experimented with ways to avoid that, but
> the only way I could get it to work was to define foreach like this:

Yea, that problem is why the ilist stuff has the iterator
datastructure. That was before we allowed variable defs in for
though...


> #define foreach(cell, l) \
> 	for (const List *cell##__foreach = foreach_setup(l, &cell); \
> 		 cell != NULL; \
> 		 cell = lnext(cell##__foreach, cell))
> 
> static inline const List *
> foreach_setup(const List *l, ListCell **cell)
> {
> *cell = list_head(l);
> return l;
> }
> 
> That works, but there are two problems.  The lesser one is that a
> not-very-bright compiler might think that the "cell" variable has to
> be forced into memory, because its address is taken.

I don't think that's a huge problem. I don't think there are any
platforms we really care about with such compilers? And I can't imagine
that being the only performance problem on such a platform.


> The bigger one is
> that this coding forces the "cell" variable to be exactly "ListCell *";
> you can't add const or volatile qualifiers to it without getting
> compiler warnings.  There are actually more places that'd be affected
> by that than by the need to avoid multiple evaluations.  I don't think
> the const markings would be a big deal to lose, and the two places in
> do_autovacuum that need "volatile" (because of a nearby PG_TRY) could
> be rewritten to not use foreach.  So I'm tempted to do that, but it's
> not very pretty.

Hm, that's a bit ugly, indeed.


> Maybe somebody can think of a better solution?

We could cast away const & volatile on most compilers, and do better on
gcc & clang, I guess. We could use typeof() and similar games to add the
relevant qualifiers. Or alternatively, also optionally of course, use
C11 _Generic trickery for defining the type.  But that seems
unsatisfying (but safe, I think).
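
For the typeof() route, here is a sketch (toy list types, GCC/Clang
`__typeof__` only, not the real pg_list.h) of how the macro could keep the
caller's qualifiers while still evaluating the list expression once:

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for the real pg_list.h types. */
typedef struct ListCell
{
	int			value;
} ListCell;

typedef struct List
{
	int			length;
	ListCell   *cells;
} List;

static inline ListCell *
list_head(const List *l)
{
	return (l != NULL && l->length > 0) ? l->cells : NULL;
}

static inline ListCell *
list_next(const List *l, const ListCell *c)
{
	ListCell   *n = (ListCell *) c + 1;

	return (n < l->cells + l->length) ? n : NULL;
}

/*
 * The outer loop runs at most once and captures the list expression in a
 * hidden variable, so it is evaluated a single time; __typeof__(cell)
 * makes the assignments compile even when the caller declared the cell
 * variable with an added qualifier such as const.
 */
#define foreach_q(cell, lexpr) \
	for (const List *cell##__l = (lexpr); cell##__l != NULL; cell##__l = NULL) \
		for ((cell) = (__typeof__(cell)) list_head(cell##__l); \
			 (cell) != NULL; \
			 (cell) = (__typeof__(cell)) list_next(cell##__l, (cell)))
```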



> There's a lot of potential follow-on work that I've not touched yet:
> 
> 1. This still involves at least two palloc's for every nonempty List,
> because I kept the header and the data array separate.  Perhaps it's
> worth allocating those in one palloc.  However, right now there's an
> assumption that the header of a nonempty List doesn't move when you
> change its contents; that's embedded in the API of the lappend_cell
> functions, and more than likely there's code that depends on that
> silently because it doesn't bother to store the results of other
> List functions back into the appropriate pointer.  So if we do that
> at all I think it'd be better tackled in a 

Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> > * Andres Freund (and...@anarazel.de) wrote:
> > > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > > You say above that the new interface is unquestionably an improvement
> > > 
> > > FWIW, I think we didn't design it even remotely as well as we ought to
> > > have. It was both a mistake to not offer a version of non-exclusive
> > > backups that works with a client connection (for integration with the
> > > TSMs of this world), and it was a mistake not to offer a commandline
> > > tool that does the non-exclusive pg/start backup.
> > 
> > I'm not really following- did you mean to say "without a client
> > connection" above?
> 
> Yes.

Ah, alright, that makes more sense then.

You might bring that up with the individual who wrote it.

> > We could still add some kind of commandline tool to help with the
> > non-exclusive pg start/stop backup
> 
> That doesn't help, as long as the connection requirement is there. As
> explained elsewhere in this thread, a lot of larger external backup
> infrastructure just gives you a start/stop hook, and that makes using a
> persistent connection hard and fragile.

I tend to agree w/ Magnus in that I'm not sure that this is actually
as common these days as claimed.  That said, I'm not against someone
trying to implement something that doesn't require a connection, though
I understand from discussion with David and his comments elsewhere on
this thread that it isn't that simple a thing to do- and people would
have to migrate to whatever it is anyway.

> > but I'm not really sure exactly what that would look like and I
> > honestly don't have terribly much interest in spending resources on
> > implementing it since it would lack a great many features that we'd
> > then end up wanting to add on... and then we end up backing into
> > building yet another backup tool.
> 
> Meh. You are proposing to remove a feature (even if a dangerous
> one). Requiring some minimal compat work to make that more palatable
> isn't crazy.  Nor does using backrest or whatnot do *ANYTHING* about the
> users of large company wide backup tools.

Yes, I'm supporting the proposal to remove a dangerous feature.  I
really don't think much more needs to be said than that.  If the feature
wasn't dangerous then I wouldn't be arguing for its removal.  I do feel
bad that pgbackrest hasn't got a solution already to this, but we've had
other priorities.  I do think we'll get there, eventually, but not any
time soon.  Being told that I have to implement another feature so that
we can remove the dangerous one is something I don't agree with.

> > > > I really, honestly, believe what we need to *stop* doing is trying to
> > > > drag along support for old/deprecated interfaces after we've introduced
> > > > new ones, thus avoiding these arguments and the time spent on them, and
> > > > the time spent dealing with implementing and documenting new APIs around
> > > > old ones.  The only thing it seems to unquestionably do is make us argue
> > > > about when we are going to *finally* rip out the old/deprecated API (or
> > > > interface, or whatever you want to call it).
> > > 
> > > While I agree that we should remove non-exclusive backups, I VEHEMENTLY
> > > oppose the unconditionality of the above statement. Yes, it's annoying
> > > to keep interfaces around, but unnecessarily inflicting pain to everyone
> > > also is annoying.
> > 
> > Alright, then how about we provide a bit of help for everyone who's got
> > a system built around recovery.conf today, instead of just completely
> > ripping that out?
> 
> Oh, comeon. Obviously we're sometimes going to have to make breaking
> changes. But you're essentially arguing that we shouldn't provide compat
> where we can come up with a reasonable way to provide backward
> compatibility. And I think that's crazy and will hurt postgres.

This was discussed, *extensively*, previously on this list with a whole
lot of proposals trying to come up with easy ways to "fix" exclusive
backups and I don't really think that this thread has proposed anything
novel in that regard.  Basically, I just don't think it's that easy.
Perhaps I'm wrong, in which case, great, I'd be happy to review a simple
patch that solves it.

> > > That's not to say we shouldn't be better at announcing and then
> > > following through on the deprecation of old interfaces.
> > 
> > We announce things have changed every year, and people have five years
> > from that point, during which we'll continue to support them and fix
> > bugs and deal with security issues, to make whatever changes they need
> > to for the new interface.
> > 
> > The argument seems to be that people want new features but they don't
> > want to have to do any work to get those new features.
> 
> I mean, duh? We can't make that happen all the time, but I don't
> understand why it's surprising to have that as *one* goal 

Re: Allowing extensions to supply operator-/function-specific info

2019-02-25 Thread Paul Ramsey
On Mon, Jan 28, 2019 at 9:51 AM Tom Lane  wrote:
> is people like PostGIS, who already cleared that bar.  I hope that
> we'll soon have a bunch of examples, like those in the 0004 patch,
> that people can look at to see how to do things in this area.  I see
> no reason to believe it'll be all that much harder than anything
> else extension authors have to do.

It's a little harder :)

So... trying to figure out how to use SupportRequestIndexCondition to
convert a call to Intersects() into a call that also includes the &&
operator.

Looking at the examples, they are making use of the opfamily that
comes in SupportRequestIndexCondition.opfamily.

That opfamily Oid is the first one in the IndexOptInfo.opfamily array.
Here's where my thread of understanding fails to follow. I have, in
PostGIS, actually no operator families defined (CREATE OPERATOR
FAMILY). I do, however, have quite a few operator classes defined for
geometry: 10, actually!

 btree_geometry_ops
 hash_geometry_ops
 gist_geometry_ops_2d
 gist_geometry_ops_nd
 brin_geometry_inclusion_ops_2d
 brin_geometry_inclusion_ops_3d
 brin_geometry_inclusion_ops_4d
 spgist_geometry_ops_2d
 spgist_geometry_ops_nd
 spgist_geometry_ops_nd

Some of those are not useful to me (btree, hash) for sure.
Some of them (gist_geometry_ops_2d, spgist_geometry_ops_2d ) use the
&& operator to indicate the lossy operation I would like to combine
with ST_Intersects.
Some of them (gist_geometry_ops_nd, spgist_geometry_ops_nd) use the
&&& operator to indicate the lossy operation I would like to combine
with ST_Intersects.

A given call to ST_Intersects(tbl1.geom, tbl2.geom) could have two
indexes to apply to the problem, but
SupportRequestIndexCondition.opfamily will, I assume, only be exposing
one to me: which one?

Anyways, to true up how hard this is, I've been carefully reading the
implementations for network address types and LIKE, and I'm still
barely at the WTF stage. The selectivity and the number of rows
support modes I could do. The SupportRequestIndexCondition is based on
a detailed knowledge of what an operator family is, an operator class
is, how those relate to types... I think I can get there, but it's
going to be far from easy (for me). And it'll put a pretty high bar in
front of anyone who previously just whacked an inline SQL function in
place to get an index assisted function up and running.

P.



Re: Auxiliary Processes and MyAuxProc

2019-02-25 Thread Mike Palmiotto
On Mon, Feb 25, 2019 at 1:41 PM Mike Palmiotto
 wrote:
>
> 
> >
> > If memory serves, StartChildProcess already was an attempt to unify
> > the treatment of postmaster children.  It's possible that another
> > round of unification would be productive, but I think you'll find
> > that there are random small differences in requirements that'd
> > make it messy.
>
> It kind of seemed like it, but I noticed the small differences in
> requirements, which made me a bit hesitant. I'll go ahead and see what
> I can do and submit the patch for consideration.

I'm considering changing StartChildProcess to take a struct with data
for forking/execing each different process. Each different backend
type would build up the struct and then pass it on to
StartChildProcess, which would handle each separately. This would
ensure that the fork type is set prior to InitPostmasterChild and
would provide us with the information necessary to do what we need in
the InitPostmasterChild_hook.

Attached is a patch to fork_process.h which shows roughly what I'm
thinking. Does this seem somewhat sane as a first step?




--
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com
diff --git a/src/include/postmaster/fork_process.h b/src/include/postmaster/fork_process.h
index eb3c8ec974..6cf0bfa2fe 100644
--- a/src/include/postmaster/fork_process.h
+++ b/src/include/postmaster/fork_process.h
@@ -12,6 +12,29 @@
 #ifndef FORK_PROCESS_H
 #define FORK_PROCESS_H
 
+typedef enum
+{
+	CheckerProcess = 0,
+	BootstrapProcess,
+	StartupProcess,
+	BgWriterProcess,
+	CheckpointerProcess,
+	WalWriterProcess,
+	WalReceiverProcess,
+	AutoVacuumLauncherProcess,
+	AutoVacuumWorkerProcess,
+	SysloggerProcess,
+	BgWorkerProcess,
+
+	NUMFORKPROCTYPES			/* Must be last! */
+} ForkProcType;
+
+typedef struct ForkProcData
+{
+	ForkProcType	type;
+	char		   *fork_av[10];
+} ForkProcData;
+
 extern pid_t fork_process(void);
 
 #endif			/* FORK_PROCESS_H */


Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

On 2/25/19 11:59 PM, Christophe Pettus wrote:
>> On Feb 25, 2019, at 13:38, Andres Freund  wrote:
>> 
>> I think you might be right about this specific issue. But to me it
>> sounds like you also don't appreciate that development resources are
>> really constrained too, and providing endless backward compatibility for
>> everything is going to use both resources directly, and indirectly by
>> making the whole system more complex.
> 
> One of the things I've tried not to do in this discussion is turn it into a
> policy debate about backwards compatibility in general, rather than this very
> specific feature.  Of course, there's a cost to keeping around features, and I
> fully appreciate that.  In this discussion, the code complexity didn't seem to
> be the fundamental driver behind removing the feature; the relative safety of
> it was, along with maintaining the documentation.

Code complexity is very much an issue.  Since the two methods are merged
with a ton of conditionals, working on one without affecting the other
is a real chore, especially since exclusive backups have *zero* tests.
Personally I'm not interested in writing those tests so I try not to
touch the code either.

> I jumped in because there seemed to be an argument going on that all of the
> cool kids will have moved off the old interface and there was essentially no
> cost to removing it in v12 or v13, and that didn't correspond to my experience.

I don't think anyone believes there will be zero cost, but the issues
with the exclusive method seem to outweigh the benefits, at least in my
experience.


--
-David
da...@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Andres Freund (and...@anarazel.de) wrote:
> On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> > 
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> > 
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
> 
> FWIW, if you weren't selling backrest quite so hard everywhere backups
are mentioned, I'd find this thread a lot more convincing.

I push pgbackrest, just like I push PG, because I view it as the best
open source tool out there for the job, well architected, designed from
the ground-up to do what a backup tool needs to, and it has a strong
community around it with people contributing back, all openly and under
a permissive license which allows anyone to use it, hack on it, and do
what they want with it- basically, I see it as the PG of backup tools
for PG.  What PG is to relational databases, pgbackrest is to PG backup
tools, imv.

As David mentioned already, this really doesn't change things for
pgbackrest one way or the other- we don't use the old API and haven't
since the non-exclusive API was available.  In addition, pgbackrest
hasn't got a solution for the "we just want to take a snapshot" or "we
just want to use start/stop hooks".  If I was really trying to push
people to pgbackrest because of the flaws in the exclusive backup mode,
I'd surely develop answers to those issues first and make them part of
pgbackrest in short order, and then start yelling from the tops of
buildings about how broken the exclusive backup API is and about how
I've got an easy solution for everyone to move to, maybe do a blog post
on planet.  I'm not doing that though, instead I'm discussing the
issues here, with a tiny fraction of our community, and pushing for us
to agree to get rid of a dangerous API in favor of an API that anyone
can use w/o pgbackrest, if they want to.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: POC: converting Lists into arrays

2019-02-25 Thread Peter Geoghegan
On Mon, Feb 25, 2019 at 1:51 PM Andres Freund  wrote:
> Those could trivially support distinguishing at least between lists
> containing pointer, int, oid, or node. But even optionally doing more
> than that would be fairly easy. It's not as though those modules don't
> currently know the types of elements they're dealing with?
>
>
> > If you add support for a new datatype, where does that leave
> > stored rules?
>
> We don't maintain stored rules across major versions (they break due to
> a lot of changes), so I don't quite understand that problem.

The point is that the implicit need to have support for serializing
and deserializing everything is something that constrains the design,
and must also constrain the design of any successor data structure.
The contents of pg_list.[ch] are not why it's a PITA to add support
for a new datatype. Also, most of the time the Lists are lists of
nodes, which is essentially an abstract base type for heterogeneous
types anyway. I don't really get what you mean about type safety,
because you haven't spelled it out in a way that acknowledges all of
this.

-- 
Peter Geoghegan



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Christophe Pettus



> On Feb 25, 2019, at 13:38, Andres Freund  wrote:
> 
> I think you might be right about this specific issue. But to me it
> sounds like you also don't appreciate that development resources are
> really constrained too, and providing endless backward compatibility for
> everything is going to use both resources directly, and indirectly by
> making the whole system more complex.

One of the things I've tried not to do in this discussion is turn it into a 
policy debate about backwards compatibility in general, rather than this very 
specific feature.  Of course, there's a cost to keeping around features, and I 
fully appreciate that.  In this discussion, the code complexity didn't seem to 
be the fundamental driver behind removing the feature; the relative safety of 
it was, along with maintaining the documentation.

I jumped in because there seemed to be an argument going on that all of the 
cool kids will have moved off the old interface and there was essentially no 
cost to removing it in v12 or v13, and that didn't correspond to my experience.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

On 2/25/19 11:38 PM, Andres Freund wrote:
> Hi,
> 
> On 2019-02-25 09:24:32 -0800, Christophe Pettus wrote:
>> But the resistance to major version upgrades is *huge*, and I'm
>> strongly biased against anything that will make that harder.  I'm not
>> sure I'm communicating how big a problem telling many large
>> installations, "If you move to v12/13/etc., you will have to change
>> your backup system" is going to be.
> 
> I think you might be right about this specific issue. But to me it
> sounds like you also don't appreciate that development resources are
> really constrained too, and providing endless backward compatibility for
> everything is going to use both resources directly, and indirectly by
> making the whole system more complex.  I've been on your side of this
> fight a couple times (and largely lost), but I think it's important to
> appreciate that it's all a balancing act, and we all have valid reasons to
> keep the balance between development pace / ease of use / ease of
> upgrade the way we argue for them. Making upgrades hard is going to reduce
> adoption, but so is lack of feature development and too many creaky &
> redundant & easy-to-misuse interfaces.

+1.

And those (like me) who are motivated to work in this area are put off
by the byzantine code and how easy it is to break things -- in large
part because of the *complete* absence of tests for exclusive mode.

I did a round of doc updates for non/exclusive backups a while back and
it was excruciating to measure and document the subtle differences in a
way that a user might possibly understand.  I'm honestly not motivated
to come back to it until we remove the deprecated interfaces.


--
-David
da...@pgmasters.net



Re: POC: converting Lists into arrays

2019-02-25 Thread Peter Geoghegan
On Mon, Feb 25, 2019 at 1:31 PM Andres Freund  wrote:
> > Andres said that he doesn't like the pg_list.h API. It's not pretty,
> > but is it really that bad?
>
> Yes. The function names alone confound anybody new to postgres, we tend
> to forget that after a few years. A lot of the function return types are
> basically unpredictable without reading the code, the number of builtin
> types is pretty restrictive, and there's no type safety around the choice
> of what's actually stored.

But a lot of those restrictions are a consequence of needing what
amount to support functions in places as distant from pg_list.h as
pg_stat_statements.c, or the parser, or outfuncs.c. I'm not saying
that we couldn't do better here, but the design is constrained by
this. If you add support for a new datatype, where does that leave
stored rules? Seems ticklish to me, at the very least.

-- 
Peter Geoghegan



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 13:41:48 -0800, Peter Geoghegan wrote:
> On Mon, Feb 25, 2019 at 1:31 PM Andres Freund  wrote:
> > > Andres said that he doesn't like the pg_list.h API. It's not pretty,
> > > but is it really that bad?
> >
> > Yes. The function names alone confound anybody new to postgres, we tend
> > to forget that after a few years. A lot of the function return types are
> > basically unpredictable without reading the code, the number of builtin
> > types is pretty restrictive, and there's no type safety around the choice
> > of what's actually stored.
> 
> But a lot of those restrictions are a consequence of needing what
> amount to support functions in places as distant from pg_list.h as
> pg_stat_statements.c, or the parser, or outfuncs.c.

Those could trivially support distinguishing at least between lists
containing pointer, int, oid, or node. But even optionally doing more
than that would be fairly easy. It's not as though those modules don't
currently know the types of elements they're dealing with?


> If you add support for a new datatype, where does that leave
> stored rules?

We don't maintain stored rules across major versions (they break due to
a lot of changes), so I don't quite understand that problem.

Greetings,

Andres Freund



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

Hi Andres,

On 2/25/19 10:57 PM, Andres Freund wrote:
> Hi,
> 
> On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
>>> It will be annoying if after this removal, companies must change their
>>> backup strategy by using specific postgres tools (pgbackrest, barman).
>> 
>> You don't write your own database system using CSV files and shell
>> magic, do you?  I have to say that it continues to boggle my mind that
>> people insist that *this* part of the system has to be able to be
>> implementable using shell scripts.
>> 
>> Folks, these are your backups we're talking about, your last resort if
>> everything else goes up in flames, why do you want to risk that by
>> implementing your own one-off solution, particularly when there's known
>> serious issues using that interface, and you want to just use shell
>> scripts to do it...?
> 
> FWIW, if you weren't selling backrest quite so hard everywhere backups
> are mentioned, I'd find this thread a lot more convincing.

pgBackRest has not used exclusive backups since the new API was
introduced in 9.6 so this is not an issue for our users.

Over time we have contributed back to Postgres in areas we thought could
be improved based on our work on the pgBackRest project: 6ad8ac60,
9fe3c644, 017e4f25, 78874531, 449338cc, 98267ee8, 8694cc96, 920a5e50,
c37b3d08, 5fc1670b, b981df4c.  This does not include the various backup
related patches that we have reviewed.

If promoting pgBackRest were our primary concern then it would be in our
interest to allow Postgres exclusive backups to stay broken and
pg_basebackup to be as primitive as possible.

Our objections to exclusive backups stem from a desire to do the right
thing, as we see it, and because we have seen first hand all the ways
that backups can go wrong.


--
-David
da...@pgmasters.net



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Magnus Hagander
On Mon, Feb 25, 2019 at 10:14 PM Andres Freund  wrote:

> On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> > Greetings,
> >
> > * Andres Freund (and...@anarazel.de) wrote:
> > > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > > You say above that the new interface is unquestionably an improvement
> > >
> > > FWIW, I think we didn't design it even remotely as well as we ought to
> > > have. It was both a mistake to not offer a version of non-exclusive
> > > backups that works with a client connection (for integration with the
> > > TSMs of this world), and it was a mistake not to offer a commandline
> > > tool that does the non-exclusive pg/start backup.
> >
> > I'm not really following- did you mean to say "without a client
> > connection" above?
>
> Yes.
>
>
> > We could still add some kind of commandline tool to help with the
> > non-exclusive pg start/stop backup
>
> That doesn't help, as long as the connection requirement is there. As
> explained elsewhere in this thread, a lot of larger external backup
> infrastructure just gives you a start/stop hook, and that makes using a
> persistent connection hard and fragile.
>

In my experience they don't just give you that. They also give you a module
that calls the postgres APIs to do backups.

Some of them are using the old API, some are using the new one. The only
thing that would be required there would be to upgrade the module to one
that supports the 9.6 APIs. (And you have to upgrade those modules whenever
you upgrade oracle or mssql or whatever else you have, so that's not
something those organisations are unused to)

I agree with the need for a simpler way to drive it in the simple case or
in the, say, filesystem-snapshot case, but I'm not sure the pre/post case is
all that common in this context? Some certainly use it today even if they
don't have to, but that can probably be easily changed to use such a
"simplified cmd tool" as you have suggested elsewhere.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 09:24:32 -0800, Christophe Pettus wrote:
> But the resistance to major version upgrades is *huge*, and I'm
> strongly biased against anything that will make that harder.  I'm not
> sure I'm communicating how big a problem telling many large
> installations, "If you move to v12/13/etc., you will have to change
> your backup system" is going to be.

I think you might be right about this specific issue. But to me it
sounds like you also don't appreciate that development resources are
really constrained too, and providing endless backward compatibility for
everything is going to use both resources directly, and indirectly by
making the whole system more complex.  I've been on your side of this
fight a couple times (and largely lost), but I think it's important to
appreciate that it's all a balancing act, and we all have valid reasons to
keep the balance between development pace / ease of use / ease of
upgrade the way we argue for them. Making upgrades hard is going to reduce
adoption, but so is lack of feature development and too many creaky &
redundant & easy-to-misuse interfaces.

Greetings,

Andres Freund



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 22:35:37 +0100, Tomas Vondra wrote:
> So let's say we want to measure the improvement this patch gives us.
> What would be some reasonable (and corner) cases to benchmark? I do have
> some ideas, but as you've been looking at this in the past, perhaps you
> have something better.

I think queries over tables with a fair number of columns very easily
stress test the list overhead around targetlists - I don't have a
profile lying around, but the overhead of targetlist processing
(ExecTypeFromTL etc) at execution time clearly shows up. Larger
individual expressions can easily show up via eval_const_expressions()
etc and ExecInitExpr().  Both probably can be separated into benchmarks
with prepared statements (ExecTypeFromTL() and ExecInitExpr() will show
up, but planner work obviously not), and non-prepared benchmarks will
stress the planner more.  I suspect there's also a few planner benefits
with large numbers of paths, but I don't quite remember the profiles
well enough to construct a benchmark from memory.

Greetings,

Andres Freund



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

On 2/25/19 11:20 PM, Christophe Pettus wrote:
>> On Feb 25, 2019, at 11:24, Stephen Frost  wrote:
>> Aren't they going to need to make a change for v12 now anyway?
>> 
>> Hopefully they're regularly testing their backups by doing a restore of
>> them, and dropping a recovery.conf into the directory of a v12 system
>> after restore will do exactly nothing and they'll get errors complaining
>> about how they need to provide a restore_command.
> 
> It varies with the installation, but backups are 100% - epsilon automated,
> while restores are very frequently done via runbook.  Changing a runbook is
> institutionally less contentious, and the change is more or less "drop the
> file into mumble/conf.d rather than mumble", which is less of a break.

This is true but is becoming far less true over time.  Automation of
recovery is now very common -- as it should be.  Backups should be
tested regularly and recovery should not be a scary thing that is done
at 2am from an outdated runbook.

The recovery.conf change will have a serious impact when it arrives in PG12.

--
-David
da...@pgmasters.net



Re: POC: converting Lists into arrays

2019-02-25 Thread Tomas Vondra



On 2/25/19 10:03 PM, Andres Freund wrote:
> Hi,
> 
> On 2019-02-25 11:59:06 -0800, Peter Geoghegan wrote:
>> On Mon, Feb 25, 2019 at 10:59 AM Tom Lane  wrote:
>>> Because it involves touching ten times more code (and that's a very
>>> conservative estimate).  Excluding changes in pg_list.h + list.c,
>>> what I posted touches approximately 600 lines of code (520 insertions,
>>> 644 deletions to be exact).  For comparison's sake, there are about
>>> 1800 uses of foreach in the tree, each of which would require at least
>>> 3 changes to replace (the foreach itself, the ListCell variable
>>> declaration, and at least one lfirst() reference in the loop body).
>>
>> If we knew that the list code was the bottleneck in a handful of
>> cases, then I'd come down on Robert's side here. It would then be
>> possible to update the relevant bottlenecked code in an isolated
>> fashion, while getting the lion's share of the benefit. However, I
>> don't think that that's actually possible. The costs of using Lists
>> everywhere are real and measurable, but it's also highly distributed.
>> At least, that's my recollection from previous discussion from several
>> years back. I remember talking about this with Andres in early 2016.
> 
> It's distributed, but not *that* distributed. The largest source of
> "cost" at execution time used to be all-over expression evaluation, but
> that's gone now. That was a lot of places, but it's not outside of reach
> of a targeted change. Now it's targetlist handling, which'd have to
> change together with plan time, where it's a large issue.
> 

So let's say we want to measure the improvement this patch gives us.
What would be some reasonable (and corner) cases to benchmark? I do have
some ideas, but as you've been looking at this in the past, perhaps you
have something better.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 13:21:30 -0800, Peter Geoghegan wrote:
> ISTM that we should separate the question of whether or not the List
> API needs to continue to work without needing to change code in third
> party extensions from the question of whether or not the List API
> needs to be replaced whole cloth. These are not exactly independent
> questions, but they don't necessarily need to be discussed all at
> once.

I'm not convinced by that - if we are happy with the list API, not
duplicating code would be a stronger argument than if we actually are
unhappy.  It makes no sense to go around and replace the same code twice
in a row if we also think other changes should be made (at the same
time, we obviously ought not to do too much at once, otherwise we'll
never get anywhere).


> Andres said that he doesn't like the pg_list.h API. It's not pretty,
> but is it really that bad?

Yes. The function names alone confound anybody new to postgres; we tend
to forget that after a few years. A lot of the function return types are
basically unpredictable without reading the code, the number of builtin
types is pretty restrictive, and there's no type safety around the choice
of what's actually stored.

Greetings,

Andres Freund



[RFC] [PATCH] Flexible "partition pruning" hook

2019-02-25 Thread Mike Palmiotto
Greetings,

Attached is a patch which attempts to solve a few problems:

1) Filtering out partitions flexibly based on the results of an
external function call (supplied by an extension).
2) Filtering out partitions from pg_inherits based on the same function call.
3) Filtering out partitions from a partitioned table BEFORE the partition
is actually opened on-disk.

The name "partitionChildAccess_hook" comes from the fact that the
backend may not have access to a particular partition within the
partitioned table. The idea would be to silently filter out the
partition from queries to the parent table, which means also adjusting
the returned contents of find_inheritance_children based on the
external function.

I am curious how the community feels about these patches and if there
is an alternative approach to solve the above issues (perhaps
another existing hook).

Thanks for your time.

-- 
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com
diff --git a/src/backend/catalog/pg_inherits.c b/src/backend/catalog/pg_inherits.c
index 00f7957b3d..c4711490f1 100644
--- a/src/backend/catalog/pg_inherits.c
+++ b/src/backend/catalog/pg_inherits.c
@@ -25,6 +25,7 @@
 #include "catalog/indexing.h"
 #include "catalog/pg_inherits.h"
 #include "parser/parse_type.h"
+#include "partitioning/partprune.h"
 #include "storage/lmgr.h"
 #include "utils/builtins.h"
 #include "utils/fmgroids.h"
@@ -92,11 +93,16 @@ find_inheritance_children(Oid parentrelId, LOCKMODE lockmode)
 	while ((inheritsTuple = systable_getnext(scan)) != NULL)
 	{
 		inhrelid = ((Form_pg_inherits) GETSTRUCT(inheritsTuple))->inhrelid;
+
+		if (enable_partition_pruning && inhrelid && !partitionChildAccess_hook(inhrelid))
+			continue;
+
 		if (numoids >= maxoids)
 		{
 			maxoids *= 2;
 			oidarr = (Oid *) repalloc(oidarr, maxoids * sizeof(Oid));
 		}
+
 		oidarr[numoids++] = inhrelid;
 	}
 
diff --git a/src/backend/optimizer/path/allpaths.c b/src/backend/optimizer/path/allpaths.c
index 0debac75c6..185bfc10eb 100644
--- a/src/backend/optimizer/path/allpaths.c
+++ b/src/backend/optimizer/path/allpaths.c
@@ -1067,6 +1067,13 @@ set_append_rel_size(PlannerInfo *root, RelOptInfo *rel,
 			continue;
 		}
 
+		if (enable_partition_pruning && !partitionChildAccess_hook(childRTE->relid))
+		{
+			/* Implement custom partition pruning filter */
+			set_dummy_rel_pathlist(childrel);
+			continue;
+		}
+
 		/*
 		 * CE failed, so finish copying/modifying targetlist and join quals.
 		 *
diff --git a/src/include/partitioning/partprune.h b/src/include/partitioning/partprune.h
index 2f75717ffb..968b0ccbe0 100644
--- a/src/include/partitioning/partprune.h
+++ b/src/include/partitioning/partprune.h
@@ -74,6 +74,12 @@ typedef struct PartitionPruneContext
 #define PruneCxtStateIdx(partnatts, step_id, keyno) \
 	((partnatts) * (step_id) + (keyno))
 
+/* Custom partition child access hook. Provides further partition pruning given
+ * child OID.
+ */
+typedef bool (*partitionChildAccess_hook_type) (Oid childOID);
+PGDLLIMPORT partitionChildAccess_hook_type partitionChildAccess_hook;
+
 extern PartitionPruneInfo *make_partition_pruneinfo(struct PlannerInfo *root,
 		 struct RelOptInfo *parentrel,
 		 List *subpaths,


Re: POC: converting Lists into arrays

2019-02-25 Thread Peter Geoghegan
On Mon, Feb 25, 2019 at 1:04 PM Tom Lane  wrote:
> I'm getting slightly annoyed by arguments that reject a live, workable
> patch in favor of pie-in-the-sky proposals.  Both you and Robert seem
> to be advocating solutions that don't exist and would take a very large
> amount of work to create.  If you think differently, let's see a patch.

ISTM that we should separate the question of whether or not the List
API needs to continue to work without needing to change code in third
party extensions from the question of whether or not the List API
needs to be replaced whole cloth. These are not exactly independent
questions, but they don't necessarily need to be discussed all at
once.

Andres said that he doesn't like the pg_list.h API. It's not pretty,
but is it really that bad?

The List implementation claims to be generic, but it's not actually
that generic. It has to work as a Node. It's not quite fair to say
that it's confusing without acknowledging that pg_list.h is special to
query processing.

-- 
Peter Geoghegan



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 16:03:43 -0500, Tom Lane wrote:
> Andres Freund  writes:
> > FWIW, rewrites of this kind can be quite nicely automated using
> > coccinelle [1]. One sometimes needs to do a bit of mop-up with variable
> > names, but otherwise it should be mostly complete.
> 
> I'm getting slightly annoyed by arguments that reject a live, workable
> patch in favor of pie-in-the-sky proposals.  Both you and Robert seem
> to be advocating solutions that don't exist and would take a very large
> amount of work to create.  If you think differently, let's see a patch.

Uhm, we're talking about an invasive proposal from two weekend days
ago. It seems far from crazy to voice our concerns with the silent
breakage you propose. Nor, even if we were obligated to work on an
alternative approach, which we aren't, would it be realistic for us to
have written an alternative implementation within the last few hours,
while also working on our own priorities.

I'm actually quite interested in this topic, both in the sense that it's
great to see work, and in the sense that I'm willing to help with the
effort.

Greetings,

Andres Freund



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Christophe Pettus



> On Feb 25, 2019, at 11:24, Stephen Frost  wrote:
> Aren't they going to need to make a change for v12 now anyway?
> 
> Hopefully they're regularly testing their backups by doing a restore of
> them, and dropping a recovery.conf into the directory of a v12 system
> after restore will do exactly nothing and they'll get errors complaining
> about how they need to provide a restore_command.

It varies with the installation, but backups are 100% - epsilon automated, 
while restores are very frequently done via runbook.  Changing a runbook is 
institutionally less contentious, and the change is more or less "drop the file 
into mumble/conf.d rather than mumble", which is less of a break.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 20:43:01 +0200, David Steele wrote:
> fsync() is the major corruption issue we are facing right now but that
> doesn't mean there aren't other sources of corruption we should be thinking
> about.  I've thought about this one a lot and it scares me.

FWIW, I think this kind of issue practially bites a *LOT* more people
than the fsync() issue.


> I've worked on ways to make it better, but all of them break something and
> involve compromises that are nearly as severe as removing exclusive backups
> entirely.

There've been plenty of proposals to make non-exclusive backups less
painful in this thread, so ...?

Greetings,

Andres Freund



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Andres Freund
On 2019-02-25 08:42:02 -0500, Stephen Frost wrote:
> Greetings,
> 
> * Andres Freund (and...@anarazel.de) wrote:
> > On 2019-02-24 17:19:22 -0500, Stephen Frost wrote:
> > > You say above that the new interface is unquestionably an improvement
> > 
> > FWIW, I think we didn't design it even remotely as well as we ought to
> > have. It was both a mistake to not offer a version of non-exclusive
> > backups that works with a client connection (for integration with the
> > TSMs of this world), and it was a mistake not to offer a commandline
> > tool that does the non-exclusive pg/start backup.
> 
> I'm not really following- did you mean to say "without a client
> connection" above?

Yes.


> We could still add some kind of commandline tool to help with the
> non-exclusive pg start/stop backup

That doesn't help, as long as the connection requirement is there. As
explained elsewhere in this thread, a lot of larger external backup
infrastructure just gives you a start/stop hook, and that makes using a
persistent connection hard and fragile.


> but I'm not really sure exactly what that would look like and I
> honestly don't have terribly much interest in spending resources on
> implementing it since it would lack a great many features that we'd
> then end up wanting to add on... and then we end up backing into
> building yet another backup tool.

Meh. You are proposing to remove a feature (even if a dangerous
one). Requiring some minimal compat work to make that more palatable
isn't crazy.  Nor does using backrest or whatnot do *ANYTHING* about the
users of large company wide backup tools.


> > > I really, honestly, believe what we need to *stop* doing is trying to
> > > drag along support for old/deprecated interfaces after we've introduced
> > > new ones, thus avoiding these arguments and the time spent on them, and
> > > the time spent dealing with implementing and documenting new APIs around
> > > old ones.  The only thing it seems to unquestionably do is make us argue
> > > about when we are going to *finally* rip out the old/deprecated API (or
> > > interface, or whatever you want to call it).
> > 
> > While I agree that we should remove non-exclusive backups, I VEHEMENTLY
> > oppose the unconditionality of the above statement. Yes, it's annoying
> > to keep interfaces around, but unnecessarily inflicting pain to everyone
> > also is annoying.
> 
> Alright, then how about we provide a bit of help for everyone who's got
> a system built around recovery.conf today, instead of just completely
> ripping that out?

Oh, come on. Obviously we're sometimes going to have to make breaking
changes. But you're essentially arguing that we shouldn't provide compat
where we can come up with a reasonable way to provide backward
compatibility. And I think that's crazy and will hurt postgres.


> > That's not to say we shouldn't be better at announcing and then
> > following through on the deprecation of old interfaces.
> 
> We announce things have changed every year, and people have five years
> from that point, during which we'll continue to support them and fix
> bugs and deal with security issues, to make whatever changes they need
> to for the new interface.
> 
> The argument seems to be that people want new features but they don't
> want to have to do any work to get those new features.

I mean, duh? We can't make that happen all the time, but I don't
understand why it's surprising to have that as *one* goal that's in
conflict with other goals?  Your analysis also neglects that a lot of
software will have to work across multiple supported versions of
postgres (at the very least in prep for the migration, but also often
longer) - by having a few versions where both interfaces work, that can
be made *MUCH* less painful.


> When is something too big of a change to make in a given major version
> and we have to, instead, provide backwards compatibility for it?  What
> is that policy?  How many releases do we have to support it for?  How do
> we communicate that to our users, so that they know what they can depend
> on being there across major releases and what they can't?

I literally said that we should have more policy around this.


I'm going to stop here, discussing with you in this thread seems
unproductive.


Greetings,

Andres Freund



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Andres Freund  writes:
> FWIW, rewrites of this kind can be quite nicely automated using
> coccinelle [1]. One sometimes needs to do a bit of mop-up with variable
> names, but otherwise it should be mostly complete.

I'm getting slightly annoyed by arguments that reject a live, workable
patch in favor of pie-in-the-sky proposals.  Both you and Robert seem
to be advocating solutions that don't exist and would take a very large
amount of work to create.  If you think differently, let's see a patch.

regards, tom lane



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 11:59:06 -0800, Peter Geoghegan wrote:
> On Mon, Feb 25, 2019 at 10:59 AM Tom Lane  wrote:
> > Because it involves touching ten times more code (and that's a very
> > conservative estimate).  Excluding changes in pg_list.h + list.c,
> > what I posted touches approximately 600 lines of code (520 insertions,
> > 644 deletions to be exact).  For comparison's sake, there are about
> > 1800 uses of foreach in the tree, each of which would require at least
> > 3 changes to replace (the foreach itself, the ListCell variable
> > declaration, and at least one lfirst() reference in the loop body).
> 
> If we knew that the list code was the bottleneck in a handful of
> cases, then I'd come down on Robert's side here. It would then be
> possible to update the relevant bottlenecked code in an isolated
> fashion, while getting the lion's share of the benefit. However, I
> don't think that that's actually possible. The costs of using Lists
> everywhere are real and measurable, but it's also highly distributed.
> At least, that's my recollection from previous discussion from several
> years back. I remember talking about this with Andres in early 2016.

It's distributed, but not *that* distributed. The largest source of
"cost" at execution time used to be all-over expression evaluation, but
that's gone now. That was a lot of places, but it's not outside of reach
of a targeted change. Now it's targetlist handling, which'd have to
change together with plan time, where it's a large issue.


> > So we've already blown past 5000 lines worth of changes if we want to
> > do it another way ... and that's just *one* component of the List API.
> 
> If you want to stop third party code from compiling, you can find a
> way to do that without really changing your approach. Nothing stops
> you from changing some symbol names minimally, and then making
> corresponding mechanical changes to all of the client code within the
> tree. Third party code authors would then follow this example, with
> the expectation that it's probably going to be a totally mechanical
> process.
> 
> I'm not necessarily advocating that approach. I'm simply pointing out
> that a compromise is quite possible.

That breaks extension code using lists unnecessarily though. And given
that there's semantic change, I don't think it's an entirely mechanical
process.

Greetings,

Andres Freund



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Andres Freund  writes:
> Yea, it'd be more convincing. I'm not convinced it'd be a no-brainer
> though. Unless you've been hacking PG for a fair bit, the pg_list.h APIs
> are very hard to understand / remember. Given this change essentially
> requires auditing all code that uses List, ISTM we'd be much better off
> also changing the API at the same time.  Yes that'll mean there'll be
> vestigial uses nobody bothered to convert in extension etc, but that's
> not that bad.

The pain factor for back-patching is alone a strong reason for not just
randomly replacing the List API with different spellings.

regards, tom lane



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 08:14:16 -0500, Stephen Frost wrote:
> > It will be annoying if after this removal, companies must change their
> > backup strategy by using specific postgres tools (pgbackrest, barman).
> 
> You don't write your own database system using CSV files and shell
> magic, do you?  I have to say that it continues to boggle my mind that
> people insist that *this* part of the system has to be able to be
> implementable using shell scripts.
> 
> Folks, these are your backups we're talking about, your last resort if
> everything else goes up in flames, why do you want to risk that by
> implementing your own one-off solution, particularly when there's known
> serious issues using that interface, and you want to just use shell
> scripts to do it...?

FWIW, if you weren't selling backrest quite so hard everywhere backups
are mentioned, I'd find this thread a lot more convincing.

Greetings,

Andres Freund



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 13:59:36 -0500, Tom Lane wrote:
> Robert Haas  writes:
> > On Mon, Feb 25, 2019 at 1:17 PM Tom Lane  wrote:
> >> I'm not following your point here.  If we change key data structures
> >> (i.e. parsetrees, plan trees, execution trees) to use some other list-ish
> >> API, that *in itself* breaks everything that accesses those data
> >> structures.  The approach I propose here isn't zero-breakage, but it
> >> requires far fewer places to be touched than a complete API replacement
> >> would do.
> 
> > Sure, but if you have third-party code that touches those things,
> > it'll fail to compile.  With your proposed approach, there seems to be
> > a risk that it will compile but not work.
> 
> Failing to compile isn't really a benefit IMO.

It's a huge benefit. It's a lot of effort to look through all source
code for potential breakages. Especially if all list usages, rather than
some planner detail that comparatively few extensions touch, need to be
audited.


> >> I completely disagree.  Your proposal is probably an order of magnitude
> >> more painful than the approach I suggest here, while not really offering
> >> any additional performance benefit (or if you think there would be some,
> >> you haven't explained how).  Strictly on cost/benefit grounds, it isn't
> >> ever going to happen that way.
> 
> > Why would it be ten times more painful, exactly?
> 
> Because it involves touching ten times more code (and that's a very
> conservative estimate).  Excluding changes in pg_list.h + list.c,
> what I posted touches approximately 600 lines of code (520 insertions,
> 644 deletions to be exact).  For comparison's sake, there are about
> 1800 uses of foreach in the tree, each of which would require at least
> 3 changes to replace (the foreach itself, the ListCell variable
> declaration, and at least one lfirst() reference in the loop body).
> So we've already blown past 5000 lines worth of changes if we want to
> do it another way ... and that's just *one* component of the List API.
> Nor is there any reason to think the changes would be any more mechanical
> than what I had to do here.  (No fair saying that I already found the
> trouble spots, either.  A different implementation would likely break
> assumptions in different ways.)

FWIW, rewrites of this kind can be quite nicely automated using
coccinelle [1]. One sometimes needs to do a bit of mop-up with variable
names, but otherwise it should be mostly complete.

Greetings,

Andres Freund

[1] http://coccinelle.lip6.fr/
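For reference, a coccinelle semantic patch for one of the simpler mechanical rewrites in this area might look like the following — a hypothetical sketch to show the shape of the tool, not a patch anyone has actually run against the tree:

```cocci
// Replace list_length(l) == 0 tests with the cheaper NIL comparison.
@@
expression l;
@@
- list_length(l) == 0
+ (l == NIL)
```

spatch applies such a rule across the whole tree in one pass; the mop-up Andres mentions is mostly renaming the temporaries that more complex rules have to introduce.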



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Peter Geoghegan  writes:
> If we knew that the list code was the bottleneck in a handful of
> cases, then I'd come down on Robert's side here. It would then be
> possible to update the relevant bottlenecked code in an isolated
> fashion, while getting the lion's share of the benefit. However, I
> don't think that that's actually possible. The costs of using Lists
> everywhere are real and measurable, but it's also highly distributed.
> At least, that's my recollection from previous discussion from several
> years back. I remember talking about this with Andres in early 2016.

Yeah, that's exactly the point.  If we could replace some small number
of places with a vector-ish data structure and get all/most of the win,
then that would be the way to go about it.  But I'm pretty sure that
we aren't going to make much of an improvement without wholesale
changes.  Nor is it really all that attractive to have some pieces of
the parse/plan/execution tree data structures using one kind of list
while other places use another.  If we're to attack this at all,
I think we should go for a wholesale replacement.

Another way of looking at this is that if we expected that extensions
had a lot of private Lists, unrelated to these core data structures,
it might be worth preserving the List implementation so as not to cause
problems for such usage.  But I doubt that that's the case; or that
any such private lists are more likely to be broken by these API changes
than the core usage is (600 changes in however many lines we've got is
not a lot); or that people would really want to deal with two independent
list implementations with different behaviors just to avoid revisiting
some portions of their code while they're being forced to revisit others
anyway.

> If you want to stop third party code from compiling, you can find a
> way to do that without really changing your approach. Nothing stops
> you from changing some symbol names minimally, and then making
> corresponding mechanical changes to all of the client code within the
> tree. Third party code authors would then follow this example, with
> the expectation that it's probably going to be a totally mechanical
> process.

Yeah, if we expected that only mechanical changes would be needed, and
forcing certain syntax changes would be a good guide to what had to be
done, then this'd be a helpful way to proceed.  The lnext changes in
my proposed patch do indeed work like that.  But the part that's actually
hard is finding/fixing the places where you can't safely use lnext
anymore, and there's nothing very mechanical about that.  (Unless you want
to just forbid lnext altogether, which maybe is a reasonable thing to
contemplate, but I judged it overkill.)

BTW, I neglected to respond to Robert's earlier point that

>>> It is also perhaps worth mentioning that reimplementing a List as an
>>> array means that it is... not a list.  That is a pretty odd state of
>>> affairs

I think the reason we have Lisp-style lists all over the place has little
to do with whether those are ideal data structures, and a lot to do with
the fact that chunks of Postgres were originally written in Lisp, and
in that language using lists for everything is just How It's Done.
I don't have any problem with regarding that nomenclature as being mostly
a legacy thing, which is how I documented it in the proposed revision
to pg_list.h's header comment.

regards, tom lane



Re: [HACKERS] generated columns

2019-02-25 Thread Peter Eisentraut
On 2019-01-15 08:13, Michael Paquier wrote:
>>> Ah, the volatility checking needs some improvements.  I'll address that
>>> in the next patch version.
>>
>> ok
> 
> The same problem happens for stored and virtual columns.

This is fixed in v8.

> It would be nice to add a test with composite types, say something
> like:
> =# create type double_int as (a int, b int);
> CREATE TYPE
> =# create table double_tab (a int,
>   b double_int GENERATED ALWAYS AS ((a * 2, a * 3)) stored,
>   c double_int GENERATED ALWAYS AS ((a * 4, a * 5)) virtual);
> CREATE TABLE
> =# insert into double_tab values (1), (6);
> INSERT 0 2
> =# select * from double_tab ;
>  a |b|c
> ---+-+-
>  1 | (2,3)   | (4,5)
>  6 | (12,18) | (24,30)
> (2 rows)

I added that.

> +   boolhas_generated_stored;
> +   boolhas_generated_virtual;
>  } TupleConstr;
> Could have been more simple to use a char as representation here.

Seems confusing if both apply at the same time.

> Using NULL as generation expression results in a crash when selecting
> the relation created:
> =# CREATE TABLE gtest_err_1 (a int PRIMARY KEY, b int GENERATED ALWAYS
> AS (NULL));
> CREATE TABLE
> =# select * from gtest_err_1;
> server closed the connection unexpectedly

This is fixed in v8.

> +   The view column_column_usage identifies all
> generated
> "column_column_usage" is redundant.  Could it be possible to come up
> with a better name?

This is specified in the SQL standard.

> When testing a bulk INSERT into a table which has a stored generated
> column, memory keeps growing in size linearly, which does not seem
> normal to me.

This was a silly coding error.  It's fixed in v8.

> +/*
> + * Thin wrapper around libpq to obtain server version.
> + */
> +static int
> +libpqrcv_server_version(WalReceiverConn *conn)
> This should be introduced in separate patch in my opinion (needed
> afterwards for logirep).

Yes, it could be committed separately.

> What about the catalog representation of attgenerated?  Would it merge
> with attidentity & co?  Or not?

I think the way it is now seems best.  The other options that were
discussed are also plausible, but the discussions did not reveal
any overwhelming arguments for a change.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: POC: converting Lists into arrays

2019-02-25 Thread Andres Freund
Hi,

On 2019-02-25 13:02:03 -0500, Robert Haas wrote:
> On Sat, Feb 23, 2019 at 9:24 PM Tom Lane  wrote:
> > Every time this has come up, I've opined that the right fix is to jack
> > up the List API and drive a new implementation underneath, as we did
> > once before (cf commit d0b4399d81).  I thought maybe it was about time
> > to provide some evidence for that position, so attached is a POC patch
> > that changes Lists into expansible arrays, while preserving most of
> > their existing API.
> 
> I'm not really convinced that this is the way to go.  The thing is,
> any third-party code people have that uses a List may simply break.
> If you kept the existing List and changed a bunch of existing code to
> use a new Vector implementation, or Thomas's SimpleVector stuff, then
> that wouldn't happen.  The reason why people - or at least me - have
> been reluctant to accept that you can just jack up the API and drive a
> new implementation underneath is that the new implementation will
> involve breaking guarantees on which existing code relies; indeed,
> your email makes it pretty clear that this is the case.  If you could
> replace the existing implementation without breaking any code, that
> would be a no-brainer but there's no real way to do that and get the
> performance benefits you're seeking to obtain.

Yea, it'd be more convincing. I'm not convinced it'd be a no-brainer
though. Unless you've been hacking PG for a fair bit, the pg_list.h APIs
are very hard to understand / remember. Given this change essentially
requires auditing all code that uses List, ISTM we'd be much better off
also changing the API at the same time.  Yes that'll mean there'll be
vestigial uses nobody bothered to convert in extension etc, but that's
not that bad.

Greetings,

Andres Freund



Re: POC: converting Lists into arrays

2019-02-25 Thread Peter Geoghegan
On Mon, Feb 25, 2019 at 10:59 AM Tom Lane  wrote:
> Because it involves touching ten times more code (and that's a very
> conservative estimate).  Excluding changes in pg_list.h + list.c,
> what I posted touches approximately 600 lines of code (520 insertions,
> 644 deletions to be exact).  For comparison's sake, there are about
> 1800 uses of foreach in the tree, each of which would require at least
> 3 changes to replace (the foreach itself, the ListCell variable
> declaration, and at least one lfirst() reference in the loop body).

If we knew that the list code was the bottleneck in a handful of
cases, then I'd come down on Robert's side here. It would then be
possible to update the relevant bottlenecked code in an isolated
fashion, while getting the lion's share of the benefit. However, I
don't think that that's actually possible. The costs of using Lists
everywhere are real and measurable, but it's also highly distributed.
At least, that's my recollection from previous discussion from several
years back. I remember talking about this with Andres in early 2016.

> So we've already blown past 5000 lines worth of changes if we want to
> do it another way ... and that's just *one* component of the List API.

If you want to stop third party code from compiling, you can find a
way to do that without really changing your approach. Nothing stops
you from changing some symbol names minimally, and then making
corresponding mechanical changes to all of the client code within the
tree. Third party code authors would then follow this example, with
the expectation that it's probably going to be a totally mechanical
process.

I'm not necessarily advocating that approach. I'm simply pointing out
that a compromise is quite possible.

> Nor is there any reason to think the changes would be any more mechanical
> than what I had to do here.  (No fair saying that I already found the
> trouble spots, either.  A different implementation would likely break
> assumptions in different ways.)

The idea of making a new vector/array implementation that is a more or
less drop-in replacement for List seems okay to me. C++ has both a
std::list and a std::vector, and they support almost the same
interface. Obviously the situation is different here, since you're
retrofitting a new implementation with different performance
characteristics, rather than implementing both in a green field
situation. But it's not that different.

-- 
Peter Geoghegan



Re: \describe*

2019-02-25 Thread Robert Haas
On Sat, Feb 23, 2019 at 7:19 PM Andres Freund  wrote:
> Sure, but it was late, and we have far more patches than we can deal
> with. Many of them much much older than this.

More importantly, at least in my opinion, is that this is one of those
questions that people tend to have very strong feelings about.  Doing
something at the last minute risks people not feeling that they had an
adequate time to express those feelings before something got shipped.
Not everybody reads this list every day, or tests every new commit as
soon as it goes into the tree.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Bruce Momjian
On Mon, Feb 25, 2019 at 02:24:18PM -0500, Stephen Frost wrote:
> * Bruce Momjian (br...@momjian.us) wrote:
> > On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> > > I don't want to see more users stumbling over the issues with the
> > > exclusive backup interface.  A better interface exists, and has existed
> > > since 9.6.
> > 
> > Do you really think we would be having this discussion if the
> > non-exclusive backup method was unequivocally better?  It is better for
> > some use-cases, and impossible for others.
> 
> Based on Christophe's comment above, anything which required users to
> make a change on upgrade to their backup system would be cause to have
> this discussion, which likely includes most possible solutions to the
> issues with exclusive backup too, unfortunately..

I am not addressing what Christophe said, but rather what you said.  We
clearly are fine with requiring people to update their software for major
releases.  I think if there were a way for old backup methods to work at
all, this would be a much simpler conversation.

> > Also, you can't say it will have no impact for five years on people who
> > do not upgrade.  The impact will be that they will have no new Postgres
> > features for five years.
> 
> I don't think I made the claim that there wouldn't be any impact for
> five years, I said they would continue to have support for five years.
> 
> Also, this is exactly what we tell them for any other breaking change
> (such as removal of recovery.conf).
> 
> > I am not taking a stance on this issue, but making unclear statements
> > isn't helping the discussion.
> 
> It's not my intent to make unclear statements, so I certainly appreciate
> you, and anyone else, pointing out when I do.  I'm happy to clarify.

Good.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* David Steele (da...@pgmasters.net) wrote:
> On 2/25/19 7:50 PM, Fujii Masao wrote:
> >Thought?
> 
> Here's the really obvious bad thing: if users do not update their procedures
> and we ignore backup_label.pending on startup then they will end up with a
> corrupt database because it will not replay from the correct checkpoint.  If
> we error on the presence of backup_label.pending then we are right back to
> where we started.

Right, we definitely can't make a change which would cause non-updated
systems using the same APIs to suddenly end up with corruption.  If we
require that a change be made to the scripts that run the process, then we
must make it something the user is essentially opting into, where
they've told us explicitly that they're ready to use the new interface,
which is how the non-exclusive backup mode was added.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Christophe Pettus (x...@thebuild.com) wrote:
> > On Feb 25, 2019, at 08:55, Stephen Frost  wrote:
> > 
> > I honestly do doubt that they have had the same experiences that I have
> > had
> 
> Well, I guarantee you that no two people on this list have had identical 
> experiences. :)  

Indeed! :)

> I certainly have been bitten by the problems with the current system.  But 
> the resistance to major version upgrades is *huge*, and I'm strongly biased 
> against anything that will make that harder.  I'm not sure I'm communicating 
> how big a problem telling many large installations, "If you move to 
> v12/13/etc., you will have to change your backup system" is going to be.

Aren't they going to need to make a change for v12 now anyway?

Hopefully they're regularly testing their backups by doing a restore of
them, and dropping a recovery.conf into the directory of a v12 system
after restore will do exactly nothing and they'll get errors complaining
about how they need to provide a restore_command.

That's actually what prompted bringing this painful topic up again-
there's a large change being done to how backup/restore with PG is done
(the recovery.conf file is going away), and people who have written any
kind of automation (or even just documented procedures...) around that
will have to update their systems.  At least from my perspective, making
them have to do such an update once, instead of once now and another
time in the future when we remove exclusive backup (or figure out a way
to do it that's safe and update the instructions for how to do it
right...), is the better option.

* Bruce Momjian (br...@momjian.us) wrote:
> On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> > I don't want to see more users stumbling over the issues with the
> > exclusive backup interface.  A better interface exists, and has existed
> > since 9.6.
> 
> Do you really think we would be having this discussion if the
> non-exclusive backup method was unequivocally better?  It is better for
> some use-cases, and impossible for others.

Based on Christophe's comment above, anything which required users to
make a change on upgrade to their backup system would be cause to have
this discussion, which likely includes most possible solutions to the
issues with exclusive backup too, unfortunately..

> Also, you can't say it will have no impact for five years on people who
> do not upgrade.  The impact will be that they will have no new Postgres
> features for five years.

I don't think I made the claim that there wouldn't be any impact for
five years, I said they would continue to have support for five years.

Also, this is exactly what we tell them for any other breaking change
(such as removal of recovery.conf).

> I am not taking a stance on this issue, but making unclear statements
> isn't helping the discussion.

It's not my intent to make unclear statements, so I certainly appreciate
you, and anyone else, pointing out when I do.  I'm happy to clarify.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Robert Haas  writes:
> On Mon, Feb 25, 2019 at 1:17 PM Tom Lane  wrote:
>> I'm not following your point here.  If we change key data structures
>> (i.e. parsetrees, plan trees, execution trees) to use some other list-ish
>> API, that *in itself* breaks everything that accesses those data
>> structures.  The approach I propose here isn't zero-breakage, but it
>> requires far fewer places to be touched than a complete API replacement
>> would do.

> Sure, but if you have third-party code that touches those things,
> it'll fail to compile.  With your proposed approach, there seems to be
> a risk that it will compile but not work.

Failing to compile isn't really a benefit IMO.  Now, if we could avoid
the *semantic* differences (like whether it's safe to hold onto a pointer
into a List while doing FOO on the list), then we'd have something.
The biggest problem with what I'm proposing is that it doesn't always
manage to do that --- but any other implementation is going to break
such assumptions too.  I do not think that forcing cosmetic changes
on people is going to do much to help them revisit possibly-hidden
assumptions like those.  What will help is to provide debugging aids to
flush out such assumptions, which I've endeavored to do in this patch.
And I would say that any competing proposal is going to be a failure
unless it provides at-least-as-effective support for flushing out bugs
in naive updates of existing List-using code.

>> I completely disagree.  Your proposal is probably an order of magnitude
>> more painful than the approach I suggest here, while not really offering
>> any additional performance benefit (or if you think there would be some,
>> you haven't explained how).  Strictly on cost/benefit grounds, it isn't
>> ever going to happen that way.

> Why would it be ten times more painful, exactly?

Because it involves touching ten times more code (and that's a very
conservative estimate).  Excluding changes in pg_list.h + list.c,
what I posted touches approximately 600 lines of code (520 insertions,
644 deletions to be exact).  For comparison's sake, there are about
1800 uses of foreach in the tree, each of which would require at least
3 changes to replace (the foreach itself, the ListCell variable
declaration, and at least one lfirst() reference in the loop body).
So we've already blown past 5000 lines worth of changes if we want to
do it another way ... and that's just *one* component of the List API.
Nor is there any reason to think the changes would be any more mechanical
than what I had to do here.  (No fair saying that I already found the
trouble spots, either.  A different implementation would likely break
assumptions in different ways.)

If I said your proposal involved two orders of magnitude more work,
I might not be far off the mark.

regards, tom lane



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

Hi Christophe,

On 2/25/19 7:24 PM, Christophe Pettus wrote:
>> On Feb 25, 2019, at 08:55, Stephen Frost  wrote:
>>
>> I honestly do doubt that they have had the same experiences that I have
>> had
>
> Well, I guarantee you that no two people on this list have had identical
> experiences. :)  I certainly have been bitten by the problems with the
> current system.  But the resistance to major version upgrades is *huge*,
> and I'm strongly biased against anything that will make that harder.  I'm
> not sure I'm communicating how big a problem telling many large
> installations, "If you move to v12/13/etc., you will have to change your
> backup system" is going to be.


I honestly think you are underestimating how bad this can be.

The prevailing wisdom is that it's unfortunate that these backup_labels
get left around but they can be removed with scripting, so no big deal.
After that the cluster will start.

But -- if you are too aggressive about removing the backup_label and
accidentally do it before a real restore from backup, then you have a
corrupt cluster.  Totally silent, but definitely corrupt.  You'll
probably only see it when you start getting consistency errors from the
indexes, if ever.  Page checksums won't catch it either unless you are
*lucky* enough to have a torn page.

Erroneous scripting of this kind can also affect backups that were made
with the non-exclusive method since the backups look the same.

fsync() is the major corruption issue we are facing right now but that
doesn't mean there aren't other sources of corruption we should be
thinking about.  I've thought about this one a lot and it scares me.

I've worked on ways to make it better, but all of them break something
and involve compromises that are nearly as severe as removing exclusive
backups entirely.

Regards,
--
-David
da...@pgmasters.net



Re: Auxiliary Processes and MyAuxProc

2019-02-25 Thread Mike Palmiotto
On Mon, Feb 25, 2019 at 1:29 PM Tom Lane  wrote:
>
> Mike Palmiotto  writes:
> 
>
> > For some context, I'm trying to come up with a patch to set the
> > process identifier (MyAuxProc/am_autovacuumworker/launcher,
> > am_archiver, etc.) immediately after forking,
>
> Don't we do that already?

Kind of. The indicators are set in the Main functions after
InitPostmasterChild and some other setup. In my little bit of digging
I found InitPostmasterChild to be the best place for a centralized
"early start-up hook." Is there a better place you can think of
off-hand?

> 
>
> If memory serves, StartChildProcess already was an attempt to unify
> the treatment of postmaster children.  It's possible that another
> round of unification would be productive, but I think you'll find
> that there are random small differences in requirements that'd
> make it messy.

It kind of seemed like it, but I noticed the small differences in
requirements, which made me a bit hesitant. I'll go ahead and see what
I can do and submit the patch for consideration.

-- 
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com



Re: POC: converting Lists into arrays

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 1:17 PM Tom Lane  wrote:
> I'm not following your point here.  If we change key data structures
> (i.e. parsetrees, plan trees, execution trees) to use some other list-ish
> API, that *in itself* breaks everything that accesses those data
> structures.  The approach I propose here isn't zero-breakage, but it
> requires far fewer places to be touched than a complete API replacement
> would do.

Sure, but if you have third-party code that touches those things,
it'll fail to compile.  With your proposed approach, there seems to be
a risk that it will compile but not work.

> Yup.  So are you saying that we'll never redesign parsetrees again?
> We break things regularly, as long as the cost/benefit justifies it.

I'm mostly objecting to the degree that the breakage is *silent*.

> I completely disagree.  Your proposal is probably an order of magnitude
> more painful than the approach I suggest here, while not really offering
> any additional performance benefit (or if you think there would be some,
> you haven't explained how).  Strictly on cost/benefit grounds, it isn't
> ever going to happen that way.

Why would it be ten times more painful, exactly?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Auxiliary Processes and MyAuxProc

2019-02-25 Thread Tom Lane
Mike Palmiotto  writes:
> I've been looking through the startup code for postmaster forks and
> saw a couple of mechanisms used. Most forks seem to be using
> StartChildProcess with a MyAuxProc enumerator, but then some
> (autovacuum launcher/worker, syslogger, bgworker, archiver) are forked
> through their own start functions.

> I noticed some implication in the pgsql-hackers archives [1] that
> non-AuxProcs are as such because they don't need access to shared
> memory. Is this an accurate explanation?

That was the original idea, but it hasn't stood the test of time very
well.  In particular, the AuxProcType enum only works for child
processes that there's supposed to be exactly one of, so we haven't
used it for autovac workers or bgworkers, although those certainly
have to be connected to shmem (hm, maybe that's not true for all
types of bgworker, not sure).  The autovac launcher is kind of a
weird special case --- I think it could have been included in AuxProcType,
but it wasn't, possibly to minimize the cosmetic difference between it
and autovac workers.

> For some context, I'm trying to come up with a patch to set the
> process identifier (MyAuxProc/am_autovacuumworker/launcher,
> am_archiver, etc.) immediately after forking,

Don't we do that already?

> With the current startup architecture, we have to touch multiple
> entrypoints to achieve the desired effect. Is there any particular
> reason we couldn't fold all of the startup processes into the
> StartChildProcess code and assign MyAuxProc for processes that don't
> currently have one, or is this a non-starter?

If memory serves, StartChildProcess already was an attempt to unify
the treatment of postmaster children.  It's possible that another
round of unification would be productive, but I think you'll find
that there are random small differences in requirements that'd
make it messy.

regards, tom lane



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Bruce Momjian
On Mon, Feb 25, 2019 at 11:33:33AM -0500, Stephen Frost wrote:
> I don't want to see more users stumbling over the issues with the
> exclusive backup interface.  A better interface exists, and has existed
> since 9.6.

Do you really think we would be having this discussion if the
non-exclusive backup method was unequivocally better?  It is better for
some use-cases, and impossible for others.

Also, you can't say it will have no impact for five years on people who
do not upgrade.  The impact will be that they will have no new Postgres
features for five years.

I am not taking a stance on this issue, but making unclear statements
isn't helping the discussion.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread David Steele

On 2/25/19 7:50 PM, Fujii Masao wrote:
> On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe  wrote:
>> I'm not playing devil's advocate here to annoy you.  I see the problems
>> with the exclusive backup, and I see how it can hurt people.
>> I just think that removing exclusive backup without some kind of help
>> like Andres sketched above will make people unhappy.
>
> +1
>
> Another idea is to improve the exclusive backup method so that it can never
> cause such an issue. What about changing the exclusive backup mode of
> pg_start_backup() so that it creates something like a backup_label.pending
> file instead of backup_label? Then if the database cluster has a
> backup_label.pending file but not recovery.signal (this is the case where
> the database is recovered just after the server crashes while an exclusive
> backup is in progress), the recovery using that database cluster always
> ignores (or removes) the backup_label.pending file and starts replaying WAL
> from the REDO location that the pg_control file indicates. So this idea
> enables us to work around the issue that an exclusive backup can cause.

It's an interesting idea.

> On the other hand, the downside of this idea is that users need to change
> the recovery procedure. When they want to do PITR using a backup that has
> backup_label.pending, they need to not only create recovery.signal but also
> rename backup_label.pending to backup_label. Renaming the backup_label file
> is a brand-new step in the recovery procedure, and changing the recovery
> procedure might be painful for some users. But IMO it's less painful than
> removing the exclusive backup API altogether.

Well, given that we have invalidated all prior recovery procedures in
PG12 I'm not sure how big a deal that is.  Of course, it's too late to
make a change like this for PG12.

> Thought?

Here's the really obvious bad thing: if users do not update their
procedures and we ignore backup_label.pending on startup then they will
end up with a corrupt database because it will not replay from the
correct checkpoint.  If we error on the presence of backup_label.pending
then we are right back to where we started.

I know there are backup solutions that rely on copying all required WAL
to pg_xlog/pg_wal before starting recovery.  Those solutions would
silently break in this case and end up in corruption.  If we require
recovery.signal then we still have the current problem of the cluster
not starting after a crash.

> BTW, if recovery.signal is created but backup_label.pending is not renamed
> (this is the case where the operator forgets to rename the file even though
> she or he created the recovery signal file, i.e., a mis-configuration), I
> think that the recovery should emit a PANIC immediately with a HINT like
> "HINT: rename backup_label.pending to backup_label if you want to do PITR".

This causes its own problems, as stated above.

Regards,
--
-David
da...@pgmasters.net



Re: POC: converting Lists into arrays

2019-02-25 Thread Tom Lane
Robert Haas  writes:
> On Sat, Feb 23, 2019 at 9:24 PM Tom Lane  wrote:
>> Every time this has come up, I've opined that the right fix is to jack
>> up the List API and drive a new implementation underneath, as we did
>> once before (cf commit d0b4399d81).

> I'm not really convinced that this is the way to go.  The thing is,
> any third-party code people have that uses a List may simply break.
> If you kept the existing List and changed a bunch of existing code to
> use a new Vector implementation, or Thomas's SimpleVector stuff, then
> that wouldn't happen.

I'm not following your point here.  If we change key data structures
(i.e. parsetrees, plan trees, execution trees) to use some other list-ish
API, that *in itself* breaks everything that accesses those data
structures.  The approach I propose here isn't zero-breakage, but it
requires far fewer places to be touched than a complete API replacement
would do.

Just as with the dlist/slist stuff, inventing a new list API might be
acceptable for all-new data structures that didn't exist before, but
it isn't going to really help for code and data structures that've been
around for decades.

> If you could
> replace the existing implementation without breaking any code, that
> would be a no-brainer but there's no real way to do that and get the
> performance benefits you're seeking to obtain.

Yup.  So are you saying that we'll never redesign parsetrees again?
We break things regularly, as long as the cost/benefit justifies it.

> It is also perhaps worth mentioning that reimplementing a List as an
> array means that it is... not a list.  That is a pretty odd state of
> affairs, and to me is another sign that we want to leave the existing
> thing alone and convert some/most/all core code to use a new thing.

I completely disagree.  Your proposal is probably an order of magnitude
more painful than the approach I suggest here, while not really offering
any additional performance benefit (or if you think there would be some,
you haven't explained how).  Strictly on cost/benefit grounds, it isn't
ever going to happen that way.

regards, tom lane



Re: POC: converting Lists into arrays

2019-02-25 Thread Robert Haas
On Sat, Feb 23, 2019 at 9:24 PM Tom Lane  wrote:
> Every time this has come up, I've opined that the right fix is to jack
> up the List API and drive a new implementation underneath, as we did
> once before (cf commit d0b4399d81).  I thought maybe it was about time
> to provide some evidence for that position, so attached is a POC patch
> that changes Lists into expansible arrays, while preserving most of
> their existing API.

I'm not really convinced that this is the way to go.  The thing is,
any third-party code people have that uses a List may simply break.
If you kept the existing List and changed a bunch of existing code to
use a new Vector implementation, or Thomas's SimpleVector stuff, then
that wouldn't happen.  The reason why people - or at least me - have
been reluctant to accept that you can just jack up the API and drive a
new implementation underneath is that the new implementation will
involve breaking guarantees on which existing code relies; indeed,
your email makes it pretty clear that this is the case.  If you could
replace the existing implementation without breaking any code, that
would be a no-brainer but there's no real way to do that and get the
performance benefits you're seeking to obtain.

It is also perhaps worth mentioning that reimplementing a List as an
array means that it is... not a list.  That is a pretty odd state of
affairs, and to me is another sign that we want to leave the existing
thing alone and convert some/most/all core code to use a new thing.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Auxiliary Processes and MyAuxProc

2019-02-25 Thread Mike Palmiotto
Greetings,

I've been looking through the startup code for postmaster forks and
saw a couple of mechanisms used. Most forks seem to be using
StartChildProcess with a MyAuxProc enumerator, but then some
(autovacuum launcher/worker, syslogger, bgworker, archiver) are forked
through their own start functions.

I noticed some implication in the pgsql-hackers archives [1] that
non-AuxProcs are as such because they don't need access to shared
memory. Is this an accurate explanation?

For some context, I'm trying to come up with a patch to set the
process identifier (MyAuxProc/am_autovacuumworker/launcher,
am_archiver, etc.) immediately after forking, so we can provide
process context info to a hook in early startup. I wanted to make sure
to do things as cleanly as possible.

With the current startup architecture, we have to touch multiple
entrypoints to achieve the desired effect. Is there any particular
reason we couldn't fold all of the startup processes into the
StartChildProcess code and assign MyAuxProc for processes that don't
currently have one, or is this a non-starter?

Thank you for your time.

--
Mike Palmiotto
Software Engineer
Crunchy Data Solutions
https://crunchydata.com


[1] 
https://www.postgresql.org/message-id/20181127.175949.06807946.horiguchi.kyotaro%40lab.ntt.co.jp



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Fujii Masao
On Mon, Feb 25, 2019 at 10:49 PM Laurenz Albe  wrote:
>
> Stephen Frost wrote:
> > > It will be annoying if after this removal, companies must change their
> > > backup strategy by using specific postgres tools (pgbackrest, barman).
> >
> > You don't write your own database system using CSV files and shell
> > magic, do you?  I have to say that it continues to boggle my mind that
> > people insist that *this* part of the system has to be able to be
> > implementable using shell scripts.
> >
> > Folks, these are your backups we're talking about, your last resort if
> > everything else goes up in flames, why do you want to risk that by
> > implementing your own one-off solution, particularly when there's known
> > serious issues using that interface, and you want to just use shell
> > scripts to do it...?
>
> If we didn't think that simplicity in backup has some value, why does
> PostgreSQL provide pg_basebackup?
>
> Not everybody wants to use tools like pgbackrest and barman because it
> takes some effort to set them up properly.  It seems that you think that
> people who want something simpler are irresponsible.
>
> Incidentally, both barman and pgbackrest stream the backup to a backup server.
>
> I think the most important use case for the low level backup API is when
> your database is so large that you want to leverage storage techniques
> like snapshots for backup, because you can't or don't want to stream all
> those terabytes across the network.
>
> I'm not playing devil's advocate here to annoy you.  I see the problems
> with the exclusive backup, and I see how it can hurt people.
> I just think that removing exclusive backup without some kind of help
> like Andres sketched above will make people unhappy.

+1

Another idea is to improve the exclusive backup method so that it can never
cause such an issue. What about changing the exclusive backup mode of
pg_start_backup() so that it creates something like a backup_label.pending
file instead of backup_label? Then if the database cluster has a
backup_label.pending file but not recovery.signal (this is the case where
the database is recovered just after the server crashes while an exclusive
backup is in progress), the recovery using that database cluster always
ignores (or removes) the backup_label.pending file and starts replaying WAL
from the REDO location that the pg_control file indicates. So this idea
enables us to work around the issue that an exclusive backup can cause.

On the other hand, the downside of this idea is that users need to change
the recovery procedure. When they want to do PITR using a backup that has
backup_label.pending, they need to not only create recovery.signal but also
rename backup_label.pending to backup_label. Renaming the backup_label file
is a brand-new step in the recovery procedure, and changing the recovery
procedure might be painful for some users. But IMO it's less painful than
removing the exclusive backup API altogether.

Thought?

BTW, if recovery.signal is created but backup_label.pending is not renamed
(this is the case where the operator forgets to rename the file even though
she or he created the recovery signal file, i.e., a mis-configuration), I
think that the recovery should emit a PANIC immediately with a HINT like
"HINT: rename backup_label.pending to backup_label if you want to do PITR".

Regards,

-- 
Fujii Masao



Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
I wrote:
> Amit Kapila  writes:
>> Avoid creation of the free space map for small heap relations, take 2.

> I think this patch still has some issues.

Just out of curiosity ... how can it possibly be even a little bit sane
that fsm_local_map is a single static data structure, without even any
indication of which table it is for?

If, somehow, there's a rational argument for that design, why is it
not explained in freespace.c?  The most charitable interpretation of
what I see there is that it's fatally undercommented.

regards, tom lane



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Christophe Pettus



> On Feb 25, 2019, at 08:55, Stephen Frost  wrote:
> 
> I honestly do doubt that they have had the same experiences that I have
> had

Well, I guarantee you that no two people on this list have had identical 
experiences. :)  I certainly have been bitten by the problems with the current 
system.  But the resistance to major version upgrades is *huge*, and I'm 
strongly biased against anything that will make that harder.  I'm not sure I'm 
communicating how big a problem telling many large installations, "If you move 
to v12/13/etc., you will have to change your backup system" is going to be.

--
-- Christophe Pettus
   x...@thebuild.com




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:09 AM Stephen Frost  wrote:
> > I do think we can ask that people who wish to have a capability included
> > in PG (or continue to be included when there are serious known issues
> > with it...) be prepared to either build and maintain it themselves or to
> > convince someone else to do so (or both, and have a committer agree to
> > it).  That's how we've long operated and it wasn't my intent to imply
> > otherwise, but I agree that I could have said it in a nicer way to avoid
> > it coming across as telling Christophe what to do.
> 
> I think it is certainly true that if you want a new capability added,
> you have to either write it yourself or get someone to do it for you.
> There is SOME precedent for the proposition that if you want an
> obsolete capability not to be removed, you should be prepared to help
> fix it.  But frankly I think the latter proposition is one we've taken
> only occasionally, and only for things that were a whole lot cruftier
> and more problematic than this is.  For example, if a library is no
> longer available and we have a contrib module that depends on that
> library, it's reasonable to say that we can't keep the contrib module
> unless somebody rewrites it not to depend on that library any more.
> But the current situation is completely different.  The code
> maintenance burden resulting from keeping exclusive-mode backups is
> mostly hypothetical; the code works as well as it ever did.  We rarely
> take the position that we're going to rip out older code that doesn't
> work as nicely as newer code does just because nobody's prepared to
> e.g. better-document the old code.

I'd argue that if something which has been deprecated for years is
seriously getting in the way of making progress that we should be
willing to accept a proposal to rip it out after giving people
sufficient time to adjust, and that if someone really wants to keep that
deprecated API then they need to be willing to spend the time to address
the reasons why it was deprecated.

> For example, we still have FEBE protocol 2 support floating around.
> If you're going to talk about badly-designed footguns that ought to be
> excised with extreme prejudice, that would IMHO be a far better place
> to start than this.  Unlike this code, which it's now obvious is used
> by quite a number of people even just among those who read this list
> regularly, that code is probably nearly unused and untested and, if we
> broke it, we probably wouldn't know.

You can probably guess my feelings with regard to the FEBE v2 support.

At least, in that case, people aren't really using it though.  What I
really dislike is that we've got a *lot* of people using something that
we *know* has some pretty serious potential data-eating problems.
Saying that we should keep it around because a lot of people are using
it isn't the way to persuade me that we should keep it.

Thanks!

Stephen


signature.asc
Description: PGP signature


Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:33 AM Stephen Frost  wrote:
> > > Removing an option that people are currently using, and which they
> > > find better than other available options for reasons with which I
> > > understand that you disagree, will make users more sad.  Happy is
> > > better.
> >
> > I don't want to see more users stumbling over the issues with the
> > exclusive backup interface.  A better interface exists, and has existed
> > since 9.6.  The exclusive backup method was deprecated in 2016.  One of
> > the really bad things about the exclusive backup method is that it
> > *looks* like it works well and that there aren't any issues with it, but
> > when users hit the issues, they get very sad.  We might have 1000s of
> > happy users who never run into those issues and therefore don't want to
> > see us change anything, but what about the 10s or 100s of users who do
> > hit those issues, what do we tell them?  Seems like we're saying "well,
> > sorry, we knew those issues existed and it's unfortunate you hit them,
> > but there's this better thing that you *should* have been using and then
> > you wouldn't have hit those issues."  That doesn't seem likely to make
> > people happy with us.
> 
> Sure, nobody wants users to keep hitting the same problems over and
> over again, but the above is still reductionist.  If we take the view
> that anything that can cause data corruption in any scenario, no
> matter how remote, is more than everything else, then we should just
> stop shipping minor releases and halt all other development work until
> we deal with https://commitfest.postgresql.org/22/528/ that has been
> open for 14 CommitFests and until we've fully dealt with every aspect
> of fsync-gate.  I don't see you treating either of those issues with
> the same sense of urgency with which you're treating this one, so
> evidently you feel entitled to decide which
> potentially-data-corrupting issues you want to attack first.

I'm not advocating for removing it today or tomorrow, or in some point
release.  We're already to the stage where we're talking about
something that wouldn't hit until late 2020.  Also, frankly, while there
might be something I could do to help, the whole madness with fsync gate
seems to be getting dealt with by some very smart people who know a
whole lot more about the internals of Linux and FreeBSD and I'd rather
not get in their way.  On the other hand, dealing with backups and
issues around backups has been something that I at least hope I've
gotten decent at.

> And I agree that you should have exactly that right.  Along similar
> lines, I'd like our users to have the right to decide when they want
> to upgrade from using exclusive backup mode to non-exclusive backup
> mode, instead of being forced into making a change at a time decided
> by our fiat.

Our users *have* that option, and have for over two years now, and
it'll have been 4 years by the time v13 comes out.  I agree that the
non-exclusive backup mode isn't as simple as the exclusive backup mode-
to use your example, what we *used* to be doing with fsync() was sure a
lot simpler than what we're going to be doing in the future, but
sometimes people have to make changes, even to more complicated APIs,
to make sure that they're doing things the right way to ensure that
their data doesn't get eaten.

Thanks!

Stephen




Re: pgsql: Avoid creation of the free space map for small heap relations, t

2019-02-25 Thread Tom Lane
Amit Kapila  writes:
> Avoid creation of the free space map for small heap relations, take 2.

I think this patch still has some issues.  Note the following two
recent buildfarm failures:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=petalura&dt=2019-02-20%2004%3A20%3A01
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dragonet&dt=2019-02-24%2022%3A48%3A02

Both of them failed in an assertion added by this patch:

TRAP: FailedAssertion("!((rel->rd_rel->relkind == 'r' || rel->rd_rel->relkind 
== 't') && fsm_local_map.map[oldPage] == 0x01)", File: 
"/home/bf/build/buildfarm-petalura/HEAD/pgsql.build/../pgsql/src/backend/storage/freespace/freespace.c",
 Line: 229)
...
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:4] LOG:  server process (PID 17981) 
was terminated by signal 6: Aborted
2019-02-20 06:23:26.660 CET [5c6ce440.3bd2:5] DETAIL:  Failed process was 
running: INSERT INTO T VALUES ( $1 )

I can even offer you a theory as to why we're managing to get through all
the core regression tests and then failing in ecpg: the failing test is
ecpg/test/thread/prep.pgc, and I think that may be the only place in our
tests where we stress concurrent insertions into a single small table.
So I think you've got race-condition issues here.

It might or might not be significant that these two animals are part
of Andres' JIT-enabled menagerie.  I could see that affecting test
timing to the point of showing a symptom that's otherwise hard to hit.

regards, tom lane
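
[Editorial note: for readers following along, here is a toy Python sketch of the kind of stale-cache race Tom is describing. It is entirely hypothetical and not the actual freespace.c code: each "backend" keeps a private map claiming the page still has free space, concurrent inserters fill the page behind its back, and a later check finds the cached entry disagreeing with reality, which is the sort of inconsistency the failed assertion trips on.]

```python
import threading

PAGE_CAPACITY = 2          # rows that fit on the single toy "page"
shared_page = []           # shared state: rows actually on the page
page_lock = threading.Lock()
stale_map_hits = []        # inserts whose cached map entry was wrong

def backend(rows):
    # per-backend cache, like the backend-local FSM map: "page 0 has space"
    local_map = {0: True}
    for row in rows:
        if not local_map[0]:
            continue  # cache says the page is full; skip it
        with page_lock:
            if len(shared_page) < PAGE_CAPACITY:
                shared_page.append(row)
            else:
                # another backend filled the page since we cached the
                # entry: cache and reality now disagree
                stale_map_hits.append(row)
                local_map[0] = False

t1 = threading.Thread(target=backend, args=(["a1", "a2"],))
t2 = threading.Thread(target=backend, args=(["b1", "b2"],))
t1.start(); t2.start(); t1.join(); t2.join()

# four insert attempts against a two-row page: the page always fills,
# and at least one backend always acts on a stale cached entry
print(len(shared_page), len(stale_map_hits) >= 1)
```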



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Mon, Feb 25, 2019 at 11:23 AM Stephen Frost  wrote:
> > If that argument did matter, we could go back and find the prior
> > discussions about the issues around the exclusive backup mode and about
> > removing it, or next year we could point to this thread about it, or the
> > year after, and say "well, we talked about it a lot, so now let's just
> > do it", but it all ends up looking the same to our users unless we
> > actually write into some kind of user-facing documentation or WARNINGs
> > or similar that something is going away.
> 
> That's true.  But nobody's objecting to strengthening the warnings in
> the documentation.  People are objecting to removing the thing itself.

Then how long will we carry it forward?  How much warning do we need to
provide?

> > This isn't a fair argument either because we're having this *after* the
> > new API was implemented for backup- namely non-exclusive mode, which
> > *did* give people something better to use instead.
> 
> There are several people who are saying that doesn't meet all their
> needs, giving reasons why it's problematic, and suggesting things that
> could be done to make it less problematic.  It's not OK to say "we can
> remove exclusive backup mode because now we have non-exclusive backup
> mode" unless other people actually agree that all the use cases are
> covered, and it appears that they don't.

I haven't heard anyone say it doesn't meet their needs, just that it's
not as easy to use, which is a fair criticism, but not the same thing.
I'm all for making the non-exclusive mode less problematic if we're able
to.  If that's something that would help us move forward with getting
rid of the exclusive backup mode then that's an area that I'd be willing
to put resources into.

> > I disagree quite a bit with this statement- the existing documentation
> > is absolutely horrid and needs to be completely ripped out and rewritten
> > and maintaining the exclusive backup mode in such a rewrite would
> > absolutely be a *lot* of additional work.
> >
> > We actually took a shot at trying to improve the documentation while
> > continuing to cover both the exclusive and the non-exclusive mode and
> > it's hugely painful.
> 
> Well, perhaps that's pain you need to incur.  The alternative seems to
> be to rip out something that people don't want ripped out.

... now I feel like I'm being told what to do.

The point is that keeping it actually *is* work, it isn't free, not even
in our tree.  Making our users stumble over the issues with it also
isn't free, that's another cost of keeping it.

> > I used to be one of those people.  I know that it looks fine and it
> > certainly seems appealing but having gone through bad experiences with
> > it, and seen others stumble through those same experiences time and time
> > again, I've learned that it really is an issue, and one that I would
> > very much like to avoid causing future users to stumble over.
> 
> Sure, that sounds great.  But the way to do that is to continue
> improving the system until exclusive-mode backups really are not a
> useful thing any more, not to remove it while there are still a lot of
> people relying on it who can offer tangible explanations for their
> choice to do so.
> 
> It feels to me like you are portraying the increasing number of people
> objecting to this change as naive or foolish or at least not as
> enlightened as you are, and I really object to that.  I happen to
> think that people like Christophe Pettus and Fujii Masao and Laurenz
> Albe are smart people whose opinions ought to be taken as just as
> valid as your own.

I honestly do doubt that they have had the same experiences that I have
had, but that doesn't mean that they're not smart or that I don't value
their opinion- I agree that they're all smart people and I certainly do
value their opinions.  That doesn't mean that I can't disagree with them
or can't explain why I disagree.

Thanks!

Stephen




Re: Should we increase the default vacuum_cost_limit?

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 8:39 AM David Rowley
 wrote:
> I decided to do the times by 10 option that I had mentioned.  Ensue
> debate about that...

+1 for raising the default substantially.  In my experience, and it
seems others are in a similar place, nobody ever gets into trouble
because the default is too high, but sometimes people get in trouble
because the default is too low.  If we raise it enough that a few
people have to reduce it and a few people have to further increase it,
IMHO that would be about right.  Not sure exactly what value would
accomplish that goal.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
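
[Editorial note: to put rough numbers on why the 15-year-old default feels low, here is a back-of-the-envelope sketch. Cost-based vacuum delay accumulates a per-page cost (vacuum_cost_page_hit/miss/dirty) and sleeps for vacuum_cost_delay once the total reaches vacuum_cost_limit, so the limit caps throughput roughly as below. This is a simplified model that assumes every page has the same cost, ignoring mixed hit/miss/dirty workloads.]

```python
def vacuum_throughput_mb_per_s(cost_limit, cost_delay_ms, page_cost, page_kb=8):
    """Rough upper bound on vacuum I/O rate under cost-based delay.

    Vacuum sleeps for cost_delay_ms each time the accumulated cost
    reaches cost_limit, so per delay round it touches about
    cost_limit / page_cost pages of page_kb kilobytes each.
    """
    pages_per_round = cost_limit / page_cost
    rounds_per_second = 1000.0 / cost_delay_ms
    return pages_per_round * rounds_per_second * page_kb / 1024.0

# then-current defaults: limit=200, delay=20ms, cost 10 per page read
print(vacuum_throughput_mb_per_s(200, 20, 10))   # ~7.8 MB/s of reads
# a 10x limit lifts the ceiling proportionally
print(vacuum_throughput_mb_per_s(2000, 20, 10))  # ~78 MB/s
```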



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 11:33 AM Stephen Frost  wrote:
> > Removing an option that people are currently using, and which they
> > find better than other available options for reasons with which I
> > understand that you disagree, will make users more sad.  Happy is
> > better.
>
> I don't want to see more users stumbling over the issues with the
> exclusive backup interface.  A better interface exists, and has existed
> since 9.6.  The exclusive backup method was deprecated in 2016.  One of
> the really bad things about the exclusive backup method is that it
> *looks* like it works well and that there aren't any issues with it, but
> when users hit the issues, they get very sad.  We might have 1000s of
> happy users who never run into those issues and therefore don't want to
> see us change anything, but what about the 10s or 100s of users who do
> hit those issues, what do we tell them?  Seems like we're saying "well,
> sorry, we knew those issues existed and it's unfortunate you hit them,
> but there's this better thing that you *should* have been using and then
> you wouldn't have hit those issues."  That doesn't seem likely to make
> people happy with us.

Sure, nobody wants users to keep hitting the same problems over and
over again, but the above is still reductionist.  If we take the view
that anything that can cause data corruption in any scenario, no
matter how remote, is more important than everything else, then we should just
stop shipping minor releases and halt all other development work until
we deal with https://commitfest.postgresql.org/22/528/ that has been
open for 14 CommitFests and until we've fully dealt with every aspect
of fsync-gate.  I don't see you treating either of those issues with
the same sense of urgency with which you're treating this one, so
evidently you feel entitled to decide which
potentially-data-corrupting issues you want to attack first.

And I agree that you should have exactly that right.  Along similar
lines, I'd like our users to have the right to decide when they want
to upgrade from using exclusive backup mode to non-exclusive backup
mode, instead of being forced into making a change at a time decided
by our fiat.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Referential Integrity Checks with Statement-level Triggers

2019-02-25 Thread Corey Huinker
>
>
> In order to avoid per-row calls of the constraint trigger functions, we
> could
> try to "aggregate" the constraint-specific events somehow, but I think a
> separate queue would be needed for the constraint-specific events.
>
> In general, the (after) triggers and constraints have too much in common,
> so
> separation of these w/o seeing code changes is beyond my imagination.
>
>
Yeah, there's a lot of potential for overlap where a trigger could "borrow"
an RI tuplestore or vice versa.

The people who expressed opinions on nuking triggers from orbit (it's the
only way to be sure) have yet to offer up any guidance on how to proceed
from here, and I suspect it's because they're all very busy getting things
ready for v12. I definitely have an interest in working on this for 13, but
I don't feel good about striking out on my own without their input.


Re: Should we increase the default vacuum_cost_limit?

2019-02-25 Thread Julien Rouhaud
On Mon, Feb 25, 2019 at 4:44 PM Laurenz Albe  wrote:
>
> David Rowley wrote:
> > On Tue, 26 Feb 2019 at 02:06, Joe Conway  wrote:
> > >
> > > On 2/25/19 1:17 AM, Peter Geoghegan wrote:
> > > > On Sun, Feb 24, 2019 at 9:42 PM David Rowley
> > > >  wrote:
> > > >> The current default vacuum_cost_limit of 200 seems to be 15 years old
> > > >> and was added in f425b605f4e.
> > > >>
> > > >> Any supporters for raising the default?
> > [...]
> > I'll add this to the March commitfest and set the target version as PG12.
>
> I think this is a good move.
>
> It is way easier to recover from an over-eager autovacuum than from
> one that didn't manage to finish...

+1



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 11:23 AM Stephen Frost  wrote:
> If that argument did matter, we could go back and find the prior
> discussions about the issues around the exclusive backup mode and about
> removing it, or next year we could point to this thread about it, or the
> year after, and say "well, we talked about it a lot, so now let's just
> do it", but it all ends up looking the same to our users unless we
> actually write into some kind of user-facing documentation or WARNINGs
> or similar that something is going away.

That's true.  But nobody's objecting to strengthening the warnings in
the documentation.  People are objecting to removing the thing itself.

> This isn't a fair argument either because we're having this *after* the
> new API was implemented for backup- namely non-exclusive mode, which
> *did* give people something better to use instead.

There are several people who are saying that doesn't meet all their
needs, giving reasons why it's problematic, and suggesting things that
could be done to make it less problematic.  It's not OK to say "we can
remove exclusive backup mode because now we have non-exclusive backup
mode" unless other people actually agree that all the use cases are
covered, and it appears that they don't.

> I disagree quite a bit with this statement- the existing documentation
> is absolutely horrid and needs to be completely ripped out and rewritten
> and maintaining the exclusive backup mode in such a rewrite would
> absolutely be a *lot* of additional work.
>
> We actually took a shot at trying to improve the documentation while
> continuing to cover both the exclusive and the non-exclusive mode and
> it's hugely painful.

Well, perhaps that's pain you need to incur.  The alternative seems to
be to rip out something that people don't want ripped out.

> I used to be one of those people.  I know that it looks fine and it
> certainly seems appealing but having gone through bad experiences with
> it, and seen others stumble through those same experiences time and time
> again, I've learned that it really is an issue, and one that I would
> very much like to avoid causing future users to stumble over.

Sure, that sounds great.  But the way to do that is to continue
improving the system until exclusive-mode backups really are not a
useful thing any more, not to remove it while there are still a lot of
people relying on it who can offer tangible explanations for their
choice to do so.

It feels to me like you are portraying the increasing number of people
objecting to this change as naive or foolish or at least not as
enlightened as you are, and I really object to that.  I happen to
think that people like Christophe Pettus and Fujii Masao and Laurenz
Albe are smart people whose opinions ought to be taken as just as
valid as your own.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Stephen Frost
Greetings,

* Robert Haas (robertmh...@gmail.com) wrote:
> On Fri, Feb 22, 2019 at 6:18 PM Stephen Frost  wrote:
> > What we need is a backup tool included in core that users feel
> > comfortable using instead of trying to write their own.
> 
> I agree.  That's a great idea.  Let's talk about how to make that
> happen.  Providing a tool that gives people MORE AND BETTER options
> than what they have today is likely to make users more happy.

I've been working for years on making this happen, and am currently
putting quite a large amount of resources into what I hope will be the
final step in the process to get a solution that will have a great many
more and better options than what's available today.  Maybe it'll end up
happening, which I think would be great, maybe it won't, which would be
sad, but at least we'll have tried.

> Removing an option that people are currently using, and which they
> find better than other available options for reasons with which I
> understand that you disagree, will make users more sad.  Happy is
> better.

I don't want to see more users stumbling over the issues with the
exclusive backup interface.  A better interface exists, and has existed
since 9.6.  The exclusive backup method was deprecated in 2016.  One of
the really bad things about the exclusive backup method is that it
*looks* like it works well and that there aren't any issues with it, but
when users hit the issues, they get very sad.  We might have 1000s of
happy users who never run into those issues and therefore don't want to
see us change anything, but what about the 10s or 100s of users who do
hit those issues, what do we tell them?  Seems like we're saying "well,
sorry, we knew those issues existed and it's unfortunate you hit them,
but there's this better thing that you *should* have been using and then
you wouldn't have hit those issues."  That doesn't seem likely to make
people happy with us.

Thanks!

Stephen
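
[Editorial note: for anyone who hasn't hit the failure mode Stephen alludes to, here is a deliberately simplified Python model. It is not PostgreSQL code; the backup_label file name is real, everything else is a toy. Exclusive mode writes backup_label into the live data directory, so a crash mid-backup leaves the file behind and the restarted server mistakes ordinary crash recovery for a restore from backup; non-exclusive mode hands the label to the client, so a crash leaves nothing behind.]

```python
import os
import tempfile

LABEL = "START WAL LOCATION: 0/2000028 (file 000000010000000000000002)\n"

def start_backup(datadir, exclusive):
    if exclusive:
        # exclusive mode: the label lands in the live data directory
        with open(os.path.join(datadir, "backup_label"), "w") as f:
            f.write(LABEL)
        return None
    # non-exclusive mode: the label is returned to the client, which
    # must keep its session open and write the file into the *copy*
    return LABEL

def recovery_mode_after_crash(datadir):
    # a leftover backup_label makes startup behave as if this cluster
    # were restored from a backup, looking for WAL it may not have
    if os.path.exists(os.path.join(datadir, "backup_label")):
        return "treated-as-restored-backup"   # the data-eating path
    return "normal-crash-recovery"

with tempfile.TemporaryDirectory() as pgdata:
    start_backup(pgdata, exclusive=True)
    # server crashes here, before pg_stop_backup can remove the file
    print(recovery_mode_after_crash(pgdata))  # treated-as-restored-backup

with tempfile.TemporaryDirectory() as pgdata:
    label = start_backup(pgdata, exclusive=False)
    print(recovery_mode_after_crash(pgdata))  # normal-crash-recovery
```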




Re: Remove Deprecated Exclusive Backup Mode

2019-02-25 Thread Robert Haas
On Mon, Feb 25, 2019 at 11:09 AM Stephen Frost  wrote:
> I do think we can ask that people who wish to have a capability included
> in PG (or continue to be included when there are serious known issues
> with it...) be prepared to either build and maintain it themselves or to
> convince someone else to do so (or both, and have a committer agree to
> it).  That's how we've long operated and it wasn't my intent to imply
> otherwise, but I agree that I could have said it in a nicer way to avoid
> it coming across as telling Christophe what to do.

I think it is certainly true that if you want a new capability added,
you have to either write it yourself or get someone to do it for you.
There is SOME precedent for the proposition that if you want an
obsolete capability not to be removed, you should be prepared to help
fix it.  But frankly I think the latter proposition is one we've taken
only occasionally, and only for things that were a whole lot cruftier
and more problematic than this is.  For example, if a library is no
longer available and we have a contrib module that depends on that
library, it's reasonable to say that we can't keep the contrib module
unless somebody rewrites it not to depend on that library any more.
But the current situation is completely different.  The code
maintenance burden resulting from keeping exclusive-mode backups is
mostly hypothetical; the code works as well as it ever did.  We rarely
take the position that we're going to rip out older code that doesn't
work as nicely as newer code does just because nobody's prepared to
e.g. better-document the old code.

For example, we still have FEBE protocol 2 support floating around.
If you're going to talk about badly-designed footguns that ought to be
excised with extreme prejudice, that would IMHO be a far better place
to start than this.  Unlike this code, which it's now obvious is used
by quite a number of people even just among those who read this list
regularly, that code is probably nearly unused and untested and, if we
broke it, we probably wouldn't know.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


