Re: Problems with plan estimates in postgres_fdw

2018-10-08 Thread Etsuro Fujita

(2018/10/05 19:15), Etsuro Fujita wrote:

(2018/08/02 23:41), Tom Lane wrote:

Andrew Gierth writes:

[ postgres_fdw is not smart about exploiting fast-start plans ]


Yeah, that's basically not accounted for at all in the current design.


One possibility: would it be worth adding an option to EXPLAIN that
makes it assume cursor_tuple_fraction?


[ handwaving ahead ]

I wonder whether it could be done without destroying postgres_fdw's
support for old servers, by instead including a LIMIT in the query sent
for explaining. The trick would be to know what value to put as the
limit, though. It'd be easy to do if we were willing to explain the query
twice (the second time with a limit chosen as a fraction of the rowcount
seen the first time), but man that's an expensive solution.

Another component of any real fix here would be to issue "SET
cursor_tuple_fraction" before opening the execution cursor, so as to
ensure that we actually get an appropriate plan on the remote side.

If we could tell whether there's going to be any use in fast-start plans,
it might make sense to build two scan paths for a foreign table, one
based
on a full-table scan and one based on EXPLAIN ... LIMIT 1. This still
means two explain requests, which is why I'm not thrilled about doing it
unless there's a high probability of the extra explain being useful.


Agreed, but ISTM that to address the original issue, it would be enough
to just add LIMIT (or ORDER BY LIMIT) pushdown to postgres_fdw based on
the upper-planner-pathification work.


Will work on it unless somebody else wants to.

Best regards,
Etsuro Fujita



Re: [HACKERS] Optional message to user when terminating/cancelling backend

2018-10-08 Thread Michael Paquier
On Fri, Oct 05, 2018 at 10:11:45AM +0200, Daniel Gustafsson wrote:
> Thanks!  Attached is a v17 which rebases the former 0002 patch on top of
> current master, along with the test fix for Windows that Thomas reported
> upthread (no other changes introduced over earlier versions).

Thanks for the new version.

In order to make a test with non-ASCII characters in the error message,
we could use a trick similar to xml.sql: use a function which ignores
the test case if server_encoding is not UTF-8 and use something like
chr() to pass down a message with non-ASCII characters.  Then if the
function catches the error we are good.

+   *pid = slot->src_pid;
+   slot->orig_length = 0;
+   slot->message[0] = '\0';
Message consumption should be the part using memset(0), no?  System
calls are avoided within a spinlock section, but memset and strlcpy
should be fast enough.  Those are used in WalReceiverMain() for
example.

The facility to store messages should be in its own file, and
signalfuncs.c should depend on it, something like signal_message.c?

+typedef struct
+{
+   pid_t   dest_pid;   /* The pid of the process being signalled */
+   pid_t   src_pid;/* The pid of the processing signalling */
+   slock_t mutex;  /* Per-slot protection */
+   char    message[MAX_CANCEL_MSG]; /* Message to send to signalled backend */
+   int orig_length;/* Length of the message as passed by the user,
+* if this length exceeds MAX_CANCEL_MSG it will
+* be truncated but we store the original length
+* in order to be able to convey truncation
+*/
+   int sqlerrcode; /* errcode to use when signalling backend */
+} BackendSignalFeedbackShmemStruct;

A couple of thoughts here:
- More information could make this facility more generic: an elevel to
be able to fully manipulate the custom error message, and a breakpoint
definition.  As far as I can see you have two of them in the patch which
are the callers of ConsumeBackendSignalFeedback().  One event cancels
the query, and another terminates it.  In both cases, the breakpoint
implies the elevel.  So could we have more possible breakpoints, and
should we try to make this API more pluggable from the start?
- Is orig_length really useful in the data stored?  Why not just issue a
warning or notice to the caller defining the message, and store only the
message itself?
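The truncation-tracking idea behind orig_length can be sketched as follows. This is an illustrative sketch, not the patch's code: SignalFeedbackSlot, set_slot_message, and the MAX_CANCEL_MSG value are invented here for the example.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CANCEL_MSG 128      /* illustrative slot size */

typedef struct SignalFeedbackSlot
{
    char message[MAX_CANCEL_MSG];
    int  orig_length;           /* length as supplied by the user */
} SignalFeedbackSlot;

/*
 * Store a message in the slot, truncating it to the buffer size while
 * remembering the original length so the consumer can convey that
 * truncation happened.  Returns true if the message was truncated.
 */
static bool
set_slot_message(SignalFeedbackSlot *slot, const char *msg)
{
    size_t len = strlen(msg);

    slot->orig_length = (int) len;
    /* copy at most MAX_CANCEL_MSG - 1 bytes and NUL-terminate */
    strncpy(slot->message, msg, MAX_CANCEL_MSG - 1);
    slot->message[MAX_CANCEL_MSG - 1] = '\0';
    return len > MAX_CANCEL_MSG - 1;
}
```

The consumer can then compare orig_length against the stored string's length to decide whether to report truncation, which is the alternative to warning the sender up front.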

So, looking at your patch, I am wondering also about splitting it into
a couple of pieces for clarity:
- The base facility to be able to register and consume messages which
can be attached to a backend slot, and then be accessed by other things.
Let's think about it as a base facility which is useful for extensions
and module developers.  If coupled with a hook, it would be actually
possible to scan a backend's slot for some information which could be
consumed afterwards for custom error messages...
- The set of breakpoints which can then be used to define if a given
code path should show up a custom error message.  I can think of three
of them: cancel, termination and extension, which is a custom path that
extensions can use.  The extension breakpoint would make sense as
something included from the start, could be stored as an integer and I
guess should not be an enum but a set of defined tables (see pgstat.h
for wait event categories for example).
- The set of SQL functions making use of it in-core, in your case these
are the SQL functions for cancellation and termination.
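As a rough illustration of the breakpoint idea, where the breakpoint implies the elevel for the built-in cases but an extension breakpoint carries its own: all names and numeric values below are invented for this sketch, not taken from elog.h or the patch.

```c
#include <assert.h>

/* Hypothetical breakpoint kinds; cancel and terminate are the two used
 * in-core, extension is the custom path extensions could use. */
typedef enum SignalBreakpoint
{
    SIGNAL_BREAKPOINT_CANCEL,
    SIGNAL_BREAKPOINT_TERMINATE,
    SIGNAL_BREAKPOINT_EXTENSION
} SignalBreakpoint;

/* Illustrative severity codes (not PostgreSQL's real elevel values). */
#define SKETCH_ERROR 20
#define SKETCH_FATAL 21

/* Map a breakpoint to the elevel used when reporting the message. */
static int
breakpoint_elevel(SignalBreakpoint bp, int custom_elevel)
{
    switch (bp)
    {
        case SIGNAL_BREAKPOINT_CANCEL:
            return SKETCH_ERROR;    /* query cancel keeps the session */
        case SIGNAL_BREAKPOINT_TERMINATE:
            return SKETCH_FATAL;    /* termination ends the session */
        case SIGNAL_BREAKPOINT_EXTENSION:
            return custom_elevel;   /* extensions pick their own level */
    }
    return SKETCH_ERROR;
}
```

Storing the breakpoint as a plain integer, as suggested above for pgstat.h-style categories, would let extensions define values beyond the built-in set.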
--
Michael


signature.asc
Description: PGP signature


Re: pread() and pwrite()

2018-10-08 Thread Tom Lane
I wrote:
> Thomas Munro  writes:
>> On Tue, Oct 9, 2018 at 2:55 PM Tom Lane  wrote:
>>> Yeah, I've been burnt by that too recently.  It occurs to me we could make
>>> that at least a little less painful if we formatted the macro with one
>>> line per function name:

>> +1, was about to suggest the same!

> Sold, I'll go do it.

Learned a few new things about M4 along the way :-( ... but done.
You'll need to rebase the pread patch again of course.

regards, tom lane



Re: Refactor textToQualifiedNameList()

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 03:22:55PM +, Pavel Stehule wrote:
> I tested this patch. This patch removes some duplicated lines, which is
> good - on the other hand, after this patch, textToQualifiedNameList
> makes one extra copy of the input string. I looked at where this
> function is used, and I don't think it is an issue.
> 
> This patch is trivial - all tests passed and I don't see any
> problem. I'll mark it as Ready for Committer.
> 
> The new status of this patch is: Ready for Committer

Are you sure that there is no performance impact?  Say implement a
simple SQL-callable function which does textToQualifiedNameList on the
same string N times, and then compare the time it takes to run HEAD and 
the refactored implementation.  I may buy that an extra palloc call is
not something to worry about, but in my experience it can be a problem.

This patch also changes stringToQualifiedNameList from regproc.h to
varlena.h, which would break extensions calling it.  That's not nice
either.
--
Michael




RE: Small performance tweak to run-time partition pruning

2018-10-08 Thread Imai, Yoshikazu
On Tue, Oct 9, 2018 at 2:02 AM, David Rowley wrote:
> Yeah, so subplan_map is just an array that stores the List index of
> the Append or MergeAppend's subplans. When we perform run-time pruning
> during executor initialisation, if we prune away some of these
> subplans then we don't initialise those pruned subplans at all which
> results in missing executor nodes for those plans. Instead of
> maintaining an array to store these with a bunch of NULLs in them to
> represent the pruned subnodes, for performance reasons, we make a
> gapless array and store them in there. This means that for the
> run-time pruning that we could do running actual execution
> (ExecFindMatchingSubPlans), the old subplan_map would be out of date,
> as it would contain the indexes of the subplans as if we'd not pruned
> any.  We can simply not bother adjusting the subplan_map if no further
> pruning is required. This could leave the map pointing to subplans
> that don't exist, but we only need to care about that when the map is
> actually going to be used for something. The good news is, we know in
> advance if the map will be used again.

Thanks for the detailed explanation! I understand it fully now.

> > I looked at the patch and the code seems good to me.
> > I still couldn't check the additional test cases in the patch.
> >
> >
> > BTW, when I was looking at the code, I thought there are further
> > improvements possible in ExecInitAppend, described below.
> >
> > j = i = 0;
> > firstvalid = nplans;
> > foreach(lc, node->appendplans)
> > {
> > if (bms_is_member(i, validsubplans))
> > {
> > Plan   *initNode = (Plan *) lfirst(lc);
> >
> > /*
> >  * Record the lowest appendplans index which is a 
> > valid partial
> >  * plan.
> >  */
> > if (i >= node->first_partial_plan && j < firstvalid)
> > firstvalid = j;
> >
> > appendplanstates[j++] = ExecInitNode(initNode, 
> > estate, eflags);
> > }
> > i++;
> > }
> >
> > If the number of valid subplans is small compared to node->appendplans,
> > it is a waste to check bms_is_member(i, validsubplans) for every
> > element of node->appendplans.
> > Of course, node->appendplans is a List, and we have to loop over it to
> > find the valid plans, so we can't simply use a
> > while (bms_next_member(i, validsubplans)) style loop.
> > I don't have any good idea for it now, but can we improve it?
> 
> 
> 
> I do have other ideas for that but not ready to share code yet, but it
> basically requires reimplementing List to use arrays as their
> underlying implementation. This will allow the bms_next_member() type
> loop and list_nth() will be O(1) instead of O(N).

Great!
So there might be little gain in memory usage, but we get a large
performance improvement.

> It might be possible to squeeze a bit more performance out of this
> code with the current List implementation, but it might actually
> slow performance in some cases, for example, if the only surviving
> partition was one of the last ones in the List.  Getting the final
> element with list_nth() is optimized, but if you consider a
> time-based, say monthly, RANGE partition, a DBA might maintain a
> partition ready for the next month of data, so it might be very common
> to access the 2nd last element in the list for all queries looking at
> "this months" data. In that case, list_nth(), in its current form, is
> as slow as can be.  

I understand that accessing the 2nd last element is slow in the current
List implementation, but I don't know whether your new implementation would
also be slow in that case.
I may have misunderstood the "squeeze ~ out of ~ with ~" sentence.


> You'd also need to do a bms_num_members() or
> bms_get_singleton_member(), in order to decide if the alternative
> method is going to be any faster. That call is not going to be free.

Yeah, I would need to use bms_num_members() or bms_get_singleton_member() to
check that the number of valid subplans is sufficiently smaller than the
number of all append plans, to decide whether the alternative method is
faster.

> It might also be possible to form the loop so that it calls
> bms_next_member() then store the result and loop until we reach that
> number. That would only save the bms_is_member call per loop, but I'd
> much rather do the array idea as it should also speed up plenty of
> other things.

Actually, it is possible to refactor that loop with the method you described,
but it would result in complex and hacky code just to save the wasted
iterations.

So, I also like your array idea.

--
Yoshikazu Imai



Re: pgsql: Fix event triggers for partitioned tables

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 10:39:23AM -0300, Alvaro Herrera wrote:
> Pushed now, thanks.

Thanks Alvaro for addressing the issue and back-patching.
--
Michael




Re: Function to promote standby servers

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 07:36:50PM +0200, Laurenz Albe wrote:
> The attached patch has regression tests - I thought it would be good
> to change some of the existing tests that run standby promotion to
> use the SQL function instead of pg_ctl.
> 
> I have left the name though -- as far as I can tell, "promote" has
> no other meaning in PostgreSQL than standby promotion, and I believe
> it is only good to be more verbose if that avoids confusion.

I am fine with pg_promote.

> diff --git a/src/backend/access/transam/xlog.c 
> b/src/backend/access/transam/xlog.c
> index 7375a78ffc..3a1f49e83a 100644
> --- a/src/backend/access/transam/xlog.c
> +++ b/src/backend/access/transam/xlog.c
> @@ -81,8 +81,6 @@ extern uint32 bootstrap_data_checksum_version;
>  /* File path names (all relative to $PGDATA) */
>  #define RECOVERY_COMMAND_FILE"recovery.conf"
>  #define RECOVERY_COMMAND_DONE"recovery.done"
> -#define PROMOTE_SIGNAL_FILE  "promote"
> -#define FALLBACK_PROMOTE_SIGNAL_FILE "fallback_promote"

Perhaps we could just move the whole set to xlog.h.

> +Datum
> +pg_promote(PG_FUNCTION_ARGS)
> +{
> + FILE *promote_file;
> +
> + if (!superuser())
> + ereport(ERROR,
> + (errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
> +  errmsg("must be superuser to promote standby 
> servers")));

Let's remove this restriction at code level, and instead use
function-level ACLs so that it is possible to allow non-superusers to
trigger a promotion as well, say a dedicated role.  We could add a
system role for this purpose, but I am not sure that it is worth the
addition yet.

> + while ($self->safe_psql('postgres', "SELECT pg_is_in_recovery()") != 
> 'f')
> + {
> +  sleep 1;
> + }
> + return;

This could go on infinitely, locking down buildfarm machines silently if
a commit is not careful.  What I would suggest to do instead is to not
touch PostgresNode.pm at all, and to just modify one test to trigger
promotion with the SQL function.  Then from the test, you should add a
comment that triggering promotion from SQL is wanted instead of the
internal routine, and then please add a call to poll_query_until() in
the same way as what 6deb52b2 removed.

As of now, this function returns true if promotion has been triggered,
and false if not.  However it seems to me that we should have something
more consistent with "pg_ctl promote", so there are actually three
possible states:
1) Node is in recovery, with promotion triggered.  pg_promote() returns
true in case of success in creating the trigger file.
2) Node is in recovery, with promotion attempted.  pg_promote() returns
false in case of failure in creating the trigger file.
3) Node is not in recovery, where I think a call to pg_promote() should
trigger an error.  This is not handled in the patch.

Another thing which has value is to implement a "wait" mode for this
feature, or actually a nowait mode.  You could simply do what pg_ctl
does with its logic in get_control_dbstate() by looking at the control
file state.  I think that waiting for the promotion to happen should be
the default.
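The trigger-file half of the function could look roughly like the sketch below; the function name and path handling are invented for illustration (in the real server the file is created under $PGDATA, and the wait mode would poll the control file state as described above).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/*
 * Create the promotion trigger file.  Returning false on failure, rather
 * than ignoring the error, matches the "pg_promote() returns false if the
 * trigger file could not be created" behavior discussed above.
 */
static bool
create_promote_trigger(const char *path)
{
    FILE *f = fopen(path, "w");

    if (f == NULL)
        return false;           /* caller reports false to the user */
    fclose(f);
    return true;
}
```

The nowait variant would return right after this call succeeds; the wait variant would then loop, checking whether the server has left recovery, before returning true.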

At the end this patch needs a bit more work.
--
Michael




Re: Speeding up INSERTs and UPDATEs to partitioned tables

2018-10-08 Thread Amit Langote
Hi David,

On 2018/10/05 21:55, David Rowley wrote:
> On 17 September 2018 at 21:15, David Rowley
>  wrote:
>> v9 patch attached. Fixes conflict with 6b78231d.
> 
> v10 patch attached. Fixes conflict with cc2905e9.

Thanks for rebasing.

> I'm not so sure we need to zero the partition_tuple_slots[] array at
> all since we always set a value there if there's a corresponding map
> stored. I considered pulling that out, but in the end, I didn't as I
> saw some Asserts checking it's been properly set by checking the
> element  != NULL in nodeModifyTable.c and copy.c.  Perhaps I should
> have just gotten rid of those Asserts along with the palloc0 and
> subsequent memset after the expansion of the array. I'm undecided.

Maybe it's a good thing that it's doing the same thing as with the
child_to_parent_maps array, which is to zero-init it when allocated.

Thanks,
Amit




Re: pread() and pwrite()

2018-10-08 Thread Tom Lane
Thomas Munro  writes:
> On Tue, Oct 9, 2018 at 2:55 PM Tom Lane  wrote:
>> Yeah, I've been burnt by that too recently.  It occurs to me we could make
>> that at least a little less painful if we formatted the macro with one
>> line per function name:
>> 
>> AC_CHECK_FUNCS([
>> cbrt
>> clock_gettime
>> fdatasync
>> ...
>> wcstombs_l
>> ])
>> 
>> You'd still get conflicts in configure itself, of course, but that
>> doesn't require manual work to resolve -- just re-run autoconf.

> +1, was about to suggest the same!

Sold, I'll go do it.

regards, tom lane



Re: Pluggable Storage - Andres's take

2018-10-08 Thread Haribabu Kommi
On Wed, Oct 3, 2018 at 3:16 PM Andres Freund  wrote:

> On 2018-09-27 20:03:58 -0700, Andres Freund wrote:
> > On 2018-09-28 12:21:08 +1000, Haribabu Kommi wrote:
> > > Here I attached further cleanup patches.
> > > 1. Re-arrange the GUC variable
> > > 2. Added a check function hook for default_table_access_method GUC
> >
> > Cool.
> >
> >
> > > 3. Added a new hook validate_index. I tried to change the function
> > > validate_index_heapscan to slotify, but that have many problems as it
> > > is accessing some internals of the heapscandesc structure and accessing
> > > the buffer and etc.
> >
> > Oops, I also did that locally, in a way. I also made a validate a
> > callback, as the validation logic is going to be specific to the AMs.
> > Sorry for not pushing that up earlier.  I'll try to do that soon,
> > there's a fair amount of change.
>
> I've pushed an updated version, with a fair amount of pending changes,
> and I hope all your pending (and not redundant, by our concurrent
> development), patches merged.
>

Yes, all the patches are merged.

> There's currently 3 regression test failures, that I'll look into
> tomorrow:
> - partition_prune shows a few additional Heap Blocks: exact=1 lines. I'm
>   a bit confused as to why, but haven't really investigated yet.
> - fast_default fails, because I've undone most of 7636e5c60fea83a9f3c,
>   I'll have to redo that in a different way.
> - I occasionally see failures in aggregates.sql - I've not figured out
>   what's going on there.
>

I also observed the failure of aggregates.sql, will look into it.


> Amit Khandekar said he'll publish a new version of the slot-abstraction
> patch tomorrow, so I'll rebase it onto that ASAP.
>

OK.
Here I attached two new API patches.
1. Set New Rel File node
2. Create Init fork

There is another patch, "External Relations", in the older patch set,
which is not included in the current git tree. That patch lets extensions
create external relations for their internal purposes (e.g. columnar
relations for a columnar storage engine). This new relkind can be used for
those relations; this way it provides the distinction between normal and
columnar relations. Do you have any other idea for supporting those types
of relations?

And also I want to create a new API around heap_create_with_catalog
to let the pluggable storage engine create additional relations.
This API is not required for every storage engine, so instead of turning
the entire function into an API, how about adding an API at the end of the
function that is called only when it is set, like a hook function? If the
storage engine doesn't need any of the heap_create_with_catalog
functionality, then creating a full API would be better.

Comments?

> My next planned steps are a) to try to commit parts of the
> slot-abstraction work b) to try to break out a few more pieces out of
> the large pluggable storage patch.
>

OK. Let me know your views on which pieces are stable,
so that I can separate them from the larger patch.

Regards,
Haribabu Kommi
Fujitsu Australia


0002-init-fork-API.patch
Description: Binary data


0001-New-API-setNewfilenode.patch
Description: Binary data


Re: [HACKERS] SERIALIZABLE with parallel query

2018-10-08 Thread Thomas Munro
On Tue, Oct 2, 2018 at 4:53 PM Thomas Munro
 wrote:
> Thanks for the review!  And sorry for my delayed response.  Here is a
> rebased patch, with changes as requested.

Rebased.

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Enable-parallel-query-with-SERIALIZABLE-isolatio-v16.patch
Description: Binary data


0002-Enable-the-read-only-SERIALIZABLE-optimization-f-v16.patch
Description: Binary data


Re: pread() and pwrite()

2018-10-08 Thread Thomas Munro
On Tue, Oct 9, 2018 at 2:55 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > Rebased again.  Patches that touch AC_CHECK_FUNCS are fun like that!
>
> Yeah, I've been burnt by that too recently.  It occurs to me we could make
> that at least a little less painful if we formatted the macro with one
> line per function name:
>
> AC_CHECK_FUNCS([
> cbrt
> clock_gettime
> fdatasync
> ...
> wcstombs_l
> ])
>
> You'd still get conflicts in configure itself, of course, but that
> doesn't require manual work to resolve -- just re-run autoconf.

+1, was about to suggest the same!

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Function for listing archive_status directory

2018-10-08 Thread Michael Paquier
On Tue, Oct 09, 2018 at 02:14:52AM +, Iwata, Aya wrote:
> Sorry, I made a mistake. Your patch currently does not apply. Kindly
> rebase the patch.  I'm marking it as "Waiting on Author".

Thanks Iwata-san.  I was just trying to apply the patch but it failed so
the new status is fine.  On top of taking care of the rebase, please
make sure of the following:
- Calling pg_ls_dir_files() with missing_ok set to true.
- Renaming pg_ls_archive_status to pg_ls_archive_statusdir.
We have pretty nice consistency in the names of such functions, as they
all end with *dir, which makes it easy to spot all the functions in this
category with a lookup like "\df *dir".

+last modified time (mtime) of each file in the write ahead log (WAL)
+archive_status directory. By default only superusers
Here I would mention pg_wal/archive_status.

No need for a catalog bump in what you submit on this thread, this is
taken care of by committers.
--
Michael




Re: DSM segment handle generation in background workers

2018-10-08 Thread Thomas Munro
On Tue, Oct 9, 2018 at 1:53 PM Tom Lane  wrote:
> Thomas Munro  writes:
> > On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro
> >  wrote:
> >> That's because the bgworker startup path doesn't contain a call to
> >> srandom(...distinguishing stuff...), unlike BackendRun().  I suppose
> >> do_start_bgworker() could gain a similar call... or perhaps that call
> >> should be moved into InitPostmasterChild().  If we put it in there
> >> right after MyStartTime is assigned a new value, we could use the same
> >> incantation that PostmasterMain() uses.
>
> > Maybe something like this?
>
> I think the bit with
>
> #ifndef HAVE_STRONG_RANDOM
> random_seed = 0;
> random_start_time.tv_usec = 0;
> #endif
>
> should also be done in every postmaster child, no?  If we want to hide the
> postmaster's state from child processes, we ought to hide it from all.

Ok, here is a sketch patch with a new function InitRandomSeeds() to do
that.  It is called from PostmasterMain(), InitPostmasterChild() and
InitStandaloneProcess() for consistency.
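The per-child seeding idea can be pictured with a small sketch; the function name and the exact mixing formula below are illustrative only, in the spirit of the srandom() call in PostmasterMain() that combines the process pid and start time.

```c
#include <assert.h>

/*
 * Derive a per-process random seed by mixing the child's pid with its
 * start time, so two backends (or a backend and a bgworker) started
 * close together do not share a seed.  Illustrative formula only.
 */
static unsigned int
make_process_seed(int pid, long start_time)
{
    return (unsigned int) pid ^ ((unsigned int) start_time << 12);
}
```

The point is simply that every postmaster child, bgworkers included, should call something like srandom(make_process_seed(...)) on startup instead of inheriting the postmaster's generator state.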

It seems a bit strange to me that internal infrastructure shares a
random number generator with SQL-callable functions random() and
setseed(), though I'm not saying it's harmful.

While wondering if something like this should be back-patched, I
noticed that SQL random() is declared as parallel-restricted, which is
good: it means we aren't exposing a random() function that generates
the same values in every parallel worker.  So I think this is probably
just a minor nuisance and should probably only be pushed to master, or
at most to 11 (since Parallel Hash likes to create DSM segments in
workers), unless someone can think of a more serious way this can hurt
you.

(Tangentially:  I suppose it might be useful to have a different SQL
random number function that is parallel safe, that isn't associated
with a user-controllable seed, and whose seed is different in each
backend.)

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Make-sure-we-initialize-random-seeds-per-backend.patch
Description: Binary data


RE: Function for listing archive_status directory

2018-10-08 Thread Iwata, Aya
> I didn't find any problems with the patch, so I'm marking it as "Ready for
> Committer".

Sorry, I made a mistake. Your patch currently does not apply. Kindly rebase the 
patch.
I'm marking it as "Waiting on Author".

Regards,
Aya Iwata


Re: Small performance tweak to run-time partition pruning

2018-10-08 Thread David Rowley
On 9 October 2018 at 14:23, Imai, Yoshikazu
 wrote:
> I checked the code and I think so too.
>
> To confirm my understanding, and for more information for others:
>
> The subplan map is used when we call ExecFindInitialMatchingSubPlans or
> ExecFindMatchingSubPlans.
> ExecFindInitialMatchingSubPlans is called by ExecInit(Merge)Append.
> ExecFindMatchingSubPlans is called by the four functions below, which are
> executed after ExecInit(Merge)Append, and it is called when
> as_valid_subplans (or ms_valid_subplans) is not NULL.

Thanks for looking at this.

Yeah, so subplan_map is just an array that stores the List index of
the Append or MergeAppend's subplans. When we perform run-time pruning
during executor initialisation, if we prune away some of these
subplans then we don't initialise those pruned subplans at all which
results in missing executor nodes for those plans. Instead of
maintaining an array to store these with a bunch of NULLs in them to
represent the pruned subnodes, for performance reasons, we make a
gapless array and store them in there. This means that for the
run-time pruning that we could do running actual execution
(ExecFindMatchingSubPlans), the old subplan_map would be out of date,
as it would contain the indexes of the subplans as if we'd not pruned
any.  We can simply not bother adjusting the subplan_map if no further
pruning is required. This could leave the map pointing to subplans
that don't exist, but we only need to care about that when the map is
actually going to be used for something. The good news is, we know in
advance if the map will be used again.
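The re-sequencing described above can be sketched in a few lines; the fixed plan count, the function name, and the use of a plain bool array in place of a Bitmapset are all simplifications for illustration.

```c
#include <assert.h>
#include <stdbool.h>

#define NPLANS 5                /* illustrative subplan count */

/*
 * After initial pruning, surviving subplans are packed into a gapless
 * array.  subplan_map[i] is updated to the new position of original
 * subplan i, or -1 if that subplan was pruned and never initialized.
 * Returns the length of the gapless array.
 */
static int
resequence_subplan_map(const bool valid[NPLANS], int subplan_map[NPLANS])
{
    int nvalid = 0;

    for (int i = 0; i < NPLANS; i++)
        subplan_map[i] = valid[i] ? nvalid++ : -1;
    return nvalid;
}
```

Skipping this step when no further run-time pruning will happen is safe precisely because the stale map is then never consulted.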

> I looked at the patch and the code seems good to me.
> I still couldn't check the additional test cases in the patch.
>
>
> BTW, when I was looking at the code, I thought there are further
> improvements possible in ExecInitAppend, described below.
>
> j = i = 0;
> firstvalid = nplans;
> foreach(lc, node->appendplans)
> {
> if (bms_is_member(i, validsubplans))
> {
> Plan   *initNode = (Plan *) lfirst(lc);
>
> /*
>  * Record the lowest appendplans index which is a 
> valid partial
>  * plan.
>  */
> if (i >= node->first_partial_plan && j < firstvalid)
> firstvalid = j;
>
> appendplanstates[j++] = ExecInitNode(initNode, 
> estate, eflags);
> }
> i++;
> }
>
> If the number of valid subplans is small compared to node->appendplans,
> it is a waste to check bms_is_member(i, validsubplans) for every
> element of node->appendplans.
> Of course, node->appendplans is a List, and we have to loop over it to
> find the valid plans, so we can't simply use a
> while (bms_next_member(i, validsubplans)) style loop.
> I don't have any good idea for it now, but can we improve it?

I do have other ideas for that but not ready to share code yet, but it
basically requires reimplementing List to use arrays as their
underlying implementation. This will allow the bms_next_member() type
loop and list_nth() will be O(1) instead of O(N).
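The array-backed idea can be sketched minimally as follows; this toy holds integers only (the real List holds pointers and supports much more), and all names here are invented for the example.

```c
#include <assert.h>
#include <stdlib.h>

/*
 * Sketch of an array-backed List: because the elements live in one
 * contiguous allocation, list_nth()-style access is a direct index,
 * O(1) for any position, including the "second to last" case.
 */
typedef struct ArrayList
{
    int  length;
    int *elements;              /* contiguous storage */
} ArrayList;

static ArrayList *
array_list_make(const int *src, int n)
{
    ArrayList *list = malloc(sizeof(ArrayList));

    list->length = n;
    list->elements = malloc(n * sizeof(int));
    for (int i = 0; i < n; i++)
        list->elements[i] = src[i];
    return list;
}

static int
array_list_nth(const ArrayList *list, int n)
{
    return list->elements[n];   /* O(1), no cell walking */
}

static void
array_list_free(ArrayList *list)
{
    free(list->elements);
    free(list);
}
```

With this representation, iterating only the members of a Bitmapset and fetching each matching element is cheap, which is what makes the bms_next_member() style loop attractive.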

It might be possible to squeeze a bit more performance out of this
code with the current List implementation, but it might actually
slow performance in some cases, for example, if the only surviving
partition was one of the last ones in the List.  Getting the final
element with list_nth() is optimized, but if you consider a
time-based, say monthly, RANGE partition, a DBA might maintain a
partition ready for the next month of data, so it might be very common
to access the 2nd last element in the list for all queries looking at
"this months" data. In that case, list_nth(), in its current form, is
as slow as can be.  You'd also need to do a bms_num_members() or
bms_get_singleton_member(), in order to decide if the alternative
method is going to be any faster. That call is not going to be free.

It might also be possible to form the loop so that it calls
bms_next_member() then store the result and loop until we reach that
number. That would only save the bms_is_member call per loop, but I'd
much rather do the array idea as it should also speed up plenty of
other things.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: pread() and pwrite()

2018-10-08 Thread Tom Lane
Thomas Munro  writes:
> Rebased again.  Patches that touch AC_CHECK_FUNCS are fun like that!

Yeah, I've been burnt by that too recently.  It occurs to me we could make
that at least a little less painful if we formatted the macro with one
line per function name:

AC_CHECK_FUNCS([
cbrt
clock_gettime
fdatasync
...
wcstombs_l
])

You'd still get conflicts in configure itself, of course, but that
doesn't require manual work to resolve -- just re-run autoconf.

regards, tom lane



Re: Postgres 11 release notes

2018-10-08 Thread Bruce Momjian
On Mon, Oct  8, 2018 at 05:44:32PM +0200, Adrien NAYRAT wrote:
> On 10/8/18 5:20 PM, Bruce Momjian wrote:
> >Uh, where are you looking?  We don't rebuild the beta docs until the
> >next beta release, but you can see the changes in the developer docs:
> >
> > https://www.postgresql.org/docs/devel/static/release-11.html
> 
> I looked in doc/src/sgml/release-11.sgml
> 
> And even on https://www.postgresql.org/docs/devel/static/release-11.html I
> don't see mention of 1f6d515a67

I did not think there was sufficient interest in adding this item.  If
we add it we need to discuss adding all similar items.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



RE: Small performance tweak to run-time partition pruning

2018-10-08 Thread Imai, Yoshikazu
Hi, David.
Thanks for the patch!

On Mon, Oct 8, 2018 at 1:00 AM, David Rowley wrote:
> I was looking at this again and I realised that we can completely skip
> the re-sequence of the subplan map when we're not going to perform any
> further pruning during execution.

I checked the code and I think so too.

To confirm my understanding, and for more information for others:

The subplan map is used when we call ExecFindInitialMatchingSubPlans or
ExecFindMatchingSubPlans.
ExecFindInitialMatchingSubPlans is called by ExecInit(Merge)Append.
ExecFindMatchingSubPlans is called by the four functions below, which are
executed after ExecInit(Merge)Append, and it is called when
as_valid_subplans (or ms_valid_subplans) is not NULL.

1. choose_next_subplan_locally(AppendState *node)
2. choose_next_subplan_for_leader(AppendState *node)
3. choose_next_subplan_for_worker(AppendState *node)
4. ExecMergeAppend(PlanState *pstate)

The as_valid_subplans(or ms_valid_subplans) is set in ExecInit(Merge)Append
if pruning during execution is not required.

That's why we can completely skip the re-sequence of the subplan map 
when we're not going to perform any further pruning during execution.


> I've attached an updated patch which skips the re-sequence work when
> doing that is not required for anything.

I looked at the patch and the code seems good to me.
I still couldn't check the additional test cases in the patch.


BTW, when I was looking at the code, I thought there are further
improvements possible in ExecInitAppend, described below.

j = i = 0;
firstvalid = nplans;
foreach(lc, node->appendplans)
{
if (bms_is_member(i, validsubplans))
{
Plan   *initNode = (Plan *) lfirst(lc);

/*
 * Record the lowest appendplans index which is a valid 
partial
 * plan.
 */
if (i >= node->first_partial_plan && j < firstvalid)
firstvalid = j;

appendplanstates[j++] = ExecInitNode(initNode, estate, 
eflags);
}
i++;
}

If the number of valid subplans is small compared to node->appendplans,
it is wasteful to check bms_is_member(i, validsubplans) for every element
of node->appendplans. Of course, node->appendplans is a List, so we have
to walk it anyway to reach a given valid plan; that means we can't simply
loop with bms_next_member(validsubplans, i). I don't have a good idea for
this yet, but can we improve it?
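As an illustration only (not the actual executor code), here is a self-contained C sketch of the difference: with next_member standing in for bms_next_member and a plain uint64_t standing in for a Bitmapset, the second loop visits only the set bits instead of testing membership for every plan index. Note that this sketch does not solve the remaining problem above, namely that node->appendplans is a List which must still be walked to reach the plan at a given index.

```c
#include <assert.h>
#include <stdint.h>

/* A single 64-bit word standing in for a Bitmapset (illustration only). */
typedef uint64_t bitmask;

/*
 * Return the smallest member greater than prev, or -1 if none
 * (playing the role of bms_next_member).
 */
int
next_member(bitmask set, int prev)
{
	for (int i = prev + 1; i < 64; i++)
		if (set & (UINT64_C(1) << i))
			return i;
	return -1;
}

/*
 * Visit only the valid plan indexes, rather than testing
 * bms_is_member() for every index from 0 .. nplans-1.
 */
int
collect_valid(bitmask valid, int out[], int maxout)
{
	int			n = 0;

	for (int i = next_member(valid, -1);
		 i >= 0 && n < maxout;
		 i = next_member(valid, i))
		out[n++] = i;
	return n;
}
```

With a sparse set the loop body runs once per member instead of once per plan, which is the saving being discussed.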


--
Yoshikazu Imai



Re: pread() and pwrite()

2018-10-08 Thread Thomas Munro
On Fri, Sep 28, 2018 at 2:03 AM Jesper Pedersen
 wrote:
> Thanks for v5 too.

Rebased again.  Patches that touch AC_CHECK_FUNCS are fun like that!

-- 
Thomas Munro
http://www.enterprisedb.com


0001-Use-pread-pwrite-instead-of-lseek-read-write-v6.patch
Description: Binary data


Re: DSM segment handle generation in background workers

2018-10-08 Thread Tom Lane
Thomas Munro  writes:
> On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro
>  wrote:
>> That's because the bgworker startup path doesn't contain a call to
>> srandom(...distinguishing stuff...), unlike BackendRun().  I suppose
>> do_start_bgworker() could gain a similar call... or perhaps that call
>> should be moved into InitPostmasterChild().  If we put it in there
>> right after MyStartTime is assigned a new value, we could use the same
>> incantation that PostmasterMain() uses.

> Maybe something like this?

I think the bit with

#ifndef HAVE_STRONG_RANDOM
random_seed = 0;
random_start_time.tv_usec = 0;
#endif

should also be done in every postmaster child, no?  If we want to hide the
postmaster's state from child processes, we ought to hide it from all.
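For illustration, the kind of per-child seeding being discussed can be sketched as below; the exact mixing expression PostgreSQL uses (the "incantation" in PostmasterMain) may differ, so treat the shift amount and operands as assumptions.

```c
#include <assert.h>

/*
 * Hypothetical seed derivation for a freshly forked child: mix the
 * child's PID with its start time, shifted so the two values land in
 * different bit ranges.  Purely illustrative -- not the actual
 * PostmasterMain() expression.
 */
unsigned int
child_random_seed(unsigned int pid, unsigned int start_time)
{
	return pid ^ (start_time << 12);
}
```

The point is simply that two children forked in the same second still get distinct seeds because their PIDs differ, and the same PID reused later gets a different seed because the start time differs.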

regards, tom lane



RE: Function for listing archive_status directory

2018-10-08 Thread Iwata, Aya
Hi Christoph,

> > All similar function are named pg_ls_***dir. It is clear these
> > functions return directory contents information.
> > If the new function intends to display the contents of the directory,
> > pg_ls_***dir style might be better (e.g. pg_ls_archive_statusdir).
> > But everyone know archive_status is a directory...
> > If you want to follow the standard naming, then you may add the dir.
> 
> I conciously omitted the "_dir" suffix - I'm not a great fan of long function
> names, and we want to inspect the contents of archive_status to find out about
> the status of the archiving process. But then, my main concern is the
> functionality, not the name, we can easily change the name. Is there any other
> opinion pro/contra the name?

I understand the reason why you decided on that name, and I agree with
your opinion.

This function is useful for checking the status of WAL archiving.
I didn't find any problems with the patch, so I'm marking it as "Ready for
Committer".

Regards,
Aya Iwata


Re: Allowing printf("%m") only where it actually works

2018-10-08 Thread Tom Lane
So, circling back to the very beginning of this thread where I worried
about all the compile warnings we get on NetBSD-current, I'm pleased
to report that HEAD compiles warning-free so long as you define
PG_PRINTF_ATTRIBUTE to "__syslog__" rather than "gnu_printf".

So attached is a proposed patch to make configure check whether %m
works without a warning, and try "__syslog__" if "gnu_printf" does
not work for that.  (I did it in that order so that if NetBSD get
their heads screwed back on straight and stop complaining about
perfectly GNU-compliant code, we'll go back to selecting "gnu_printf".)

Any objections?

regards, tom lane

diff --git a/config/c-compiler.m4 b/config/c-compiler.m4
index fb58c94..af2dea1 100644
*** a/config/c-compiler.m4
--- b/config/c-compiler.m4
*** fi])# PGAC_C_SIGNED
*** 21,41 
  # ---
  # Select the format archetype to be used by gcc to check printf-type functions.
  # We prefer "gnu_printf", as that most closely matches the features supported
! # by src/port/snprintf.c (particularly the %m conversion spec).
  AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
  [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
  [ac_save_c_werror_flag=$ac_c_werror_flag
  ac_c_werror_flag=yes
  AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
! [extern int
! pgac_write(int ignore, const char *fmt,...)
! __attribute__((format(gnu_printf, 2, 3)));], [])],
!   [pgac_cv_printf_archetype=gnu_printf],
!   [pgac_cv_printf_archetype=printf])
! ac_c_werror_flag=$ac_save_c_werror_flag])
! AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
![Define to gnu_printf if compiler supports it, else printf.])
! ])# PGAC_PRINTF_ARCHETYPE
  
  
  # PGAC_TYPE_64BIT_INT(TYPE)
--- 21,56 
  # ---
  # Select the format archetype to be used by gcc to check printf-type functions.
  # We prefer "gnu_printf", as that most closely matches the features supported
! # by src/port/snprintf.c (particularly the %m conversion spec).  However,
! # on some NetBSD versions, that doesn't work while "__syslog__" does.
! # If all else fails, use "printf".
  AC_DEFUN([PGAC_PRINTF_ARCHETYPE],
  [AC_CACHE_CHECK([for printf format archetype], pgac_cv_printf_archetype,
+ [pgac_cv_printf_archetype=gnu_printf
+ PGAC_TEST_PRINTF_ARCHETYPE
+ if [[ "$ac_archetype_ok" = no ]]; then
+   pgac_cv_printf_archetype=__syslog__
+   PGAC_TEST_PRINTF_ARCHETYPE
+   if [[ "$ac_archetype_ok" = no ]]; then
+ pgac_cv_printf_archetype=printf
+   fi
+ fi])
+ AC_DEFINE_UNQUOTED([PG_PRINTF_ATTRIBUTE], [$pgac_cv_printf_archetype],
+ [Define to best printf format archetype, usually gnu_printf if available.])
+ ])# PGAC_PRINTF_ARCHETYPE
+ 
+ # Subroutine: test $pgac_cv_printf_archetype, set $ac_archetype_ok to yes or no
+ AC_DEFUN([PGAC_TEST_PRINTF_ARCHETYPE],
  [ac_save_c_werror_flag=$ac_c_werror_flag
  ac_c_werror_flag=yes
  AC_COMPILE_IFELSE([AC_LANG_PROGRAM(
! [extern void pgac_write(int ignore, const char *fmt,...)
! __attribute__((format($pgac_cv_printf_archetype, 2, 3)));],
! [pgac_write(0, "error %s: %m", "foo");])],
!   [ac_archetype_ok=yes],
!   [ac_archetype_ok=no])
! ac_c_werror_flag=$ac_save_c_werror_flag
! ])# PGAC_TEST_PRINTF_ARCHETYPE
  
  
  # PGAC_TYPE_64BIT_INT(TYPE)
diff --git a/configure b/configure
index 0448c6b..b7250d7 100755
*** a/configure
--- b/configure
*** $as_echo_n "checking for printf format a
*** 13583,13610 
  if ${pgac_cv_printf_archetype+:} false; then :
$as_echo_n "(cached) " >&6
  else
!   ac_save_c_werror_flag=$ac_c_werror_flag
  ac_c_werror_flag=yes
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
! extern int
! pgac_write(int ignore, const char *fmt,...)
! __attribute__((format(gnu_printf, 2, 3)));
  int
  main ()
  {
  
;
return 0;
  }
  _ACEOF
  if ac_fn_c_try_compile "$LINENO"; then :
!   pgac_cv_printf_archetype=gnu_printf
  else
!   pgac_cv_printf_archetype=printf
  fi
  rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
  ac_c_werror_flag=$ac_save_c_werror_flag
  fi
  { $as_echo "$as_me:${as_lineno-$LINENO}: result: $pgac_cv_printf_archetype" >&5
  $as_echo "$pgac_cv_printf_archetype" >&6; }
--- 13583,13639 
  if ${pgac_cv_printf_archetype+:} false; then :
$as_echo_n "(cached) " >&6
  else
!   pgac_cv_printf_archetype=gnu_printf
! ac_save_c_werror_flag=$ac_c_werror_flag
  ac_c_werror_flag=yes
  cat confdefs.h - <<_ACEOF >conftest.$ac_ext
  /* end confdefs.h.  */
! extern void pgac_write(int ignore, const char *fmt,...)
! __attribute__((format($pgac_cv_printf_archetype, 2, 3)));
  int
  main ()
  {
+ pgac_write(0, "error %s: %m", "foo");
+   ;
+   return 0;
+ }
+ _ACEOF
+ if ac_fn_c_try_compile "$LINENO"; then :
+   ac_archetype_ok=yes
+ else
+   ac_archetype_ok=no
+ fi
+ rm -f core conftest.err conftest.$ac_objext conftest.$ac_ext
+ ac_c_werror_flag=$ac_save_c_werror_flag
  

Re: DSM segment handle generation in background workers

2018-10-08 Thread Thomas Munro
On Mon, Oct 8, 2018 at 1:17 AM Thomas Munro
 wrote:
> That's because the bgworker startup path doesn't contain a call to
> srandom(...distinguishing stuff...), unlike BackendRun().  I suppose
> do_start_bgworker() could gain a similar call... or perhaps that call
> should be moved into InitPostmasterChild().  If we put it in there
> right after MyStartTime is assigned a new value, we could use the same
> incantation that PostmasterMain() uses.
>
> I noticed that the comment in PostmasterMain() refers to
> PostmasterRandom(), which is gone.

Maybe something like this?

-- 
Thomas Munro
http://www.enterprisedb.com


move-srandom.patch
Description: Binary data


Re: Relax transactional restrictions on ALTER ENUM ... ADD TYPE (redux)

2018-10-08 Thread Thomas Munro
On Sun, Oct 7, 2018 at 3:30 PM Thomas Munro
 wrote:
> On Sun, Oct 7, 2018 at 5:53 AM Tom Lane  wrote:
> > Thomas Munro  writes:
> > > Thanks.  Here is a version squashed into one commit, with a decent
> > > commit message and a small improvement: the code to create the hash
> > > table is moved into a static function, to avoid repetition.  I will
> > > push this to master early next week, unless there is more feedback or
> > > one of you would prefer to do that.
> >
> > Nitpicky gripes:
>
> Thanks for the review.

Pushed.

-- 
Thomas Munro
http://www.enterprisedb.com



Re: Partial index plan/cardinality costing

2018-10-08 Thread James Coleman
Bump, and curious if anyone on hackers has any ideas here: of particular
interest is why the (pk, created_at) index can possibly be more valuable
than the (created_at, pk) variant since the former effectively implies
having to scan the entire index.
On Fri, Sep 7, 2018 at 12:17 PM James Coleman  wrote:

> I have the following tables:
> - m(pk bigserial primary key, status text): with a single row
> - s(pk bigserial primary key, status text, action_at date, m_fk bigint):
>   * 80% of the data has action_at between the current date and 1 year ago
>  and status of E or C
>   * 20% of the data has action_at between 5 days ago and 25 days into the
>  future and status of P, PD, or A
>
> I have two partial indexes:
> - s_pk_action_at on s(pk, action_at) where status in ('P', 'PD', 'A')
> - s_action_at_pk on s(action_at, pk) where status in ('P', 'PD', 'A')
>
> With the query:
> SELECT s.pk FROM s
> INNER JOIN m ON m.pk = s.m_fk
> WHERE
>   s.status IN ('A', 'PD', 'P')
>   AND (action_at <= '2018-09-06')
>   AND s.status IN ('A', 'P')
>   AND m.status = 'A';
>
> I generally expect the index s_action_at_pk to always be preferred over
> s_pk_action_at. And on stock Postgres it does in fact use that index (with
> a bitmap index scan).
>
> We like to set random_page_cost = 2 since we use fast SSDs only. With that
> change Postgres strongly prefers the index s_pk_action_at unless I both
> disable the other index and turn off bitmap heap scans.
>
> I'm attaching the following plans:
> - base_plan.txt: default costs; both indexes available
> - base_plan_rpc2.txt: random_page_cost = 2; both indexes available
> - inddisabled_plan_rpc2.txt: random_page_cost = 2; only s_action_at_pk
> available
> - inddisabled_bhsoff_plan_rpc2.txt: random_page_cost = 2;
> enable_bitmapscan = false;  only s_action_at_pk available
>
> A couple of questions:
> - How is s_pk_action_at ever efficient to scan? Given that the highest
> cardinality (primary key) column is first, wouldn't an index scan
> effectively have to scan the entire index?
> - Why does index scan on s_action_at_pk reads over 2x as many blocks as
> the bitmap heap scan with the same index?
> - Would you expect Postgres to generally always prefer using the
> s_action_at_pk index over the s_pk_action_at index for this query? I
> realize changing the random page cost is part of what's driving this, but I
> still can't imagine reading the full s_pk_action_at index (assuming that's
> what it is doing) could ever be more valuable.
>
> As a side note, the planner is very bad at understanding a query that
> happens (I realize you wouldn't write this by hand, but ORMs) when you have
> a where clause like:
> s.status IN ('A', 'PD', 'P') AND s.status IN ('A', 'P')
> the row estimates are significantly different from a where clause with
> only:
> s.status IN ('A', 'P')
> even though semantically those are identical.
>
>
>
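The redundant-IN-clause estimation issue noted just above can be illustrated numerically: absent logic to detect that one IN list subsumes the other, a planner that treats the clauses as independent multiplies their selectivities, which badly underestimates the true selectivity. A toy C sketch with made-up per-value frequencies (illustrative only; 'D' stands in for the two-character status 'PD'):

```c
#include <assert.h>
#include <math.h>

/* Made-up MCV-style frequency for each status value (illustration only). */
double
freq(char status)
{
	switch (status)
	{
		case 'A':
			return 0.10;
		case 'P':
			return 0.07;
		case 'D':
			return 0.03;
		default:
			return 0.0;
	}
}

/* Selectivity of "status IN (<vals>)": sum of per-value frequencies. */
double
in_list_sel(const char *vals)
{
	double		s = 0.0;

	for (; *vals; vals++)
		s += freq(*vals);
	return s;
}
```

Here in_list_sel("AP") is 0.17, but multiplying it by in_list_sel("APD") = 0.20 as if the two clauses were independent yields 0.034 -- a 5x underestimate, even though the conjunction is semantically just status IN ('A', 'P').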


Re: [HACKERS] [WIP] Zipfian distribution in pgbench

2018-10-08 Thread Peter Geoghegan
On Fri, Jul 7, 2017 at 12:45 AM Alik Khilazhev
 wrote:
> PostgreSQL shows very bad results in YCSB Workload A (50% SELECT and 50% 
> UPDATE of random row by PK) on benchmarking with big number of clients using 
> Zipfian distribution. MySQL also has decline but it is not significant as it 
> is in PostgreSQL. MongoDB does not have decline at all. And if pgbench would 
> have Zipfian distribution random number generator, everyone will be able to 
> make research on this topic without using YCSB.

I've been thinking about this again, in light of my recent work to
improve the B-Tree code by making all entries have fully unique keys,
adding suffix truncation, and making better choices about split points
[1].

Within the last year, somebody posted a patch that fixed the issue by
making the heavyweight lock manager lock a value rather than a heap
TID. I can't find that thread right now -- perhaps someone can point
it out. I suspect that my patch will have a similar effect to this other
patch I'm thinking of with the Zipfian workload, though it will do so
without touching anything outside of the B-Tree code, and without
changing the user-visible semantics of locking.

Can we possibly find a way to rerun this Zipfian experiment/test case?
I bet Alexander's recent B-Tree patch will also help.

Here is why I suspect my patch will help with the Zipfian workload,
particularly on a multi-socket machine:

The logic for following downlinks in Postgres is that downlinks are a
lower bound, but we still cannot follow them when the key is equal to
our insertion scankey -- we must go left, even when we have all keys
filled out. This means that we sometimes end up holding an exclusive
buffer lock on "the first leaf page the value could be on", even
though that page doesn't have any such values (they're all in later
pages, to the left). So we accidentally lock-out insertions of the
value 1 when we insert the value 2, and of the value 2 when we insert
the value 3, and of the value 3 when we insert the value 4, and so on
down the line. This is the worst possible thing for the Zipfian test
case, I think, since most of the contention is artificial.

I think that my patch may ameliorate the problem, because:

* We make the tie-breaker heap TID attribute have DESC sort order, so
generally speaking contention starts on the leftmost page among leaf
pages full of duplicates -- for reads, and for writes.

* The patch works very hard to make sure that large groups of
duplicates are not spread across pages. There would be no mixing of
leaf pages containing the value 1 and the value 2 with the Zipfian
test case, for example. Old duplicates would be packed into full
pages.

* We can invent a fake lower-than-any-real-value heap TID in the
insertion scankey in certain cases that don't have one. This way, the
scan key as a whole is *not* equal to pivot tuples that are at least
equal on non-heap-TID attributes, so we *can* follow a downlink that
is "equal" to our scankey, provided it has a truncated-away heap TID
attribute value.

In short, the artificial "false sharing" that we see in the B-Tree for
the Zipfian test case may go away. There will be *no* contention
between an inserter (or really UPDATE-er) that inserts the value 2
with concurrent inserters of the value 1. All of the ways in which
that was likely to happen are each fixed by the patch. There is still
buffer lock contention on heap pages, and the contention of waiting
for a row lock. Maybe the busy waiting will automatically be fixed,
though, since you have one point of contention for each of the really
hot values at the far left of the leaf level.

[1] https://commitfest.postgresql.org/20/1787/
--
Peter Geoghegan



Re: transction_timestamp() inside of procedures

2018-10-08 Thread Tom Lane
Andres Freund  writes:
> On October 8, 2018 10:14:34 AM PDT, Tom Lane  wrote:
>> Surely there is some way that we can directly test whether we're inside
>> a procedure or not?  I think the logic should be basically
>> ...

> Seems more reasonable from here.

We are rapidly running out of time to get this fixed before RC1.
In the interests of getting across the finish line, I took a look
around, and found that indeed there does not seem to be any exported
way to detect whether we're inside a procedure or not.  Which seems
pretty darn dumb from here.

The best way to determine that seems to be to check that the SPI
stack is (a) nonempty and (b) has a "nonatomic" topmost entry.

Barring objections, I'm going to make a quick hack that adds a SPI
entry point along the lines of "bool SPI_inside_nonatomic_context(void)"
to do that test, adapt the xact.c code as I said upthread, and commit
with Peter's regression test case.

regards, tom lane



Percona is Seeking a PostgreSQL Consultant [North AMER based]

2018-10-08 Thread Jennifer Miller
Hi Everyone!

At Percona, we are seeking a remote (work from home) Consultant.  The
PostgreSQL Consultant is an individual who is focused on providing the
highest quality technical advice to our Percona customers. You’re the face
of Percona to our customers, and their success in achieving their goals, is
your success.

Percona has been in business since 2006.  We provide Software, Support,
Consulting, and Managed Services to large, well-known global brands such as
Cisco Systems, Time Warner Cable, Alcatel-Lucent, Rent the Runway and the
BBC, as well as smaller enterprises looking to maximize application
performance while streamlining database efficiencies. Well established as
thought leaders, Percona experts author content for the Percona Database
Performance Blog and the Percona Live Open Source Database Conferences draw
attendees and expert technical speakers from around the world.

 We offer flexible hours, competitive salaries and a great experience of
working on a multinational team of experts.

This position reports to a Managing Consultant. You will work both from
customer premises and remotely, with customer onsite work estimated at
around 50% of the time. Most people work from their homes when not onsite.
Travel will be required, both for onsite projects and internal meetings.
Access to a reliable high-speed Internet connection is required. This
position is for the US, and the consultant must be able to work full-time
during US business hours.

Core Job Duties:

   - PostgreSQL, MySQL and Linux systems consulting
   - Professional Development  - Skills, knowledge and processes
   - Marketing activities - blogging, conference attendance, etc
   - Administrative Work

Technical Specific Requirements:

   - 5+ Years PostgreSQL Administration
   - Detailed understanding of the PostgreSQL Architecture and internals.
   - Understanding and working knowledge of various extensions in PostgreSQL
   - 3+ Years MySQL Administration
   - Understanding of replication and high availability tools and methods.
   Including knowledge on High Availability and Connection Pooling & Load
   Balancing PostgreSQL workloads.
   - Query tuning
   - PostgreSQL Parameter Tuning.
   - Must have a detailed understanding on tuning Memory parameters in
   PostgreSQL.
   - PostgreSQL Performance Analysis.
   - Problem-solving skills
   - Basic understanding of InnoDB
   - Knowledge of plpgsql advantageous
   - Strong shell skills
   - Strong ability to troubleshoot in a linux environment


*See full description and apply here.*
Website:  percona.com

Warm regards,
Jennifer

*Jennifer Miller*
Talent Acquisition Specialist, Percona
Skype: percona.jmiller
Email: jennifer.mil...@percona.com
Phone:  +1 888-401-3401 ext. 046
Standard Hours: Monday - Friday, 8:00AM - 5:00PM Eastern Time Zone, USA


*Interested in attending Percona Live Europe? Find out more here!*
*Sponsorship opportunities can be found here.*


Removing variant expected-output files for float-output differences

2018-10-08 Thread Tom Lane
In the wake of commit 6eb3eb577, I believe we have no remaining buildfarm
animals that don't handle minus zero per spec.  gaur is the only one that
was failing on the minus-zero-dependent geometry test cases introduced by
a3d284485, and I've already verified that this makes it pass again.

I think therefore that we ought to remove the variant regression test
output files that are there only to cater for failing to print minus zero
as such.  AFAICT the ones meeting that description are numerology_1.out
and float8-small-is-zero_1.out.  If we keep them, we're morally obliged
to also cater for no-minus-zero in the geometry tests, and I don't think
we really want to, especially if we have no way to verify the variant
file.

(It might also be interesting to see what happens if we remove geo_ops.c's
hacks to avoid minus zero results.  But that's not what I'm on about
today.)

Also, we have quite a few variant expected-files that exist only to cater
for Windows' habit of printing three exponent digits where everybody else
prints just two.  It struck me that it would not be hard, or expensive,
to undo that choice in snprintf.c (see attached untested patch).  So we
could considerably reduce future maintenance pain for the affected tests
by getting rid of those files.

As against that, Windows users might possibly complain that float output
looks different than they're used to.  I'm not sure how much sympathy
I have for that position.  If we reimplement float output for more speed,
as is under discussion in nearby threads, I doubt we'd trouble to preserve
this Windows-ism in the rewrite.

Comments?

regards, tom lane

diff --git a/src/port/snprintf.c b/src/port/snprintf.c
index 58300ea..0eb70c2 100644
--- a/src/port/snprintf.c
+++ b/src/port/snprintf.c
@@ -1173,6 +1173,22 @@ fmtfloat(double value, char type, int forcesign, int leftjust,
 		}
 		if (vallen < 0)
 			goto fail;
+
+		/*
+		 * Windows, alone among our supported platforms, likes to emit
+		 * three-digit exponent fields even when two digits would do.  Hack
+		 * such results to look like the way everyone else does it.
+		 */
+#ifdef WIN32
+		if (vallen >= 6 &&
+			convert[vallen - 5] == 'e' &&
+			convert[vallen - 3] == '0')
+		{
+			convert[vallen - 3] = convert[vallen - 2];
+			convert[vallen - 2] = convert[vallen - 1];
+			vallen--;
+		}
+#endif
 	}
 
 	padlen = compute_padlen(minlen, vallen + zeropadlen, leftjust);
@@ -1290,6 +1306,17 @@ pg_strfromd(char *str, size_t count, int precision, double value)
 target.failed = true;
 goto fail;
 			}
+
+#ifdef WIN32
+			if (vallen >= 6 &&
+convert[vallen - 5] == 'e' &&
+convert[vallen - 3] == '0')
+			{
+convert[vallen - 3] = convert[vallen - 2];
+convert[vallen - 2] = convert[vallen - 1];
+vallen--;
+			}
+#endif
 		}
 	}
 

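The Windows exponent-trimming transform in the patch above can be exercised in isolation. The following standalone sketch reproduces the same index arithmetic on an ordinary NUL-terminated buffer (the helper name is invented for illustration):

```c
#include <assert.h>
#include <string.h>

/*
 * Rewrite a three-digit exponent such as "1.5e+005" to the two-digit
 * form "1.5e+05", in place, returning the new length.  Mirrors the
 * index arithmetic used in the snprintf.c patch above: the 'e' sits
 * five characters from the end, and a leading '0' in a three-digit
 * exponent sits three characters from the end.
 */
int
trim_exponent(char *buf, int len)
{
	if (len >= 6 && buf[len - 5] == 'e' && buf[len - 3] == '0')
	{
		buf[len - 3] = buf[len - 2];
		buf[len - 2] = buf[len - 1];
		len--;
		buf[len] = '\0';
	}
	return len;
}
```

Strings whose exponent already has two digits (or at least 100) fail one of the two guard tests and pass through unchanged, which is the desired behavior on non-Windows output.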

Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
>
> >I am thinking so simple number should be good enough. We can require
> >equality - Just I need a signal so some is wrong - better than Postgres
> >crash.
>
> It'd cause constant conflicts and / or we would regularly forget updating
> it. It's not that trivial to determine what influences ABI compatibility.
>

Could this be checked by regression tests? I don't know. Maybe I am
not being very friendly -- I am sorry. I spent 20 minutes chasing a
phantom because JIT was active although I did not want it.

So any help against this situation would be welcome.

Regards

Pavel


> Andres
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Tom Lane
Andres Freund  writes:
>> I am thinking so simple number should be good enough. We can require
>> equality - Just I need a signal so some is wrong - better than Postgres
>> crash.

> It'd cause constant conflicts and / or we would regularly forget updating it. 
> It's not that trivial to determine what influences ABI compatibility.

There already is a PG major-version-number check embedded in the
PG_MODULE_MAGIC infrastructure, which is plenty for regular users.
It's not sufficient for developers working with HEAD, of course.

We could consider making that work off of catversion instead, but I don't
think it'd really improve matters to do so.  catversion doesn't cover most
of what can break loadable modules, such as changes of Node data
structures.

regards, tom lane



Re: Function to promote standby servers

2018-10-08 Thread Laurenz Albe
Masahiko Sawada wrote:
> Maybe the patch needs regression tests for the new function. And I'd
> suggest to make the function name more clear by changing to
> pg_promote_server(), pg_promote_standby() and so on.

Thanks for the review.

The attached patch has regression tests - I thought it would be good
to change some of the existing tests that run standby promotion to
use the SQL function instead of pg_ctl.

I have left the name though -- as far as I can tell, "promote" has
no other meaning in PostgreSQL than standby promotion, and I believe
it is only good to be more verbose if that avoids confusion.

Yours,
Laurenz Albe
From a5de6f9893e049bf97810e41530907e237f909d7 Mon Sep 17 00:00:00 2001
From: Laurenz Albe 
Date: Mon, 8 Oct 2018 17:59:37 +0200
Subject: [PATCH] Add pg_promote() to promote standby servers

---
 doc/src/sgml/func.sgml | 20 +
 doc/src/sgml/high-availability.sgml|  2 +-
 doc/src/sgml/recovery-config.sgml  |  3 +-
 src/backend/access/transam/xlog.c  |  2 -
 src/backend/access/transam/xlogfuncs.c | 48 ++
 src/include/access/xlog.h  |  4 ++
 src/include/catalog/pg_proc.dat|  4 ++
 src/test/perl/PostgresNode.pm  | 22 ++
 src/test/recovery/t/004_timeline_switch.pl |  2 +-
 src/test/recovery/t/009_twophase.pl|  2 +-
 10 files changed, 103 insertions(+), 6 deletions(-)

diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 9a7f683658..ae8a9b4ccb 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -18731,6 +18731,9 @@ SELECT set_config('log_statement_stats', 'off', false);

 pg_terminate_backend

+   
+pg_promote
+   
 

 signal
@@ -18790,6 +18793,16 @@ SELECT set_config('log_statement_stats', 'off', false);
 however only superusers can terminate superuser backends.

   
+  
+   
+pg_promote()
+
+   boolean
+   Promote a physical standby server.  This function can only be
+called by superusers and will only have an effect when run on
+a standby server.
+   
+  
  
 

@@ -18827,6 +18840,13 @@ SELECT set_config('log_statement_stats', 'off', false);
 subprocess.

 
+   
+pg_promote() sends a signal to the server that causes it
+to leave standby mode.  Since sending signals is by nature asynchronous,
+successful execution of the function does not guarantee that the server has
+already been promoted.
+   
+
   
 
   
diff --git a/doc/src/sgml/high-availability.sgml b/doc/src/sgml/high-availability.sgml
index 6f57362df7..8ccd1ffcd1 100644
--- a/doc/src/sgml/high-availability.sgml
+++ b/doc/src/sgml/high-availability.sgml
@@ -1472,7 +1472,7 @@ synchronous_standby_names = 'ANY 2 (s1, s2, s3)'
 

 To trigger failover of a log-shipping standby server,
-run pg_ctl promote or create a trigger
+run pg_ctl promote, call pg_promote(), or create a trigger
 file with the file name and path specified by the trigger_file
 setting in recovery.conf. If you're planning to use
 pg_ctl promote to fail over, trigger_file is
diff --git a/doc/src/sgml/recovery-config.sgml b/doc/src/sgml/recovery-config.sgml
index 92825fdf19..d06cd0b08e 100644
--- a/doc/src/sgml/recovery-config.sgml
+++ b/doc/src/sgml/recovery-config.sgml
@@ -439,7 +439,8 @@ restore_command = 'copy "C:\\server\\archivedir\\%f" "%p"'  # Windows
  
   Specifies a trigger file whose presence ends recovery in the
   standby.  Even if this value is not set, you can still promote
-  the standby using pg_ctl promote.
+  the standby using pg_ctl promote or calling
+  pg_promote().
   This setting has no effect if standby_mode is off.
  
 
diff --git a/src/backend/access/transam/xlog.c b/src/backend/access/transam/xlog.c
index 7375a78ffc..3a1f49e83a 100644
--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -81,8 +81,6 @@ extern uint32 bootstrap_data_checksum_version;
 /* File path names (all relative to $PGDATA) */
 #define RECOVERY_COMMAND_FILE	"recovery.conf"
 #define RECOVERY_COMMAND_DONE	"recovery.done"
-#define PROMOTE_SIGNAL_FILE		"promote"
-#define FALLBACK_PROMOTE_SIGNAL_FILE "fallback_promote"
 
 
 /* User-settable parameters */
diff --git a/src/backend/access/transam/xlogfuncs.c b/src/backend/access/transam/xlogfuncs.c
index 9731742978..b448d0b515 100644
--- a/src/backend/access/transam/xlogfuncs.c
+++ b/src/backend/access/transam/xlogfuncs.c
@@ -35,6 +35,7 @@
 #include "storage/fd.h"
 #include "storage/ipc.h"
 
+#include <signal.h>
 
 /*
  * Store label file and tablespace map during non-exclusive backups.
@@ -697,3 +698,50 @@ pg_backup_start_time(PG_FUNCTION_ARGS)
 
 	PG_RETURN_DATUM(xtime);
 }
+
+/*
+ * Promote a standby server.
+ *
+ * A result of "true" means that promotion has been initiated.
+ */
+Datum
+pg_promote(PG_FUNCTION_ARGS)
+{
+	

Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund



On October 8, 2018 10:29:56 AM PDT, Pavel Stehule  
wrote:
>po 8. 10. 2018 v 19:24 odesílatel Andres Freund 
>napsal:
>
>>
>>
>> On October 8, 2018 10:16:54 AM PDT, Pavel Stehule
>
>> wrote:
>> >po 8. 10. 2018 v 17:59 odesílatel Andres Freund 
>> >napsal:
>> >
>> >> Hi,
>> >>
>> >> On 2018-10-08 11:43:42 -0400, Tom Lane wrote:
>> >> > Andres Freund  writes:
>> >> > > On October 8, 2018 8:03:56 AM PDT, Tom Lane
>
>> >wrote:
>> >> > >> A look in guc.c shows that jit defaults to "on" whether or
>not
>> >JIT is
>> >> > >> enabled at compile time.
>> >> > >> This seems, at best, rather user-unfriendly.
>> >> > >> And it's not like our conventions for other
>> >compile-option-affected
>> >> > >> GUCs, eg the SSL ones.
>> >> >
>> >> > > That was intentional, even though it perhaps should be better
>> >> documented. That allows a distro to build and distribute pg
>without
>> >llvm
>> >> enabled, but then have the JIT package with all the dependencies
>> >> separately. The pg yum packages do so.
>> >> >
>> >> > I'm not following.  A distro wishing to do that would configure
>> >> > --with-llvm, not without, and then simply split up the build
>> >results
>> >> > so that the JIT stuff is in a separate subpackage.
>> >>
>> >> Well, that'd then leave you with one more state (LLVM configured
>but
>> >not
>> >> installed, LLVM configured and installed, LLVM disabled and
>arguably
>> >> LLVM disabled but installed), but more importantly if you compile
>> >> without llvm enabled, you can still install a different extension
>> >that
>> >> would do JIT. You'd just have to set jit_provider = xyz, and it'd
>> >> work. If we made the generic JIT code depend on LLVM that'd end up
>> >> working pretty weirdly.  I guess we could set jit_provider = ''
>> >> something if instead of hardcoding llvmjit if LLVM is disabled.
>> >>
>> >
>> >>
>> >> > If you configured --without-llvm then the resulting core code is
>> >(one
>> >> > hopes) entirely incapable of using JIT, and it'd be better if
>the
>> >GUC
>> >> > settings reflected that..
>> >>
>> >> That's not really the case, no. It controls whether the LLVM using
>> >jit
>> >> provider is built, not whether the generic JIT handling code is
>> >> available.  Given that the generic code has no dependencies, there
>> >seems
>> >> little reason to do that differently?
>> >>
>> >
>> >I can accept this logic, but it looks very fragile. Can be there
>some
>> >safeguard against using wrong version or wrong API?
>>
>> To me that seems like an llvm / JIT independent piece of
>infrastructure.
>> It'd probably be good if we had a catversion like thing to track ABI
>> compatibility, but how to do so without making development unduly
>painful
>> is less clear to me.
>>
>
>I am thinking so simple number should be good enough. We can require
>equality - Just I need a signal so some is wrong - better than Postgres
>crash.

It'd cause constant conflicts and/or we would regularly forget to update it.
It's not that trivial to determine what influences ABI compatibility.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.
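For illustration, the "simple number" safeguard being discussed might look roughly like this — a minimal sketch with hypothetical names (none of these exist in PostgreSQL): the server and a JIT provider each bake in an ABI tag, and the loader reports an error on mismatch instead of letting an incompatible provider crash the backend.

```c
/*
 * Hypothetical sketch only; these names are invented for illustration.
 * The server defines an expected ABI tag, each provider compiles one in,
 * and the loader requires equality before accepting the provider.
 */
#include <stdbool.h>
#include <stdio.h>

#define CORE_JIT_ABI_VERSION 3		/* bumped whenever the provider ABI changes */

typedef struct JitProviderInfo
{
	int			abi_version;		/* tag compiled into the provider .so */
	const char *name;
} JitProviderInfo;

static bool
jit_provider_compatible(const JitProviderInfo *info)
{
	if (info->abi_version != CORE_JIT_ABI_VERSION)
	{
		/* report an error instead of crashing later on an ABI mismatch */
		fprintf(stderr, "provider \"%s\" built for ABI %d, server expects %d\n",
				info->name, info->abi_version, CORE_JIT_ABI_VERSION);
		return false;
	}
	return true;
}
```

Whether such a tag is worth the maintenance burden — the conflicts and forgotten bumps mentioned above — is exactly what the thread leaves open.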



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 19:24 odesílatel Andres Freund  napsal:

>
>
> On October 8, 2018 10:16:54 AM PDT, Pavel Stehule 
> wrote:
> >po 8. 10. 2018 v 17:59 odesílatel Andres Freund 
> >napsal:
> >
> >> Hi,
> >>
> >> On 2018-10-08 11:43:42 -0400, Tom Lane wrote:
> >> > Andres Freund  writes:
> >> > > On October 8, 2018 8:03:56 AM PDT, Tom Lane 
> >wrote:
> >> > >> A look in guc.c shows that jit defaults to "on" whether or not
> >JIT is
> >> > >> enabled at compile time.
> >> > >> This seems, at best, rather user-unfriendly.
> >> > >> And it's not like our conventions for other
> >compile-option-affected
> >> > >> GUCs, eg the SSL ones.
> >> >
> >> > > That was intentional, even though it perhaps should be better
> >> documented. That allows a distro to build and distribute pg without
> >llvm
> >> enabled, but then have the JIT package with all the dependencies
> >> separately. The pg yum packages do so.
> >> >
> >> > I'm not following.  A distro wishing to do that would configure
> >> > --with-llvm, not without, and then simply split up the build
> >results
> >> > so that the JIT stuff is in a separate subpackage.
> >>
> >> Well, that'd then leave you with one more state (LLVM configured but
> >not
> >> installed, LLVM configured and installed, LLVM disabled and arguably
> >> LLVM disabled but installed), but more importantly if you compile
> >> without llvm enabled, you can still install a different extension
> >that
> >> would do JIT. You'd just have to set jit_provider = xyz, and it'd
> >> work. If we made the generic JIT code depend on LLVM that'd end up
> >> working pretty weirdly.  I guess we could set jit_provider = ''
> >> something if instead of hardcoding llvmjit if LLVM is disabled.
> >>
> >
> >>
> >> > If you configured --without-llvm then the resulting core code is
> >(one
> >> > hopes) entirely incapable of using JIT, and it'd be better if the
> >GUC
> >> > settings reflected that..
> >>
> >> That's not really the case, no. It controls whether the LLVM using
> >jit
> >> provider is built, not whether the generic JIT handling code is
> >> available.  Given that the generic code has no dependencies, there
> >seems
> >> little reason to do that differently?
> >>
> >
> >I can accept this logic, but it looks very fragile. Can be there some
> >safeguard against using wrong version or wrong API?
>
> To me that seems like an llvm / JIT independent piece of infrastructure.
> It'd probably be good if we had a catversion like thing to track ABI
> compatibility, but how to do so without making development unduly painful
> is less clear to me.
>

I am thinking a simple number should be good enough. We can require
equality - I just need a signal that something is wrong - better than a
Postgres crash.


>
> Andres
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: transction_timestamp() inside of procedures

2018-10-08 Thread Andres Freund
Hi,

On October 8, 2018 10:14:34 AM PDT, Tom Lane  wrote:
>Peter Eisentraut  writes:
>> On 02/10/2018 16:58, Andres Freund wrote:
>>> It's a bit weird to make this decision based on these two timestamps
>>> differing.  For one, it only indirectly seems to be guaranteed that
>>> xactStartTimestamp is even set to anything here (to 0 by virtue of
>being
>>> a global var).
>
>> Maybe but it seems to be the simplest way without doing major surgery
>to
>> all this code.
>
>This patch doesn't apply over 07ee62ce9.  Also, I like the
>timestamp-comparison approach even less than Andres does: I think it's
>probably outright broken, especially since it treats the equality case
>as license to advance xactStartTimestamp.
>
>Surely there is some way that we can directly test whether we're inside
>a
>procedure or not?  I think the logic should be basically
>
>if (!IsParallelWorker())
>+   {
>+   if (!InsideProcedure())
>xactStartTimestamp = stmtStartTimestamp;
>+   else
>+   xactStartTimestamp = GetCurrentTimestamp();
>+   }
>else
>   Assert(xactStartTimestamp != 0);

Seems more reasonable from here.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund



On October 8, 2018 10:16:54 AM PDT, Pavel Stehule  
wrote:
>po 8. 10. 2018 v 17:59 odesílatel Andres Freund 
>napsal:
>
>> Hi,
>>
>> On 2018-10-08 11:43:42 -0400, Tom Lane wrote:
>> > Andres Freund  writes:
>> > > On October 8, 2018 8:03:56 AM PDT, Tom Lane 
>wrote:
>> > >> A look in guc.c shows that jit defaults to "on" whether or not
>JIT is
>> > >> enabled at compile time.
>> > >> This seems, at best, rather user-unfriendly.
>> > >> And it's not like our conventions for other
>compile-option-affected
>> > >> GUCs, eg the SSL ones.
>> >
>> > > That was intentional, even though it perhaps should be better
>> documented. That allows a distro to build and distribute pg without
>llvm
>> enabled, but then have the JIT package with all the dependencies
>> separately. The pg yum packages do so.
>> >
>> > I'm not following.  A distro wishing to do that would configure
>> > --with-llvm, not without, and then simply split up the build
>results
>> > so that the JIT stuff is in a separate subpackage.
>>
>> Well, that'd then leave you with one more state (LLVM configured but
>not
>> installed, LLVM configured and installed, LLVM disabled and arguably
>> LLVM disabled but installed), but more importantly if you compile
>> without llvm enabled, you can still install a different extension
>that
>> would do JIT. You'd just have to set jit_provider = xyz, and it'd
>> work. If we made the generic JIT code depend on LLVM that'd end up
>> working pretty weirdly.  I guess we could set jit_provider = ''
>> something if instead of hardcoding llvmjit if LLVM is disabled.
>>
>
>>
>> > If you configured --without-llvm then the resulting core code is
>(one
>> > hopes) entirely incapable of using JIT, and it'd be better if the
>GUC
>> > settings reflected that..
>>
>> That's not really the case, no. It controls whether the LLVM using
>jit
>> provider is built, not whether the generic JIT handling code is
>> available.  Given that the generic code has no dependencies, there
>seems
>> little reason to do that differently?
>>
>
>I can accept this logic, but it looks very fragile. Can be there some
>safeguard against using wrong version or wrong API?

To me that seems like an llvm / JIT independent piece of infrastructure.  It'd 
probably be good if we had a catversion like thing to track ABI compatibility, 
but how to do so without making development unduly painful is less clear to me.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 17:59 odesílatel Andres Freund  napsal:

> Hi,
>
> On 2018-10-08 11:43:42 -0400, Tom Lane wrote:
> > Andres Freund  writes:
> > > On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
> > >> A look in guc.c shows that jit defaults to "on" whether or not JIT is
> > >> enabled at compile time.
> > >> This seems, at best, rather user-unfriendly.
> > >> And it's not like our conventions for other compile-option-affected
> > >> GUCs, eg the SSL ones.
> >
> > > That was intentional, even though it perhaps should be better
> documented. That allows a distro to build and distribute pg without llvm
> enabled, but then have the JIT package with all the dependencies
> separately. The pg yum packages do so.
> >
> > I'm not following.  A distro wishing to do that would configure
> > --with-llvm, not without, and then simply split up the build results
> > so that the JIT stuff is in a separate subpackage.
>
> Well, that'd then leave you with one more state (LLVM configured but not
> installed, LLVM configured and installed, LLVM disabled and arguably
> LLVM disabled but installed), but more importantly if you compile
> without llvm enabled, you can still install a different extension that
> would do JIT. You'd just have to set jit_provider = xyz, and it'd
> work. If we made the generic JIT code depend on LLVM that'd end up
> working pretty weirdly.  I guess we could set jit_provider = ''
> something if instead of hardcoding llvmjit if LLVM is disabled.
>

>
> > If you configured --without-llvm then the resulting core code is (one
> > hopes) entirely incapable of using JIT, and it'd be better if the GUC
> > settings reflected that..
>
> That's not really the case, no. It controls whether the LLVM using jit
> provider is built, not whether the generic JIT handling code is
> available.  Given that the generic code has no dependencies, there seems
> little reason to do that differently?
>

I can accept this logic, but it looks very fragile. Can there be some
safeguard against using a wrong version or a wrong API?



> Greetings,
>
> Andres Freund
>


Re: transction_timestamp() inside of procedures

2018-10-08 Thread Tom Lane
Peter Eisentraut  writes:
> On 02/10/2018 16:58, Andres Freund wrote:
>> It's a bit weird to make this decision based on these two timestamps
>> differing.  For one, it only indirectly seems to be guaranteed that
>> xactStartTimestamp is even set to anything here (to 0 by virtue of being
>> a global var).

> Maybe but it seems to be the simplest way without doing major surgery to
> all this code.

This patch doesn't apply over 07ee62ce9.  Also, I like the
timestamp-comparison approach even less than Andres does: I think it's
probably outright broken, especially since it treats the equality case
as license to advance xactStartTimestamp.

Surely there is some way that we can directly test whether we're inside a
procedure or not?  I think the logic should be basically

if (!IsParallelWorker())
+   {
+   if (!InsideProcedure())
xactStartTimestamp = stmtStartTimestamp;
+   else
+   xactStartTimestamp = GetCurrentTimestamp();
+   }
else
   Assert(xactStartTimestamp != 0);


regards, tom lane
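The proposed branch can be modeled in a self-contained form like this (InsideProcedure() is a hypothetical predicate, and the stubs below exist only to make the logic testable — this is not the actual xact.c code):

```c
/*
 * Self-contained model of the branch sketched above.  The timestamps are
 * stubbed with plain longs; "inside_procedure" stands in for the
 * hypothetical InsideProcedure() predicate.
 */
#include <stdbool.h>

typedef long TimestampTz;

static TimestampTz stmtStartTimestamp;	/* set when the statement arrived */
static TimestampTz currentTime;			/* stand-in for the system clock */

static TimestampTz
GetCurrentTimestamp(void)
{
	return currentTime;
}

/*
 * In a plain transaction the transaction timestamp is copied from the
 * statement timestamp; inside a procedure, where a new transaction can
 * begin mid-statement, a fresh clock reading is taken instead.
 */
static TimestampTz
choose_xact_start(bool inside_procedure)
{
	if (!inside_procedure)
		return stmtStartTimestamp;
	return GetCurrentTimestamp();
}
```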



Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Andres Freund
On 2018-10-08 18:28:52 +0300, Konstantin Knizhnik wrote:
> 
> 
> On 08.10.2018 18:24, Andres Freund wrote:
> > 
> > On October 8, 2018 2:04:28 AM PDT, Konstantin Knizhnik 
> >  wrote:
> > > 
> > > On 05.10.2018 11:04, Michael Paquier wrote:
> > > > On Fri, Oct 05, 2018 at 10:06:45AM +0300, Konstantin Knizhnik wrote:
> > > > > As you can notice, XID 2004495308 is encountered twice which cause
> > > error in
> > > > > KnownAssignedXidsAdd:
> > > > > 
> > > > >       if (head > tail &&
> > > > >           TransactionIdFollowsOrEquals(KnownAssignedXids[head - 1],
> > > from_xid))
> > > > >       {
> > > > >           KnownAssignedXidsDisplay(LOG);
> > > > >           elog(ERROR, "out-of-order XID insertion in
> > > KnownAssignedXids");
> > > > >       }
> > > > > 
> > > > > The probability of this error is very small but it can quite easily
> > > > > reproduced: you should just set breakpoint in debugger after calling
> > > > > MarkAsPrepared in twophase.c and then try to prepare any
> > > transaction.
> > > > > MarkAsPrepared  will add GXACT to proc array and at this moment
> > > there will
> > > > > be two entries in procarray with the same XID:
> > > > > 
> > > > > [snip]
> > > > > 
> > > > > Now generated RUNNING_XACTS record contains duplicated XIDs.
> > > > So, I have been doing exactly that, and if you trigger a manual
> > > > checkpoint then things happen quite correctly if you let the first
> > > > session finish:
> > > > rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
> > > > 0/016150F8, prev 0/01615088, desc: RUNNING_XACTS nextXid 608
> > > > latestCompletedXid 605 oldestRunningXid 606; 2 xacts: 607 606
> > > > 
> > > > If you still maintain the debugger after calling MarkAsPrepared, then
> > > > the manual checkpoint would block.  Now if you actually keep the
> > > > debugger, and wait for a checkpoint timeout to happen, then I can see
> > > > the incorrect record.  It is impressive that your customer has been
> > > able
> > > > to see that first, and then that you have been able to get into that
> > > > state with simple steps.
> > > > 
> > > > > I want to ask opinion of community about the best way of fixing this
> > > > > problem.  Should we avoid storing duplicated XIDs in procarray (by
> > > > > invalidating XID in original pgaxct) or eliminate/change check for
> > > > > duplicate in KnownAssignedXidsAdd (for example just ignore
> > > > > duplicates)?
> > > > Hm...  Please let me think through that first.  It seems to me
> > > that
> > > > the record should not be generated to begin with.  At least I am able
> > > to
> > > > confirm what you see.
> > > The simplest way to fix the problem is to ignore duplicates before
> > > adding them to KnownAssignedXids.
> > > We in any case perform sort i this place...
> > I vehemently object to that as the proper course.
> And what about adding qsort to GetRunningTransactionData or
> LogCurrentRunningXacts and excluding duplicates here?

Sounds less terrible, but still pretty bad.  I think we should fix the
underlying data inconsistency, not paper over it a couple hundred meters
away.

Greetings,

Andres Freund



Re: pg_dumpall --exclude-database option

2018-10-08 Thread Andrew Dunstan



On 08/03/2018 05:08 PM, Fabien COELHO wrote:


>> Among other use cases, this is useful where a database name is
>> visible but the database is not dumpable by the user. Examples of
>> this occur in some managed Postgres services.
>
> This looks like a reasonable feature.



Thanks for the review.




>> I will add this to the September CF.
>
> My 0.02€:
>
> Patch applies cleanly, compiles, and works for me.
>
> A question: would it makes sense to have a symmetrical
> --include-database=PATTERN option as well?



I don't think so. If you only want a few databases, just use pg_dump. 
The premise of pg_dumpall is that you want all of them and this switch 
provides for exceptions to that.




> Somehow the option does not make much sense when under -g/-r/-t...
> maybe it should complain, like it does when the others are used together?



Added an error check.



> ISTM that it would have been better to issue just one query with an OR
> list, but that would require to extend "processSQLNamePattern" a
> little bit. Not sure whether it is worth it.


I don't think it is. This uses the same pattern that is used in 
pg_dump.c for similar switches.




> Function "database_excluded": I'd suggest to consider reusing the
> "simple_string_list_member" function instead of reimplementing it in a
> special case.



done.
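For context, the generic helper being reused here works roughly along these lines (a simplified sketch modeled on pg_dump's simple string list helpers, not the exact source):

```c
/*
 * Simplified sketch of a simple string list: singly linked cells with a
 * head/tail pointer, append at the tail, linear-scan membership test.
 */
#include <stdlib.h>
#include <string.h>

typedef struct SimpleStringListCell
{
	struct SimpleStringListCell *next;
	char	   *val;
} SimpleStringListCell;

typedef struct SimpleStringList
{
	SimpleStringListCell *head;
	SimpleStringListCell *tail;
} SimpleStringList;

static void
simple_string_list_append(SimpleStringList *list, const char *val)
{
	SimpleStringListCell *cell = malloc(sizeof(SimpleStringListCell));

	cell->next = NULL;
	cell->val = malloc(strlen(val) + 1);
	strcpy(cell->val, val);
	if (list->tail)
		list->tail->next = cell;	/* append after the current tail */
	else
		list->head = cell;			/* first element */
	list->tail = cell;
}

static int
simple_string_list_member(SimpleStringList *list, const char *val)
{
	for (SimpleStringListCell *cell = list->head; cell; cell = cell->next)
	{
		if (strcmp(cell->val, val) == 0)
			return 1;
	}
	return 0;
}
```

As the exchange above notes, the list is never freed; it lives for the duration of the program, which is fine for a one-shot tool like pg_dumpall.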



> XML doc: "--exclude-database=dbname", ISTM that
> "--exclude-database=pattern" would be closer to what it is? "Multiple
> database can be matched by writing multiple switches". Sure, but it
> can also be done with a pattern. The documentation seems to assume
> that the argument is one database name, and then changes this
> afterwards. I'd suggest to start by saying that a pattern like psql is
> expected, and then proceed to simply tell that the option can be
> repeated, instead of implying that it is a dbname and then telling
> that it is a pattern.


docco revised.



> The simple list is not freed. Ok, it seems to be part of the design of
> the data structure.


I don't see much point in freeing it.

revised patch attached.

cheers

andrew


--
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

diff --git a/doc/src/sgml/ref/pg_dumpall.sgml b/doc/src/sgml/ref/pg_dumpall.sgml
index c51a130..f0e1697 100644
--- a/doc/src/sgml/ref/pg_dumpall.sgml
+++ b/doc/src/sgml/ref/pg_dumpall.sgml
@@ -300,6 +300,28 @@ PostgreSQL documentation
    </para>
   </listitem>
  </varlistentry>

+
+ <varlistentry>
+  <term><option>--exclude-database=<replaceable class="parameter">pattern</replaceable></option></term>
+  <listitem>
+   <para>
+    Do not dump databases whose name matches
+    <replaceable class="parameter">pattern</replaceable>.
+    Multiple patterns can be excluded by writing multiple
+    <option>--exclude-database</option> switches.  The
+    <replaceable class="parameter">pattern</replaceable> parameter is
+    interpreted as a pattern according to the same rules used by
+    <application>psql</application>'s <literal>\d</literal>
+    commands (see <xref linkend="app-psql-patterns"/>),
+    so multiple databases can also be excluded by writing wildcard
+    characters in the pattern.  When using wildcards, be careful to
+    quote the pattern if needed to prevent the shell from expanding
+    the wildcards.
+   </para>
+  </listitem>
+ </varlistentry>
+
 <varlistentry>
  <term><option>--if-exists</option></term>
  <listitem>
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index eb29d31..2bf8743 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -52,6 +52,8 @@ static PGconn *connectDatabase(const char *dbname, const char *connstr, const ch
 static char *constructConnStr(const char **keywords, const char **values);
 static PGresult *executeQuery(PGconn *conn, const char *query);
 static void executeCommand(PGconn *conn, const char *query);
+static void expand_dbname_patterns(PGconn *conn, SimpleStringList *patterns,
+								   SimpleStringList *names);
 
 static char pg_dump_bin[MAXPGPATH];
 static const char *progname;
@@ -87,6 +89,9 @@ static char role_catalog[10];
 static FILE *OPF;
 static char *filename = NULL;
 
+static SimpleStringList database_exclude_patterns = {NULL, NULL};
+static SimpleStringList database_exclude_names = {NULL, NULL};
+
 #define exit_nicely(code) exit(code)
 
 int
@@ -123,6 +128,7 @@ main(int argc, char *argv[])
 		{"column-inserts", no_argument, &column_inserts, 1},
 		{"disable-dollar-quoting", no_argument, &disable_dollar_quoting, 1},
 		{"disable-triggers", no_argument, &disable_triggers, 1},
+		{"exclude-database", required_argument, NULL, 5},
 		{"if-exists", no_argument, &if_exists, 1},
 		{"inserts", no_argument, &inserts, 1},
 		{"lock-wait-timeout", required_argument, NULL, 2},
@@ -318,6 +324,10 @@ main(int argc, char *argv[])
 				appendPQExpBufferStr(pgdumpopts, " --no-sync");
 				break;
 
+			case 5:
+				simple_string_list_append(&database_exclude_patterns, optarg);
+				break;
+
 			default:
 fprintf(stderr, _("Try \"%s --help\" for more information.\n"), progname);
 exit_nicely(1);
@@ -334,6 +344,16 @@ main(int argc, char *argv[])
 		exit_nicely(1);
 	}
 
+	if (database_exclude_patterns.head != NULL &&
+		(globals_only || roles_only || tablespaces_only))
+	{
+		fprintf(stderr, _("%s: option --exclude-database cannot 

Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund
Hi,

On 2018-10-08 11:43:42 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
> >> A look in guc.c shows that jit defaults to "on" whether or not JIT is
> >> enabled at compile time. 
> >> This seems, at best, rather user-unfriendly.
> >> And it's not like our conventions for other compile-option-affected
> >> GUCs, eg the SSL ones.
> 
> > That was intentional, even though it perhaps should be better documented. 
> > That allows a distro to build and distribute pg without llvm enabled, but 
> > then have the JIT package with all the dependencies separately. The pg yum 
> > packages do so.
> 
> I'm not following.  A distro wishing to do that would configure
> --with-llvm, not without, and then simply split up the build results
> so that the JIT stuff is in a separate subpackage.

Well, that'd then leave you with one more state (LLVM configured but not
installed, LLVM configured and installed, LLVM disabled and arguably
LLVM disabled but installed), but more importantly if you compile
without llvm enabled, you can still install a different extension that
would do JIT. You'd just have to set jit_provider = xyz, and it'd
work. If we made the generic JIT code depend on LLVM that'd end up
working pretty weirdly.  I guess we could set jit_provider = '' or
something instead of hardcoding llvmjit if LLVM is disabled.


> If you configured --without-llvm then the resulting core code is (one
> hopes) entirely incapable of using JIT, and it'd be better if the GUC
> settings reflected that.

That's not really the case, no. It controls whether the LLVM using jit
provider is built, not whether the generic JIT handling code is
available.  Given that the generic code has no dependencies, there seems
little reason to do that differently?

Greetings,

Andres Freund
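A rough model of the behavior Andres describes (hypothetical names, not the actual jit.c code): "jit = on" only expresses a wish, and whether anything is actually JIT-compiled depends on the configured provider library loading at runtime.

```c
/*
 * Illustrative sketch only.  provider_loads() stands in for dlopen()ing
 * the shared library named by the jit_provider GUC; jit_will_be_used()
 * models how the jit GUC and provider availability combine.
 */
#include <stdbool.h>
#include <string.h>

static bool
provider_loads(const char *jit_provider)
{
	/* stub: pretend only "llvmjit" is installed on this system */
	return strcmp(jit_provider, "llvmjit") == 0;
}

static bool
jit_will_be_used(bool jit_guc, const char *jit_provider)
{
	if (!jit_guc)
		return false;			/* explicitly disabled by the GUC */
	/* otherwise the wish is granted only if a provider actually loads */
	return provider_loads(jit_provider);
}
```

This is why a distro can ship the server with jit defaulting to on while packaging the provider separately: without the provider installed, the default is harmless.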



Re: Postgres 11 release notes

2018-10-08 Thread Adrien NAYRAT

On 10/8/18 5:20 PM, Bruce Momjian wrote:
> Uh, where are you looking?  We don't rebuild the beta docs until the
> next beta release, but you can see the changes in the developer docs:
>
> 	https://www.postgresql.org/docs/devel/static/release-11.html


I looked in doc/src/sgml/release-11.sgml

And even on https://www.postgresql.org/docs/devel/static/release-11.html 
I don't see mention of 1f6d515a67


Thanks



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Tom Lane
Andres Freund  writes:
> On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
>> A look in guc.c shows that jit defaults to "on" whether or not JIT is
>> enabled at compile time. 
>> This seems, at best, rather user-unfriendly.
>> And it's not like our conventions for other compile-option-affected
>> GUCs, eg the SSL ones.

> That was intentional, even though it perhaps should be better documented. 
> That allows a distro to build and distribute pg without llvm enabled, but 
> then have the JIT package with all the dependencies separately. The pg yum 
> packages do so.

I'm not following.  A distro wishing to do that would configure
--with-llvm, not without, and then simply split up the build results
so that the JIT stuff is in a separate subpackage.  If you configured
--without-llvm then the resulting core code is (one hopes) entirely
incapable of using JIT, and it'd be better if the GUC settings
reflected that.

regards, tom lane



Re: executor relation handling

2018-10-08 Thread Tom Lane
I wrote:
> Keeping that comparison in mind, I'm inclined to think that 0001
> is the best thing to do for now.  The incremental win from 0002
> is not big enough to justify the API break it creates, while your
> 0005 is not really attacking the problem the right way.

I've pushed 0001 now.  I believe that closes out all the patches
discussed in this thread, so I've marked the CF entry committed.
Thanks for all the hard work!

regards, tom lane



Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Konstantin Knizhnik




On 08.10.2018 18:24, Andres Freund wrote:
> On October 8, 2018 2:04:28 AM PDT, Konstantin Knizhnik
>  wrote:
>> On 05.10.2018 11:04, Michael Paquier wrote:
>>> On Fri, Oct 05, 2018 at 10:06:45AM +0300, Konstantin Knizhnik wrote:
>>>> As you can notice, XID 2004495308 is encountered twice which cause
>>>> error in KnownAssignedXidsAdd:
>>>>
>>>>       if (head > tail &&
>>>>           TransactionIdFollowsOrEquals(KnownAssignedXids[head - 1], from_xid))
>>>>       {
>>>>           KnownAssignedXidsDisplay(LOG);
>>>>           elog(ERROR, "out-of-order XID insertion in KnownAssignedXids");
>>>>       }
>>>>
>>>> The probability of this error is very small but it can quite easily
>>>> reproduced: you should just set breakpoint in debugger after calling
>>>> MarkAsPrepared in twophase.c and then try to prepare any transaction.
>>>> MarkAsPrepared  will add GXACT to proc array and at this moment there will
>>>> be two entries in procarray with the same XID:
>>>>
>>>> [snip]
>>>>
>>>> Now generated RUNNING_XACTS record contains duplicated XIDs.
>>> So, I have been doing exactly that, and if you trigger a manual
>>> checkpoint then things happen quite correctly if you let the first
>>> session finish:
>>> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
>>> 0/016150F8, prev 0/01615088, desc: RUNNING_XACTS nextXid 608
>>> latestCompletedXid 605 oldestRunningXid 606; 2 xacts: 607 606
>>>
>>> If you still maintain the debugger after calling MarkAsPrepared, then
>>> the manual checkpoint would block.  Now if you actually keep the
>>> debugger, and wait for a checkpoint timeout to happen, then I can see
>>> the incorrect record.  It is impressive that your customer has been able
>>> to see that first, and then that you have been able to get into that
>>> state with simple steps.
>>>
>>>> I want to ask opinion of community about the best way of fixing this
>>>> problem.  Should we avoid storing duplicated XIDs in procarray (by
>>>> invalidating XID in original pgaxct) or eliminate/change check for
>>>> duplicate in KnownAssignedXidsAdd (for example just ignore
>>>> duplicates)?
>>> Hm...  Please let me think through that first.  It seems to me that
>>> the record should not be generated to begin with.  At least I am able to
>>> confirm what you see.
>> The simplest way to fix the problem is to ignore duplicates before
>> adding them to KnownAssignedXids.
>> We in any case perform sort i this place...
> I vehemently object to that as the proper course.

And what about adding qsort to GetRunningTransactionData or
LogCurrentRunningXacts and excluding duplicates here?

> Andres


--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company
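For reference, the sort-and-deduplicate idea being debated can be sketched as follows (illustrative only — the thread's conclusion is to fix the duplicate at its source, and real code would need wraparound-aware XID comparison via TransactionIdPrecedes rather than plain integer ordering):

```c
/*
 * Illustrative sketch: sort an array of XIDs and squeeze out duplicates.
 * Not PostgreSQL code; a stand-in TransactionId type is used.
 */
#include <stdlib.h>

typedef unsigned int TransactionId;

static int
xid_cmp(const void *a, const void *b)
{
	TransactionId xa = *(const TransactionId *) a;
	TransactionId xb = *(const TransactionId *) b;

	/* plain ordering; real code must handle XID wraparound */
	return (xa > xb) - (xa < xb);
}

/* Sort xids[] in place and drop duplicates; returns the new count. */
static int
sort_and_dedup_xids(TransactionId *xids, int nxids)
{
	int			nkept = 0;

	if (nxids == 0)
		return 0;
	qsort(xids, nxids, sizeof(TransactionId), xid_cmp);
	for (int i = 0; i < nxids; i++)
	{
		if (nkept == 0 || xids[i] != xids[nkept - 1])
			xids[nkept++] = xids[i];
	}
	return nkept;
}
```

With the duplicated XID from the report, an array like {607, 606, 2004495308, 2004495308} collapses to {606, 607, 2004495308} — which is exactly the kind of papering-over Andres objects to.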




Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 17:22 odesílatel Andres Freund  napsal:

>
>
> On October 8, 2018 8:16:06 AM PDT, Pavel Stehule 
> wrote:
> >po 8. 10. 2018 v 17:10 odesílatel Andres Freund 
> >napsal:
> >
> >>
> >>
> >> On October 8, 2018 8:03:56 AM PDT, Tom Lane 
> >wrote:
> >> >Andres Freund  writes:
> >> >> Where is the jit=on coming from? Config from before it was turned
> >> >off?
> >> >
> >> >A look in guc.c shows that jit defaults to "on" whether or not JIT
> >is
> >> >enabled at compile time.
> >>
> >> I thought Pavel was talking about 11 somehow...
> >>
> >
> >I am sorry, It was not clear - I talked about master
> >
> >
> >>
> >> > This seems, at best, rather user-unfriendly.
> >> >And it's not like our conventions for other compile-option-affected
> >> >GUCs, eg the SSL ones.
> >>
> >> That was intentional, even though it perhaps should be better
> >documented.
> >> That allows a distro to build and distribute pg without llvm enabled,
> >but
> >> then have the JIT package with all the dependencies separately. The
> >pg yum
> >> packages do so.
> >>
> >
> >I don't like this solution - it is safe on production - but it can
> >breaks
> >my development environment - or we need to change setup and make
> >install
> >will not be enough.
>
> It's not clear what could be done about it. You'll run into other trouble
> if you have half installed build artifacts because you reconfigured. Make
> uninstall from before the reconfigure would uninstall it.
>

It is partially true - when I use a wrong extension, I get an error
message. But the JIT library kills Postgres.

I expect JIT to be permanently disabled if Postgres was compiled without JIT
support. A dependency on some external file doesn't look like a safe solution.


> Andres
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Andres Freund



On October 8, 2018 2:04:28 AM PDT, Konstantin Knizhnik 
 wrote:
>
>
>On 05.10.2018 11:04, Michael Paquier wrote:
>> On Fri, Oct 05, 2018 at 10:06:45AM +0300, Konstantin Knizhnik wrote:
>>> As you can notice, XID 2004495308 is encountered twice which cause
>error in
>>> KnownAssignedXidsAdd:
>>>
>>>      if (head > tail &&
>>>          TransactionIdFollowsOrEquals(KnownAssignedXids[head - 1],
>from_xid))
>>>      {
>>>          KnownAssignedXidsDisplay(LOG);
>>>          elog(ERROR, "out-of-order XID insertion in
>KnownAssignedXids");
>>>      }
>>>
>>> The probability of this error is very small but it can quite easily
>>> reproduced: you should just set breakpoint in debugger after calling
>>> MarkAsPrepared in twophase.c and then try to prepare any
>transaction.
>>> MarkAsPrepared  will add GXACT to proc array and at this moment
>there will
>>> be two entries in procarray with the same XID:
>>>
>>> [snip]
>>>
>>> Now generated RUNNING_XACTS record contains duplicated XIDs.
>> So, I have been doing exactly that, and if you trigger a manual
>> checkpoint then things happen quite correctly if you let the first
>> session finish:
>> rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
>> 0/016150F8, prev 0/01615088, desc: RUNNING_XACTS nextXid 608
>> latestCompletedXid 605 oldestRunningXid 606; 2 xacts: 607 606
>>
>> If you still maintain the debugger after calling MarkAsPrepared, then
>> the manual checkpoint would block.  Now if you actually keep the
>> debugger, and wait for a checkpoint timeout to happen, then I can see
>> the incorrect record.  It is impressive that your customer has been
>able
>> to see that first, and then that you have been able to get into that
>> state with simple steps.
>>
>>> I want to ask opinion of community about the best way of fixing this
>>> problem.  Should we avoid storing duplicated XIDs in procarray (by
>>> invalidating XID in original pgaxct) or eliminate/change check for
>>> duplicate in KnownAssignedXidsAdd (for example just ignore
>>> duplicates)?
>> Hm...  Please let me think through that first.  It seems to me
>that
>> the record should not be generated to begin with.  At least I am able
>to
>> confirm what you see.
>
>The simplest way to fix the problem is to ignore duplicates before 
>adding them to KnownAssignedXids.
>We in any case perform sort i this place...

I vehemently object to that as the proper course.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: SCRAM with channel binding downgrade attack

2018-10-08 Thread Bruce Momjian
On Mon, Oct  8, 2018 at 12:42:39PM +0200, Peter Eisentraut wrote:
> On 05/10/2018 19:01, Bruce Momjian wrote:
> > On Fri, Oct  5, 2018 at 04:53:34PM +0200, Peter Eisentraut wrote:
> >> On 23/05/2018 08:46, Heikki Linnakangas wrote:
> >>> "tls-unique" and "tls-server-end-point" are overly technical to users. 
> >>> They don't care which one is used, there's no difference in security. 
> >>
> >> A question was raised about this in a recent user group meeting.
> >>
> >> When someone steals the server certificate from the real database server
> >> and sets up a MITM with that certificate, this would pass
> >> tls-server-end-point channel binding, because both the MITM and the real
> >> server have the same certificate.  But with tls-unique they would have
> >> different channel binding data, so the channel binding would detect this.
> >>
> >> Is that not correct?
> > 
> > Not correct.  First, they need to steal the server certificate and
> > _private_ key that goes with the certificate to impersonate the owner of
> > the certificate.
> 
> Right, I meant to imply that.

Right --- that is often confused so I wanted to clarify.

> > If that happens, with tls-server-end-point, a MITM
> > could replay what the real server sends to the MITM.  You are right that
> > tls-unique makes it harder for a MITM to reproduce the TLS shared key
> > which is mixed with the password hash to prove the server knows the
> > password hash.
> 
> So you appear to be saying the above *is* correct?

Yep, I am afraid so.  tls-unique seems technically superior, but I guess
not operationally superior, e.g., Java can't access the TLS shared
secret.  Supporting both causes the kind of channel binding mode
confusion that we have been dealing with, and the new security trend is
not to support too many options because their interaction is often
surprising, and insecure.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Refactor textToQualifiedNameList()

2018-10-08 Thread Pavel Stehule
The following review has been posted through the commitfest application:
make installcheck-world:  tested, failed
Implements feature:   tested, failed
Spec compliant:   tested, failed
Documentation:tested, failed

Hi

I tested this patch. The patch removes some duplicated code, which is good - on 
the other hand, after this patch textToQualifiedNameList makes one extra copy 
of the input string. I looked at where this function is used, and I don't think 
it is an issue.

This patch is trivial - all tests passed and I don't see any problem. I'll mark 
it as Ready for Committer.

The new status of this patch is: Ready for Committer


Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund



On October 8, 2018 8:16:06 AM PDT, Pavel Stehule  
wrote:
>po 8. 10. 2018 v 17:10 odesílatel Andres Freund 
>napsal:
>
>>
>>
>> On October 8, 2018 8:03:56 AM PDT, Tom Lane 
>wrote:
>> >Andres Freund  writes:
>> >> Where is the jit=on coming from? Config from before it was turned
>> >off?
>> >
>> >A look in guc.c shows that jit defaults to "on" whether or not JIT
>is
>> >enabled at compile time.
>>
>> I thought Pavel was talking about 11 somehow...
>>
>
>I am sorry, It was not clear - I talked about master
>
>
>>
>> > This seems, at best, rather user-unfriendly.
>> >And it's not like our conventions for other compile-option-affected
>> >GUCs, eg the SSL ones.
>>
>> That was intentional, even though it perhaps should be better
>documented.
>> That allows a distro to build and distribute pg without llvm enabled,
>but
>> then have the JIT package with all the dependencies separately. The
>pg yum
>> packages do so.
>>
>
>I don't like this solution - it is safe in production - but it can break
>my development environment - or we need to change the setup, and make
>install will not be enough.

It's not clear what could be done about it. You'll run into other trouble if 
you have half-installed build artifacts because you reconfigured. Running make 
uninstall from before the reconfigure would remove it.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Postgres 11 release notes

2018-10-08 Thread Bruce Momjian
On Mon, Oct  8, 2018 at 12:05:03PM +0200, Adrien NAYRAT wrote:
> On 9/20/18 8:47 AM, Adrien Nayrat wrote:
> >On 8/25/18 11:24 PM, Peter Geoghegan wrote:
> >>On Sat, Aug 25, 2018 at 2:18 PM, Bruce Momjian  wrote:
> I think that's less "our" logic and more yours, that has become
> established because you've done most of the major release notes for a
> long time. I'm not trying to say that that's wrong or anything, just
> >>>
> >>>I don't do my work in a vacuum.  My behavior is based on what feedback I
> >>>have gotten from the community on what to include/exclude.
> >>
> >>FWIW, I agree with what Andres said about planner changes and the release 
> >>notes.
> >>
> >
> >Hello,
> >
> >Here is the patch that mentions 1f6d515a67. Note I am not sure about my 
> >explanation.
> >
> >
> >Thanks
> >
> 
> Hi,
> 
> It seems this change has not been added to the release notes? Should I change
> something in my patch?

Uh, where are you looking?  We don't rebuild the beta docs until the
next beta release, but you can see the changes in the developer docs:

https://www.postgresql.org/docs/devel/static/release-11.html

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 17:10 odesílatel Andres Freund  napsal:

>
>
> On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
> >Andres Freund  writes:
> >> Where is the jit=on coming from? Config from before it was turned
> >off?
> >
> >A look in guc.c shows that jit defaults to "on" whether or not JIT is
> >enabled at compile time.
>
> I thought Pavel was talking about 11 somehow...
>

I am sorry, It was not clear - I talked about master


>
> > This seems, at best, rather user-unfriendly.
> >And it's not like our conventions for other compile-option-affected
> >GUCs, eg the SSL ones.
>
> That was intentional, even though it perhaps should be better documented.
> That allows a distro to build and distribute pg without llvm enabled, but
> then have the JIT package with all the dependencies separately. The pg yum
> packages do so.
>

I don't like this solution - it is safe in production - but it can break
my development environment - or we need to change the setup, and make
install will not be enough.

Regards

Pavel


> Andres
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund



On October 8, 2018 8:10:45 AM PDT, Andres Freund  wrote:
>
>
>On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
>> This seems, at best, rather user-unfriendly.
>>And it's not like our conventions for other compile-option-affected
>>GUCs, eg the SSL ones.
>
>That was intentional, even though it perhaps should be better
>documented. That allows a distro to build and distribute pg without
>llvm enabled, but then have the JIT package with all the dependencies
>separately. The pg yum packages do so.

There's a function for checking whether JIT is actually available, BTW. 
pg_jit_available(). That also works if somebody installs an extension that's 
not ours (by setting jit_provider =  ...).


Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund



On October 8, 2018 8:03:56 AM PDT, Tom Lane  wrote:
>Andres Freund  writes:
>> Where is the jit=on coming from? Config from before it was turned
>off?
>
>A look in guc.c shows that jit defaults to "on" whether or not JIT is
>enabled at compile time. 

I thought Pavel was talking about 11 somehow...


> This seems, at best, rather user-unfriendly.
>And it's not like our conventions for other compile-option-affected
>GUCs, eg the SSL ones.

That was intentional, even though it perhaps should be better documented. That 
allows a distro to build and distribute pg without llvm enabled, but then have 
the JIT package with all the dependencies separately. The pg yum packages do so.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 16:58 odesílatel Andres Freund  napsal:

> Hi
>
> On October 8, 2018 2:51:15 AM PDT, Pavel Stehule 
> wrote:
> >Hi
> >
> >I configured PostgreSQL without JIT support, but JIT is active by
> >default
> >
> >postgres=# show jit;
> >┌─┐
> >│ jit │
> >╞═╡
> >│ on  │
> >└─┘
> >(1 row)
>
> Where is the jit=on coming from? Config from before it was turned off?
>
>
> >Unfortunately, this combination crashes on some bigger queries.
>
> You probably have the JIT plugin installed from an earlier time, which
> also would explain why it crashes.
>

I had it there.


>
> Andres
>
> --
> Sent from my Android device with K-9 Mail. Please excuse my brevity.
>


Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Tom Lane
Andres Freund  writes:
> Where is the jit=on coming from? Config from before it was turned off?

A look in guc.c shows that jit defaults to "on" whether or not JIT is
enabled at compile time.  This seems, at best, rather user-unfriendly.
And it's not like our conventions for other compile-option-affected
GUCs, eg the SSL ones.

regards, tom lane



Re: merge semi join cost calculation error

2018-10-08 Thread Tom Lane
Pavel Stehule  writes:
> The user sent a plan:

> QUERY PLAN
> Merge Semi Join  (cost=82.97..580.24 rows=580 width=8) (actual
> time=0.503..9557.396 rows=721 loops=1)
>   Merge Cond: (tips.users_id = follows.users_id_to)
>   ->  Index Scan using tips_idx_users_id01 on tips  (cost=0.43..8378397.19
> rows=2491358 width=16) (actual time=0.009..9231.585 rows=2353914 loops=1)
>   ->  Sort  (cost=1.77..1.82 rows=22 width=8) (actual time=0.052..0.089
> rows=28 loops=1)
> Sort Key: follows.users_id_to
> Sort Method: quicksort  Memory: 26kB
> ->  Seq Scan on follows  (cost=0.00..1.27 rows=22 width=8) (actual
> time=0.013..0.020 rows=28 loops=1)
>   Filter: (users_id_from = 1)

> He has PostgreSQL 10.5. I cannot understand the very low total cost of the
> Merge Semi Join, because the subnode has a very high cost of 8378397.

The planner seems to be supposing that the merge will stop far short of
scanning the entire LHS table, presumably as a result of thinking that
the maximum value of follows.users_id_to is much less than the maximum
value of tips.users_id.  Given the actual rowcounts, that's seemingly
not true, which suggests out-of-date stats for one table or the other.
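A toy model of that early-stop estimate (not the actual selfuncs.c code; the linear interpolation over the key range is an assumption for illustration) shows how a stale maximum for follows.users_id_to can shrink the charged cost of the big index scan down to a few hundred:

```python
# If the planner believes the inner side's maximum join key is small
# relative to the outer side's key range, it charges only the fraction
# of the outer (sorted) scan up to that point, not the full scan cost.
def merge_outer_scan_fraction(outer_min, outer_max, inner_max):
    """Estimated fraction of the outer scan performed before the merge
    can stop (linear-interpolation assumption; names hypothetical)."""
    if inner_max >= outer_max:
        return 1.0
    if inner_max <= outer_min:
        return 0.0
    return (inner_max - outer_min) / (outer_max - outer_min)

full_outer_cost = 8378397.19  # the index scan's total cost from the plan
# With stats claiming the inner max key is tiny, the charged cost shrinks:
frac = merge_outer_scan_fraction(0, 10_000_000, 500)
assert frac * full_outer_cost < 580.24  # consistent with the low join cost
```

In reality the stop point turned out to be near the end of the table, which is why the actual runtime is so far from the estimate.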

regards, tom lane



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Andres Freund
Hi

On October 8, 2018 2:51:15 AM PDT, Pavel Stehule  
wrote:
>Hi
>
>I configured PostgreSQL without JIT support, but JIT is active by
>default
>
>postgres=# show jit;
>┌─┐
>│ jit │
>╞═╡
>│ on  │
>└─┘
>(1 row)

Where is the jit=on coming from? Config from before it was turned off?


>Unfortunately, this combination crashes on some bigger queries.

You probably have the JIT plugin installed from an earlier time, which also 
would explain why it crashes.

Andres

-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: Removing unneeded self joins

2018-10-08 Thread Alexander Kuzmenkov
Here is a rebased version of the patch. It includes some fixes after an 
off-list review by Konstantin Knizhnik.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/nodes/equalfuncs.c b/src/backend/nodes/equalfuncs.c
index 3bb91c9..62d46a1 100644
--- a/src/backend/nodes/equalfuncs.c
+++ b/src/backend/nodes/equalfuncs.c
@@ -166,10 +166,13 @@ _equalVar(const Var *a, const Var *b)
 	COMPARE_SCALAR_FIELD(vartypmod);
 	COMPARE_SCALAR_FIELD(varcollid);
 	COMPARE_SCALAR_FIELD(varlevelsup);
-	COMPARE_SCALAR_FIELD(varnoold);
-	COMPARE_SCALAR_FIELD(varoattno);
 	COMPARE_LOCATION_FIELD(location);
 
+	/*
+	 * varnoold/varoattno are used only for debugging and may differ even
+	 * when the variables are logically the same.
+	 */
+
 	return true;
 }
 
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index f295558..fb07ca0 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -2964,7 +2964,8 @@ ec_member_matches_indexcol(PlannerInfo *root, RelOptInfo *rel,
  * relation_has_unique_index_for
  *	  Determine whether the relation provably has at most one row satisfying
  *	  a set of equality conditions, because the conditions constrain all
- *	  columns of some unique index.
+ *	  columns of some unique index. If index_info is not null, it is set to
+ *	  point to a new UniqueIndexInfo containing the index and conditions.
  *
  * The conditions can be represented in either or both of two ways:
  * 1. A list of RestrictInfo nodes, where the caller has already determined
@@ -2985,7 +2986,8 @@ ec_member_matches_indexcol(PlannerInfo *root, RelOptInfo *rel,
 bool
 relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 			  List *restrictlist,
-			  List *exprlist, List *oprlist)
+			  List *exprlist, List *oprlist,
+			  UniqueIndexInfo **index_info)
 {
 	ListCell   *ic;
 
@@ -3041,6 +3043,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 	{
 		IndexOptInfo *ind = (IndexOptInfo *) lfirst(ic);
 		int			c;
+		List *matched_restrictlist = NIL;
 
 		/*
 		 * If the index is not unique, or not immediately enforced, or if it's
@@ -3089,6 +3092,7 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 				if (match_index_to_operand(rexpr, c, ind))
 				{
 					matched = true; /* column is unique */
+					matched_restrictlist = lappend(matched_restrictlist, rinfo);
 					break;
 				}
 			}
@@ -3131,7 +3135,22 @@ relation_has_unique_index_for(PlannerInfo *root, RelOptInfo *rel,
 
 		/* Matched all columns of this index? */
 		if (c == ind->ncolumns)
+		{
+			if (index_info != NULL)
+			{
+				/* This may be called in GEQO memory context. */
+				MemoryContext oldContext = MemoryContextSwitchTo(root->planner_cxt);
+				*index_info = palloc(sizeof(UniqueIndexInfo));
+				(*index_info)->index = ind;
+				(*index_info)->clauses = list_copy(matched_restrictlist);
+				MemoryContextSwitchTo(oldContext);
+			}
+			if (matched_restrictlist)
+				list_free(matched_restrictlist);
 			return true;
+		}
+		if (matched_restrictlist)
+			list_free(matched_restrictlist);
 	}
 
 	return false;
diff --git a/src/backend/optimizer/path/joinpath.c b/src/backend/optimizer/path/joinpath.c
index 642f951..2edcf22 100644
--- a/src/backend/optimizer/path/joinpath.c
+++ b/src/backend/optimizer/path/joinpath.c
@@ -176,7 +176,8 @@ add_paths_to_joinrel(PlannerInfo *root,
 													innerrel,
 													JOIN_INNER,
 													restrictlist,
-													false);
+													false,
+													NULL /*index_info*/);
 			break;
 		default:
 			extra.inner_unique = innerrel_is_unique(root,
@@ -185,7 +186,8 @@ add_paths_to_joinrel(PlannerInfo *root,
 													innerrel,
 													jointype,
 													restrictlist,
-													false);
+													false,
+													NULL /*index_info*/);
 			break;
 	}
 
diff --git a/src/backend/optimizer/plan/analyzejoins.c b/src/backend/optimizer/plan/analyzejoins.c
index 0e73f9c..2bb3a10 100644
--- a/src/backend/optimizer/plan/analyzejoins.c
+++ b/src/backend/optimizer/plan/analyzejoins.c
@@ -22,12 +22,15 @@
  */
 #include "postgres.h"
 
+#include "catalog/pg_class.h"
 #include "nodes/nodeFuncs.h"
 #include "optimizer/clauses.h"
 #include "optimizer/joininfo.h"
 #include "optimizer/pathnode.h"
 #include "optimizer/paths.h"
 #include "optimizer/planmain.h"
+#include "optimizer/predtest.h"
+#include "optimizer/restrictinfo.h"
 #include "optimizer/tlist.h"
 #include "optimizer/var.h"
 #include "utils/lsyscache.h"
@@ -39,14 +42,15 @@ static void remove_rel_from_query(PlannerInfo *root, int relid,
 static List *remove_rel_from_joinlist(List *joinlist, int relid, int *nremoved);
 static bool rel_supports_distinctness(PlannerInfo *root, RelOptInfo *rel);
 static bool rel_is_distinct_for(PlannerInfo *root, RelOptInfo *rel,
-	List *clause_list);
+	List *clause_list, UniqueIndexInfo 

Re: exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Tom Lane
Andrew Dunstan  writes:
> On 10/08/2018 09:13 AM, Michael Paquier wrote:
>> I had exactly the same thought, but with "git ls-files".  There are
>> still some files which should not be indented, like ppport.h which is
>> generated automatically but still part of the tree.

> That's already explicitly excluded.

Yeah.  Doing this might allow us to simplify pgindent's explicit
blacklist, but I doubt it'd net out to a win speed-wise.

regards, tom lane



Re: chained transactions

2018-10-08 Thread Peter Eisentraut
On 02/10/2018 07:38, Michael Paquier wrote:
> The patch set does not apply anymore, so this patch is moved to next CF,
> waiting on author.

Attached is a rebased patch set.  No functionality changes.
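For readers skimming the patch, the behavior being added can be sketched like this (a rough Python model, not the C implementation; the names are hypothetical): COMMIT/ROLLBACK AND CHAIN saves the three transaction characteristics and re-applies them to the immediately started transaction, per SQL standard.

```python
# Model of transaction chaining: plain COMMIT starts the next transaction
# with session defaults, while COMMIT AND CHAIN carries over the isolation
# level, read-only flag, and deferrable flag of the finished transaction.
DEFAULTS = {"isolation": "read committed", "read_only": False, "deferrable": False}

def commit(current, chain=False):
    saved = dict(current)       # save characteristics before ending the xact
    new_xact = dict(DEFAULTS)   # plain COMMIT: session defaults
    if chain:
        new_xact = saved        # AND CHAIN: restore the saved characteristics
    return new_xact

xact = {"isolation": "serializable", "read_only": True, "deferrable": True}
assert commit(xact, chain=True) == xact       # characteristics carried over
assert commit(xact, chain=False) == DEFAULTS  # ordinary COMMIT resets them
```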

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From e964e279827635feb95df7d6518a6ba9fa55a4df Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 16 Feb 2018 21:37:55 -0500
Subject: [PATCH v2 1/2] Transaction chaining

Add command variants COMMIT AND CHAIN and ROLLBACK AND CHAIN, which
start new transactions with the same transaction characteristics as the
just finished one, per SQL standard.
---
 doc/src/sgml/ref/abort.sgml|  14 ++-
 doc/src/sgml/ref/commit.sgml   |  14 ++-
 doc/src/sgml/ref/end.sgml  |  14 ++-
 doc/src/sgml/ref/rollback.sgml |  14 ++-
 src/backend/access/transam/xact.c  |  73 ++-
 src/backend/catalog/sql_features.txt   |   2 +-
 src/backend/nodes/copyfuncs.c  |   1 +
 src/backend/nodes/equalfuncs.c |   1 +
 src/backend/parser/gram.y  |  19 ++-
 src/backend/tcop/utility.c |   4 +-
 src/include/access/xact.h  |   4 +-
 src/include/nodes/parsenodes.h |   1 +
 src/test/regress/expected/transactions.out | 139 +
 src/test/regress/sql/transactions.sql  |  49 
 14 files changed, 333 insertions(+), 16 deletions(-)

diff --git a/doc/src/sgml/ref/abort.sgml b/doc/src/sgml/ref/abort.sgml
index 21799d2a83..0372913365 100644
--- a/doc/src/sgml/ref/abort.sgml
+++ b/doc/src/sgml/ref/abort.sgml
@@ -21,7 +21,7 @@
 
  
 
-ABORT [ WORK | TRANSACTION ]
+ABORT [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
 
  
 
@@ -51,6 +51,18 @@ Parameters
  
 

+
+   
+AND CHAIN
+
+ 
+  If AND CHAIN is specified, a new transaction is
+  immediately started with the same transaction characteristics (see ) as the just finished one.  Otherwise,
+  no new transaction is started.
+ 
+
+   
   
  
 
diff --git a/doc/src/sgml/ref/commit.sgml b/doc/src/sgml/ref/commit.sgml
index b2e8d5d180..37c706a66f 100644
--- a/doc/src/sgml/ref/commit.sgml
+++ b/doc/src/sgml/ref/commit.sgml
@@ -21,7 +21,7 @@
 
  
 
-COMMIT [ WORK | TRANSACTION ]
+COMMIT [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
 
  
 
@@ -48,6 +48,18 @@ Parameters
  
 

+
+   
+AND CHAIN
+
+ 
+  If AND CHAIN is specified, a new transaction is
+  immediately started with the same transaction characteristics (see ) as the just finished one.  Otherwise,
+  no new transaction is started.
+ 
+
+   
   
  
 
diff --git a/doc/src/sgml/ref/end.sgml b/doc/src/sgml/ref/end.sgml
index 7523315f34..8b8f4f0dbb 100644
--- a/doc/src/sgml/ref/end.sgml
+++ b/doc/src/sgml/ref/end.sgml
@@ -21,7 +21,7 @@
 
  
 
-END [ WORK | TRANSACTION ]
+END [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
 
  
 
@@ -50,6 +50,18 @@ Parameters
  
 

+
+   
+AND CHAIN
+
+ 
+  If AND CHAIN is specified, a new transaction is
+  immediately started with the same transaction characteristics (see ) as the just finished one.  Otherwise,
+  no new transaction is started.
+ 
+
+   
   
  
 
diff --git a/doc/src/sgml/ref/rollback.sgml b/doc/src/sgml/ref/rollback.sgml
index 3cafb848a9..3019273a47 100644
--- a/doc/src/sgml/ref/rollback.sgml
+++ b/doc/src/sgml/ref/rollback.sgml
@@ -21,7 +21,7 @@
 
  
 
-ROLLBACK [ WORK | TRANSACTION ]
+ROLLBACK [ WORK | TRANSACTION ] [ AND [ NO ] CHAIN ]
 
  
 
@@ -47,6 +47,18 @@ Parameters
  
 

+
+   
+AND CHAIN
+
+ 
+  If AND CHAIN is specified, a new transaction is
+  immediately started with the same transaction characteristics (see ) as the just finished one.  Otherwise,
+  no new transaction is started.
+ 
+
+   
   
  
 
diff --git a/src/backend/access/transam/xact.c b/src/backend/access/transam/xact.c
index e3f668bf6a..58565f1f72 100644
--- a/src/backend/access/transam/xact.c
+++ b/src/backend/access/transam/xact.c
@@ -188,6 +188,7 @@ typedef struct TransactionStateData
 	bool		startedInRecovery;	/* did we start in recovery? */
 	bool		didLogXid;		/* has xid been included in WAL record? */
 	int			parallelModeLevel;	/* Enter/ExitParallelMode counter */
+	bool		chain;			/* start a new block after this one */
 	struct TransactionStateData *parent;	/* back link to parent */
 } TransactionStateData;
 
@@ -2746,6 +2747,36 @@ StartTransactionCommand(void)
MemoryContextSwitchTo(CurTransactionContext);
 }
 
+
+/*
+ * Simple system for saving and restoring transaction characteristics
+ * (isolation level, read only, deferrable).  We need this for transaction
+ * chaining, so that we can set the characteristics of the 

merge semi join cost calculation error

2018-10-08 Thread Pavel Stehule
Hi

I am trying to understand the issue at
https://stackoverflow.com/questions/52685384/subquery-performance-on-simple-case

The user sent a plan:

QUERY PLAN
Merge Semi Join  (cost=82.97..580.24 rows=580 width=8) (actual
time=0.503..9557.396 rows=721 loops=1)
  Merge Cond: (tips.users_id = follows.users_id_to)
  ->  Index Scan using tips_idx_users_id01 on tips  (cost=0.43..8378397.19
rows=2491358 width=16) (actual time=0.009..9231.585 rows=2353914 loops=1)
  ->  Sort  (cost=1.77..1.82 rows=22 width=8) (actual time=0.052..0.089
rows=28 loops=1)
Sort Key: follows.users_id_to
Sort Method: quicksort  Memory: 26kB
->  Seq Scan on follows  (cost=0.00..1.27 rows=22 width=8) (actual
time=0.013..0.020 rows=28 loops=1)
  Filter: (users_id_from = 1)

He has PostgreSQL 10.5. I cannot understand the very low total cost of the Merge
Semi Join, because the subnode has a very high cost of 8378397.

I cannot reproduce this case on my machine, so it looks like it might be some
build error. What do you think?

Regards

Pavel


Re: exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Andrew Dunstan




On 10/08/2018 09:13 AM, Michael Paquier wrote:

On Mon, Oct 08, 2018 at 08:33:38AM -0400, Andrew Dunstan wrote:

Seems reasonable - I tend to do vpath builds so this hasn't been a problem
for me ;-)

+1.


I wonder if a more general solution might be a good idea. Say like ignoring
everything pointed to by

     git status --porcelain --ignored

with a status of '??' (untracked) or '!!' (ignored).

I had exactly the same thought, but with "git ls-files".  There are
still some files which should not be indented, like ppport.h which is
generated automatically but still part of the tree.



That's already explicitly excluded.

cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: pgsql: Fix event triggers for partitioned tables

2018-10-08 Thread Alvaro Herrera
On 2018-Oct-08, Alvaro Herrera wrote:

> On 2018-Oct-08, Michael Paquier wrote:
> 
> > The fix is obvious because currentCommand is a pointer and not an Oid.
> > Please see attached.  Should I fix it myself?

Pushed now, thanks.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Tom Lane
Andrew Dunstan  writes:
> On 10/08/2018 05:49 AM, Peter Eisentraut wrote:
>> pgindent spends a long time digging through tmp_check and tmp_install
>> directories and ends up re-indenting installed header files.  How about
>> excluding those directories, like below or directly in the script?

> I wonder if a more general solution might be a good idea. Say like 
> ignoring everything pointed to by
>      git status --porcelain --ignored

Peter's idea sounds better to me.  We don't have that many derived .c
or .h files, so I doubt the "git status" calls would pay for themselves.

(Personally, I don't run pgindent in non-cleaned trees ...)

regards, tom lane



Re: exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 08:33:38AM -0400, Andrew Dunstan wrote:
> Seems reasonable - I tend to do vpath builds so this hasn't been a problem
> for me ;-)

+1.

> I wonder if a more general solution might be a good idea. Say like ignoring
> everything pointed to by
> 
>     git status --porcelain --ignored
> 
> with a status of '??' (untracked) or '!!' (ignored).

I had exactly the same thought, but with "git ls-files".  There are
still some files which should not be indented, like ppport.h which is
generated automatically but still part of the tree.
--
Michael




Re: pgsql: Fix event triggers for partitioned tables

2018-10-08 Thread Alvaro Herrera
On 2018-Oct-08, Michael Paquier wrote:

> Hi Alvaro,
> 
> On Sat, Oct 06, 2018 at 10:18:46PM +, Alvaro Herrera wrote:
> > Fix event triggers for partitioned tables
> > 
> > Index DDL cascading on partitioned tables introduced a way for ALTER
> > TABLE to be called reentrantly.  This caused an important deficiency
> > in event trigger support to be exposed: on exiting the reentrant call,
> > the alter table state object was clobbered, causing a crash when the
> > outer alter table tries to finalize its processing.  Fix the crash by
> > creating a stack of event trigger state objects.  There are still ways
> > to cause things to misbehave (and probably other crashers) with more
> > elaborate tricks, but at least it now doesn't crash in the obvious
> > scenario.
> 
> This commit is producing a warning with my compiler:
> event_trigger.c:1764:9: note: in expansion of macro ‘OidIsValid’
>   Assert(OidIsValid(currentEventTriggerState->currentCommand));
> 
> The fix is obvious because currentCommand is a pointer and not an Oid.
> Please see attached.  Should I fix it myself?

If you have a commit ready, please do, thanks.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Andrew Dunstan




On 10/08/2018 05:49 AM, Peter Eisentraut wrote:

pgindent spends a long time digging through tmp_check and tmp_install
directories and ends up re-indenting installed header files.  How about
excluding those directories, like below or directly in the script?

diff --git a/src/tools/pgindent/exclude_file_patterns
b/src/tools/pgindent/exclude_file_patterns
index 65c42c131d..c8efc9a913 100644
--- a/src/tools/pgindent/exclude_file_patterns
+++ b/src/tools/pgindent/exclude_file_patterns
@@ -6,3 +6,5 @@
  /snowball/libstemmer/
  /pl/plperl/ppport\.h$
  /jit/llvmjit\.h$
+/tmp_check/
+/tmp_install/




Seems reasonable - I tend to do vpath builds so this hasn't been a 
problem for me ;-)


I wonder if a more general solution might be a good idea. Say like 
ignoring everything pointed to by


    git status --porcelain --ignored

with a status of '??' (untracked) or '!!' (ignored).
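A sketch of that idea in Python (parsing only; the two-character status codes follow git's porcelain v1 format, and actually invoking git is left to the caller):

```python
# Derive pgindent exclusions from `git status --porcelain --ignored`
# output, keeping paths whose status is '??' (untracked) or '!!' (ignored).
def paths_to_skip(porcelain_output: str):
    skip = []
    for line in porcelain_output.splitlines():
        if len(line) < 4:
            continue
        status, path = line[:2], line[3:]  # "XY path"
        if status in ('??', '!!'):
            skip.append(path)
    return skip

sample = (
    " M src/backend/nodes/equalfuncs.c\n"   # tracked, modified: keep
    "?? src/test/regress/tmp_check/\n"      # untracked: skip
    "!! tmp_install/\n"                     # ignored: skip
)
assert paths_to_skip(sample) == ["src/test/regress/tmp_check/", "tmp_install/"]
```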


cheers

andrew

--
Andrew Dunstan  https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: now() vs transaction_timestamp()

2018-10-08 Thread Amit Kapila
On Sun, Oct 7, 2018 at 11:12 AM Konstantin Knizhnik
 wrote:
>
> On 07.10.2018 07:58, Amit Kapila wrote:
> > On Sat, Oct 6, 2018 at 9:40 PM Tom Lane  wrote:
> >> Konstantin Knizhnik  writes:
> >>> On 06.10.2018 00:25, Tom Lane wrote:
>  So maybe the right answer is to change the parallel mode infrastructure
>  so it transmits xactStartTimestamp, making transaction_timestamp()
>  retroactively safe, and then in HEAD only we could re-mark now() as
>  safe.  We might as well do the same for statement_timestamp as well.
> >>> Attached please find very small patch fixing the problem (propagating
> >>> transaction and statement timestamps to workers).
> >> That's a bit too small ;-) ... one demonstrable problem with it is
> >> that the parallel worker will report the wrong xactStartTimestamp
> >> to pgstat_report_xact_timestamp(), since you aren't jamming the
> >> transmitted value in soon enough.  Also, I found that ParallelWorkerMain
> >> executes at least two transactions before it ever gets to the "main"
> >> transaction that does real work, and I didn't much care for the fact
> >> that those were running with worker-local values of xactStartTimestamp
> >> and stmtStartTimestamp.  So I rearranged things a bit to ensure that
> >> parallel workers wouldn't generate their own values for either
> >> timestamp, and pushed it.
> >>
> > Currently, we serialize the other transaction related stuff via
> > PARALLEL_KEY_TRANSACTION_STATE.
> Yes, it was my first thought to add serialization of timestamps to
> SerializeTransactionState.
> But it performs serialization into an array of TransactionId, which is
> 32-bit (except PGProEE), and so too small for TimestampTz.

Right, it seems using another format to add timestampTz in that
serialization routine might turn out to be more invasive especially
considering that we have to back-patch this fix.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: WIP: Avoid creation of the free space map for small tables

2018-10-08 Thread John Naylor
On 10/7/18, Tom Lane  wrote:
> John Naylor  writes:
>> On 10/6/18, Thomas Munro  wrote:
>>> On Sat, Oct 6, 2018 at 7:47 AM John Naylor  wrote:
 A while back, Robert Haas noticed that the space taken up by very
 small tables is dominated by the FSM [1]. Tom suggested that we could
 prevent creation of the FSM until the heap has reached a certain
 threshold size [2]. Attached is a WIP patch to implement that.
>
> BTW, don't we need a similar hack for visibility maps?

The FSM is the bigger bang for the buck, and fairly simple to do, but
it would be nice to do something about VMs as well. I'm not sure if
simply lacking a VM would be as simple (or as free of downsides) as
for the FSM. I haven't studied the VM code in detail, however.
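For the FSM case, the arithmetic behind "dominated by the FSM" is simple (assuming the usual 8kB block size, and assuming a minimal FSM allocates three pages for its tree):

```python
# A minimal FSM is 3 pages, so a 1-page heap carries 24kB of FSM
# for 8kB of data.
BLCKSZ = 8192
FSM_MIN_PAGES = 3  # assumption: three FSM pages allocated up front

def total_size_with_fsm(heap_pages: int) -> int:
    return (heap_pages + FSM_MIN_PAGES) * BLCKSZ

def fsm_overhead_ratio(heap_pages: int) -> float:
    return FSM_MIN_PAGES / (heap_pages + FSM_MIN_PAGES)

assert total_size_with_fsm(1) == 4 * 8192  # 32kB on disk for 8kB of data
assert fsm_overhead_ratio(1) == 0.75       # 75% of the relation is FSM
assert fsm_overhead_ratio(1000) < 0.01     # negligible for larger tables
```

This is why a size threshold below which no FSM is created at all pays off for the smallest tables while costing nothing for large ones.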

-John Naylor



Re: SCRAM with channel binding downgrade attack

2018-10-08 Thread Peter Eisentraut
On 05/10/2018 19:01, Bruce Momjian wrote:
> On Fri, Oct  5, 2018 at 04:53:34PM +0200, Peter Eisentraut wrote:
>> On 23/05/2018 08:46, Heikki Linnakangas wrote:
>>> "tls-unique" and "tls-server-end-point" are overly technical to users. 
>>> They don't care which one is used, there's no difference in security. 
>>
>> A question was raised about this in a recent user group meeting.
>>
>> When someone steals the server certificate from the real database server
>> and sets up a MITM with that certificate, this would pass
>> tls-server-end-point channel binding, because both the MITM and the real
>> server have the same certificate.  But with tls-unique they would have
>> different channel binding data, so the channel binding would detect this.
>>
>> Is that not correct?
> 
> Not correct.  First, they need to steal the server certificate and
> _private_ key that goes with the certificate to impersonate the owner of
> the certificate.

Right, I meant to imply that.

> If that happens, with tls-server-end-point, a MITM
> could replay what the real server sends to the MITM.  You are right that
> tls-unique makes it harder for a MITM to reproduce the TLS shared key
> which is mixed with the password hash to prove the server knows the
> password hash.

So you appear to be saying the above *is* correct?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
po 8. 10. 2018 v 12:06 odesílatel Thomas Munro <
thomas.mu...@enterprisedb.com> napsal:

> On Mon, Oct 8, 2018 at 10:52 PM Pavel Stehule 
> wrote:
> >
> > Hi
> >
> > I configured PostgreSQL without JIT support, but JIT is active by default
>
> I think that happens when llvmjit.so is present (ie from last time you
> built with JIT support and ran make install).  You need to remove it.
>

Aha, I'll do it. But it doesn't look like a robust solution :-(.

Maybe clean or distclean should remove this file.

Thank you for info

Pavel


> --
> Thomas Munro
> http://www.enterprisedb.com
>


Re: PostgreSQL 12, JIT defaults

2018-10-08 Thread Thomas Munro
On Mon, Oct 8, 2018 at 10:52 PM Pavel Stehule  wrote:
>
> Hi
>
> I configured PostgreSQL without JIT support, but JIT is active by default

I think that happens when llvmjit.so is present (ie from last time you
built with JIT support and ran make install).  You need to remove it.

--
Thomas Munro
http://www.enterprisedb.com



Re: Postgres 11 release notes

2018-10-08 Thread Adrien NAYRAT

On 9/20/18 8:47 AM, Adrien Nayrat wrote:

On 8/25/18 11:24 PM, Peter Geoghegan wrote:

On Sat, Aug 25, 2018 at 2:18 PM, Bruce Momjian  wrote:

I think that's less "our" logic and more yours, that has become
established because you've done most of the major release notes for a
long time. I'm not trying to say that that's wrong or anything, just


I don't do my work in a vacuum.  My behavior is based on what feedback I
have gotten from the community on what to include/exclude.


FWIW, I agree with what Andres said about planner changes and the release notes.



Hello,

Here is the patch that mentions 1f6d515a67. Note I am not sure about my 
explanation.


Thanks



Hi,

It seems this change has not been added to the release notes? Should I 
change something in my patch?


Thanks,



PostgreSQL 12, JIT defaults

2018-10-08 Thread Pavel Stehule
Hi

I configured PostgreSQL without JIT support, but JIT is active by default

postgres=# show jit;
┌─┐
│ jit │
╞═╡
│ on  │
└─┘
(1 row)

Unfortunately, this combination crashes on some bigger queries.

Regards

Pavel


exclude tmp_check and tmp_install from pgindent

2018-10-08 Thread Peter Eisentraut
pgindent spends a long time digging through tmp_check and tmp_install
directories and ends up re-indenting installed header files.  How about
excluding those directories, like below or directly in the script?

diff --git a/src/tools/pgindent/exclude_file_patterns
b/src/tools/pgindent/exclude_file_patterns
index 65c42c131d..c8efc9a913 100644
--- a/src/tools/pgindent/exclude_file_patterns
+++ b/src/tools/pgindent/exclude_file_patterns
@@ -6,3 +6,5 @@
 /snowball/libstemmer/
 /pl/plperl/ppport\.h$
 /jit/llvmjit\.h$
+/tmp_check/
+/tmp_install/

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Konstantin Knizhnik




On 08.10.2018 12:14, Michael Paquier wrote:

On Mon, Oct 08, 2018 at 12:04:28PM +0300, Konstantin Knizhnik wrote:

The simplest way to fix the problem is to ignore duplicates before adding
them to KnownAssignedXids.
We perform a sort at this point in any case...

I may of course be missing something, but shouldn't we not have
duplicates in the first place?

The reason duplicated XIDs appear in the 2PC case seems clear.
It may be possible to eliminate it by clearing the XID of MyPgxact for 
the prepared transaction.

But there are two problems with it:
1. I am not sure that it will not break something.
2. There is an obvious race condition between adding the GXACT via 
ProcArrayAdd and invalidating the XID of the current transaction.
If it is cleared before calling ProcArrayAdd, then there will be a moment 
when the XID is not present in the procarray.
If it is done after calling ProcArrayAdd, then it is still possible to 
see a duplicated XID in the procarray.


From my point of view it is easier and less invasive to exclude 
duplicates while replaying the RUNNING_XACTS record.





--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: Procedure calls are not tracked in pg_stat_user_functions / track_functions

2018-10-08 Thread Peter Eisentraut
On 05/10/2018 14:15, Peter Eisentraut wrote:
> On 04/10/2018 22:07, Andres Freund wrote:
>> On 2018-10-04 12:15:28 -0700, Lukas Fittl wrote:
>>> Was this intentional, or an oversight?
>>>
>>> If welcome, I would be happy to work on a patch. Whilst slightly confusing
>>> in terms of naming, we could just track this together with functions, since
>>> one can always join with pg_proc to determine whether something is a
>>> function or a procedure.
>>
>> Yea, that sounds wrong / not ideal to me.  I think we should just fix
>> this, should be easy enough.
> 
> Here is a patch.

committed

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Segfault when creating partition with a primary key and sql_drop trigger exists

2018-10-08 Thread Arthur Zakirov

Hello,

On 10/6/18 7:50 PM, Alvaro Herrera wrote:

here's my proposed patch.


There is an incorrect assert condition within 
EventTriggerCollectAlterTableSubcmd(). Maybe it should be like this?


-   Assert(OidIsValid(currentEventTriggerState->currentCommand));
+   Assert(currentEventTriggerState->currentCommand);

--
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 12:04:28PM +0300, Konstantin Knizhnik wrote:
> The simplest way to fix the problem is to ignore duplicates before adding
> them to KnownAssignedXids.
> We perform a sort at this point in any case...

I may of course be missing something, but shouldn't we not have
duplicates in the first place?
--
Michael


signature.asc
Description: PGP signature


Re: pgsql: Fix event triggers for partitioned tables

2018-10-08 Thread Michael Paquier
Hi Alvaro,

On Sat, Oct 06, 2018 at 10:18:46PM +, Alvaro Herrera wrote:
> Fix event triggers for partitioned tables
> 
> Index DDL cascading on partitioned tables introduced a way for ALTER
> TABLE to be called reentrantly.  This caused an important deficiency
> in event trigger support to be exposed: on exiting the reentrant call,
> the alter table state object was clobbered, causing a crash when the
> outer alter table tries to finalize its processing.  Fix the crash by
> creating a stack of event trigger state objects.  There are still ways
> to cause things to misbehave (and probably other crashers) with more
> elaborate tricks, but at least it now doesn't crash in the obvious
> scenario.

This commit is producing a warning with my compiler:
event_trigger.c:1764:9: note: in expansion of macro ‘OidIsValid’
  Assert(OidIsValid(currentEventTriggerState->currentCommand));

The fix is obvious because currentCommand is a pointer and not an Oid.
Please see attached.  Should I fix it myself?
--
Michael
diff --git a/src/backend/commands/event_trigger.c b/src/backend/commands/event_trigger.c
index 2c1dc47541..9a702e4097 100644
--- a/src/backend/commands/event_trigger.c
+++ b/src/backend/commands/event_trigger.c
@@ -1761,7 +1761,7 @@ EventTriggerCollectAlterTableSubcmd(Node *subcmd, ObjectAddress address)
 		return;
 
 	Assert(IsA(subcmd, AlterTableCmd));
-	Assert(OidIsValid(currentEventTriggerState->currentCommand));
+	Assert(currentEventTriggerState->currentCommand != NULL);
 	Assert(OidIsValid(currentEventTriggerState->currentCommand->d.alterTable.objectId));
 
 	oldcxt = MemoryContextSwitchTo(currentEventTriggerState->cxt);


signature.asc
Description: PGP signature


Re: Unclear error message

2018-10-08 Thread Michael Paquier
On Mon, Oct 08, 2018 at 08:40:49AM +0200, Laurenz Albe wrote:
> I'm fine with it.

Thanks, I have pushed this version and back-patched to v11.
--
Michael


signature.asc
Description: PGP signature


Re: out-of-order XID insertion in KnownAssignedXids

2018-10-08 Thread Konstantin Knizhnik



On 05.10.2018 11:04, Michael Paquier wrote:

On Fri, Oct 05, 2018 at 10:06:45AM +0300, Konstantin Knizhnik wrote:

As you can notice, XID 2004495308 is encountered twice, which causes an error in
KnownAssignedXidsAdd:

     if (head > tail &&
         TransactionIdFollowsOrEquals(KnownAssignedXids[head - 1], from_xid))
     {
         KnownAssignedXidsDisplay(LOG);
         elog(ERROR, "out-of-order XID insertion in KnownAssignedXids");
     }

The probability of this error is very small, but it can be reproduced quite
easily: just set a breakpoint in the debugger after the call to
MarkAsPrepared in twophase.c and then try to prepare any transaction.
MarkAsPrepared will add the GXACT to the proc array, and at that moment there
will be two entries in the procarray with the same XID:

[snip]

Now generated RUNNING_XACTS record contains duplicated XIDs.

So, I have been doing exactly that, and if you trigger a manual
checkpoint then things happen quite correctly if you let the first
session finish:
rmgr: Standby len (rec/tot): 58/58, tx:  0, lsn:
0/016150F8, prev 0/01615088, desc: RUNNING_XACTS nextXid 608
latestCompletedXid 605 oldestRunningXid 606; 2 xacts: 607 606

If you still maintain the debugger after calling MarkAsPrepared, then
the manual checkpoint would block.  Now if you actually keep the
debugger, and wait for a checkpoint timeout to happen, then I can see
the incorrect record.  It is impressive that your customer has been able
to see that first, and then that you have been able to get into that
state with simple steps.


I want to ask the community's opinion about the best way to fix this
problem.  Should we avoid storing duplicated XIDs in the procarray (by
invalidating the XID in the original pgxact), or eliminate/change the check
for duplicates in KnownAssignedXidsAdd (for example, just ignore
duplicates)?

Hm...  Please let me think through that first.  It seems to me that
the record should not be generated to begin with.  At least I am able to
confirm what you see.


The simplest way to fix the problem is to ignore duplicates before 
adding them to KnownAssignedXids.

We perform a sort at this point in any case...



--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index bd20497..a2ea233 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -807,9 +807,12 @@ ProcArrayApplyRecoveryInfo(RunningTransactions running)
 		/*
 		 * Add the sorted snapshot into KnownAssignedXids
 		 */
-		for (i = 0; i < nxids; i++)
-			KnownAssignedXidsAdd(xids[i], xids[i], true);
-
+		KnownAssignedXidsAdd(xids[0], xids[0], true);
+		for (i = 1; i < nxids; i++)
+		{
+			if (xids[i] != xids[i-1])
+				KnownAssignedXidsAdd(xids[i], xids[i], true);
+		}
 		KnownAssignedXidsDisplay(trace_recovery(DEBUG3));
 	}
 


Small run-time pruning doc fix

2018-10-08 Thread David Rowley
Before 5220bb7533f, a note in ddl.sgml mentioned that run-time
pruning was only implemented for Append. When we got MergeAppend
support, the commit updated this to mention that MergeAppend is supported
too. This is slightly weird, as it's not all that obvious what exactly
isn't supported when we mention:


 Both of these behaviors are likely to be changed in a future release
 of PostgreSQL.


The attached patch updates this to mention that ModifyTable is
unsupported, which I think makes the above fragment make sense again.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


small_runtime_pruning_doc_fix.patch
Description: Binary data


Re[2]: Alter index rename concurrently to

2018-10-08 Thread Andrey Klychkov
> Attached is an updated patch.

That's OK now; the patch applies without any errors.
I have no more remarks.


>Friday, 5 October 2018, 13:04 +03:00 from Peter Eisentraut 
>:
>
>On 03/10/2018 13:51, Andrey Klychkov wrote:
>> 1. Patch was applied without any errors except a part related to
>> documentation:
>> error: patch failed: doc/src/sgml/ref/alter_index.sgml:50
>> error: doc/src/sgml/ref/alter_index.sgml: patch does not apply
>
>Attached is an updated patch.
>
>> 2. The code has been compiled successfully, configured by:
>> # ./configure CFLAGS="-O0" --enable-debug --enable-cassert
>> --enable-depend --without-zlib
>
>Not sure why you use -O0 here.  It's not a good idea for development,
>because it might miss interesting warnings.
>
>> 7. Code style:
>> +RenameRelationInternal(Oid myrelid, const char *newrelname, bool
>> is_internal, bool is_index)
>> This line is longer than 80 chars.
>
>pgindent leaves this line alone.
>
>-- 
>Peter Eisentraut  http://www.2ndQuadrant.com/
>PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Regards,
Andrey Klychkov


Re: Unclear error message

2018-10-08 Thread Laurenz Albe
Michael Paquier wrote:
> On Sun, Oct 07, 2018 at 05:14:30PM +0900, Michael Paquier wrote:
> > Here is a counter-proposal:
> > "cannot use ONLY for foreign key on partitioned table \"%s\" referencing
> > relation \"%s\""
> > 
> > +-- also, adding a NOT VALID foreign key should fail
> > +ALTER TABLE fk_partitioned_fk ADD FOREIGN KEY (a, b) REFERENCES 
> > fk_notpartitioned_pk NOT VALID;
> > +ERROR:  cannot add NOT VALID foreign key to relation "fk_notpartitioned_pk"
> > +DETAIL:  This feature is not yet supported on partitioned tables.
> > 
> > This error should mention "fk_partitioned_fk", and not
> > "fk_notpartitioned_pk", no?  The foreign key is added to the former, not
> > the latter.
> 
> And after some more study, I finish with the attached.  Thoughts?

I'm fine with it.

"cannot use ONLY for foreign key on partitioned table" has a funny ring
to it (after all, ONLY was used for the table, not the foreign key), but
since I could not come up with anything better, I guess there is just
no entirely satisfactory way to phrase it tersely.

Yours,
Laurenz Albe