Re: [HACKERS] taking stdbool.h into use

2017-10-28 Thread Michael Paquier
On Thu, Oct 26, 2017 at 5:41 PM, Tom Lane  wrote:
> While warnings for this would be lovely, I don't see how we can expect to
> get any.  This is perfectly correct C code no matter whether isprimary
> is C99 bool or is typedef'd to char ... you just end up with different
> values of isprimary, should the RHS produce something other than 1/0.
> The compiler has no way to know that assigning, say, 4 in the char
> variable case is not quite your intent.  Maybe you could hope for a
> warning if the bit value were far enough left to actually not fit into
> "char", but otherwise there's nothing wrong.

This reminded me of
https://www.postgresql.org/message-id/20160212144735.7zkg5527i3un3254%40alap3.anarazel.de
which resulted in commit af4472bc when stdbool.h was used for MSVC
2013/2015 builds. So I would really expect that there are places where
we could see warnings.
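
For illustration, a minimal standalone example of the behavior Tom
describes (not from the original thread):

#include <stdbool.h>
#include <stdio.h>

int
main(void)
{
    int     flags = 0x04;               /* some flag bit other than bit 0 */

    char    as_char = (flags & 0x04);   /* char: stores 4, no normalization */
    bool    as_bool = (flags & 0x04);   /* C99 bool: value normalized to 1 */

    /* Both test true, but only as_char compares unequal to true (1). */
    printf("as_char = %d, as_bool = %d\n", as_char, as_bool);
    return 0;
}

This prints "as_char = 4, as_bool = 1". With the char typedef,
"as_char == true" is false even though the value is logically true -
exactly the kind of silent difference the compiler cannot warn about.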
-- 
Michael




Re: [HACKERS] [bug fix] postgres.exe crashes with access violation on Windows while starting up

2017-10-28 Thread Michael Paquier
On Thu, Oct 26, 2017 at 7:10 PM, Tsunakawa, Takayuki
 wrote:
> FIX
> ==
>
> Add the check "CurrentMemoryContext != NULL" in write_eventlog() as in 
> write_console().

     * Also verify that we are not on our way into error recursion trouble
     * due to error messages thrown deep inside pgwin32_message_to_UTF16().
     */
    if (!in_error_recursion_trouble() &&
+       CurrentMemoryContext != NULL &&
        GetMessageEncoding() != GetACPEncoding())
    {
So you are basically ready to lose any message that could be pushed
here if there is no memory context? That does not sound like a good
trade-off to me. A static buffer does not look like the best idea
either, as it would truncate messages, so couldn't we consider just
using malloc? pgwin32_message_to_UTF16() is called in two places in
elog.c, and we have full control over those error code paths.
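
For illustration, a minimal sketch of the malloc-based idea, assuming
the input is already UTF-8 (the real pgwin32_message_to_UTF16() also
performs an encoding conversion and palloc's its result; the function
name here is hypothetical):

#include <windows.h>
#include <stdlib.h>

/*
 * Hypothetical malloc-based conversion that does not depend on
 * CurrentMemoryContext.  Returns NULL on failure, in which case the
 * caller would fall back to writing the untranslated 8-bit message.
 * The result must be free()d by the caller.
 */
static WCHAR *
message_to_utf16_malloc(const char *str, int len, int *utf16len)
{
    WCHAR      *utf16;
    int         dstlen;

    /* First call only computes the required size, in WCHARs. */
    dstlen = MultiByteToWideChar(CP_UTF8, 0, str, len, NULL, 0);
    if (dstlen <= 0)
        return NULL;

    utf16 = (WCHAR *) malloc(((size_t) dstlen + 1) * sizeof(WCHAR));
    if (utf16 == NULL)
        return NULL;

    MultiByteToWideChar(CP_UTF8, 0, str, len, utf16, dstlen);
    utf16[dstlen] = L'\0';
    if (utf16len)
        *utf16len = dstlen;
    return utf16;
}

write_eventlog() could then free() the buffer after ReportEventW(),
instead of dropping the message entirely when no memory context is up.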

> NOTE
> ==
>
> The reason for not outputting the crash dump is that a) the crash occurred 
> before the Windows exception handler was installed (the 
> pgwin32_install_crashdump_handler() call) and b) the effect of the following 
> call in postmaster is inherited by the child process.
>
> /* In case of general protection fault, don't show GUI popup 
> box */
> SetErrorMode(SEM_FAILCRITICALERRORS | SEM_NOGPFAULTERRORBOX);
>
> But I'm not sure in what order we should do 
> pgwin32_install_crashdump_handler(), startup_hacks() and steps therein, 
> MemoryContextInit().  I think that's another patch.

Perhaps. I don't have a final opinion on this matter.
-- 
Michael




Re: [HACKERS] CUBE seems a bit confused about ORDER BY

2017-10-28 Thread Alexander Korotkov
On Fri, Oct 20, 2017 at 1:01 AM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Fri, Oct 20, 2017 at 12:52 AM, Tomas Vondra <
> tomas.von...@2ndquadrant.com> wrote:
>
>> I've noticed this suspicious behavior of "cube" data type with ORDER BY,
>> which I believe is a bug in the extension (or the GiST index support).
>> The following example comes directly from regression tests added by
>> 33bd250f (so CC Teodor and Stas, who are mentioned in the commit).
>>
>> This query should produce results with ordering "ascending by 2nd
>> coordinate or upper right corner". To make it clear, I've added the
>> "c~>4" expression to the query, otherwise it's right from the test.
>>
>> test=# SELECT c~>4 "c~>4", * FROM test_cube ORDER BY c~>4 LIMIT 15;
>>  c~>4 | c
>> --+---
>>50 | (30333, 50),(30273, 6)
>>75 | (43301, 75),(43227, 43)
>>   142 | (19650, 142),(19630, 51)
>>   160 | (2424, 160),(2424, 81)
>>   171 | (3449, 171),(3354, 108)
>>   155 | (18037, 155),(17941, 109)
>>   208 | (28511, 208),(28479, 114)
>>   217 | (19946, 217),(19941, 118)
>>   191 | (16906, 191),(16816, 139)
>>   187 | (759, 187),(662, 163)
>>   266 | (22684, 266),(22656, 181)
>>   255 | (24423, 255),(24360, 213)
>>   249 | (45989, 249),(45910, 222)
>>   377 | (11399, 377),(11360, 294)
>>   389 | (12162, 389),(12103, 309)
>> (15 rows)
>>
>> As you can see, it's not actually sorted by the c~>4 coordinate (but by
>> c~>2, which is the last number).
>>
>> Moreover, disabling index scans fixes the ordering:
>>
>> test=# set enable_indexscan = off;
>> SET
>> test=# SELECT c~>4, * FROM test_cube ORDER BY c~>4 LIMIT 15; --
>> ascending by 2nd coordinate or upper right corner
>>  ?column? | c
>> --+---
>>50 | (30333, 50),(30273, 6)
>>75 | (43301, 75),(43227, 43)
>>   142 | (19650, 142),(19630, 51)
>>   155 | (18037, 155),(17941, 109)
>>   160 | (2424, 160),(2424, 81)
>>   171 | (3449, 171),(3354, 108)
>>   187 | (759, 187),(662, 163)
>>   191 | (16906, 191),(16816, 139)
>>   208 | (28511, 208),(28479, 114)
>>   217 | (19946, 217),(19941, 118)
>>   249 | (45989, 249),(45910, 222)
>>   255 | (24423, 255),(24360, 213)
>>   266 | (22684, 266),(22656, 181)
>>   367 | (31018, 367),(30946, 333)
>>   377 | (11399, 377),(11360, 294)
>> (15 rows)
>>
>>
>> Seems like a bug somewhere in gist_cube_ops, I guess?
>>
>
> +1,
> that definitely looks like a bug. Thank you for reporting!
> I'll take a look on it in couple days.
>

I've reviewed the code of the ~> operator and its KNN-GiST support.
Unfortunately, it appears to be broken by design...  The explanation is
below.

We have the following implementation of the ~> operator:

    if (coord <= DIM(cube))
    {
        if (IS_POINT(cube))
            PG_RETURN_FLOAT8(cube->x[coord - 1]);
        else
            PG_RETURN_FLOAT8(Min(cube->x[coord - 1],
                                 cube->x[coord - 1 + DIM(cube)]));
    }
    else
    {
        if (IS_POINT(cube))
            PG_RETURN_FLOAT8(cube->x[(coord - 1) % DIM(cube)]);
        else
            PG_RETURN_FLOAT8(Max(cube->x[coord - 1],
                                 cube->x[coord - 1 - DIM(cube)]));
    }


Thus, for a cube of N dimensions, coordinates [1; N] address lower bounds
and [N + 1; 2*N] address upper bounds.  However, we might have indexed
cubes of different dimensions.  For example, for a cube of 2 dimensions
"cube ~> 3" selects the upper bound of the 1st dimension, while for a cube
of 3 dimensions "cube ~> 3" selects the lower bound of the 3rd dimension.

Therefore, although the ~> operator was specifically invented to be
supported by KNN-GiST, it can't serve that purpose this way.

Regarding the particular case discovered by Tomas, I think the error is in
the GiST support code:

    if (strategy == CubeKNNDistanceCoord)
    {
        int     coord = PG_GETARG_INT32(1);

        if (DIM(cube) == 0)
            retval = 0.0;
        else if (IS_POINT(cube))
            retval = cube->x[(coord - 1) % DIM(cube)];
        else
            retval = Min(cube->x[(coord - 1) % DIM(cube)],
                         cube->x[(coord - 1) % DIM(cube) + DIM(cube)]);
    }


g_cube_distance() always returns the lower bound of the cube.  It should
return the upper bound when coord > DIM(cube).
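
For illustration, a hedged sketch of what the corrected branch could look
like, mirroring the Min()/Max() pair of the ~> operator above (not a
committed fix, and the coord <= DIM(cube) test still suffers from the
per-cube dimensionality ambiguity described earlier):

    if (strategy == CubeKNNDistanceCoord)
    {
        int     coord = PG_GETARG_INT32(1);

        if (DIM(cube) == 0)
            retval = 0.0;
        else if (IS_POINT(cube))
            retval = cube->x[(coord - 1) % DIM(cube)];
        else if (coord <= DIM(cube))
            /* lower bound requested: smaller of the two corners */
            retval = Min(cube->x[(coord - 1) % DIM(cube)],
                         cube->x[(coord - 1) % DIM(cube) + DIM(cube)]);
        else
            /* upper bound requested: larger of the two corners */
            retval = Max(cube->x[(coord - 1) % DIM(cube)],
                         cube->x[(coord - 1) % DIM(cube) + DIM(cube)]);
    }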

It would also be nice to provide descending ordering using KNN-GiST, which
would be especially effective for the upper bound.  Since KNN-GiST doesn't
support explicit descending ordering, it might be useful to make the ~>
operator return the negative of the coordinate when a negative argument is
provided.  For instance, '(1,2), (3,4)'::cube ~> -1 would return -1.

I'd like to propose the following way to resolve the design issue:
cube ~> (2*N - 1) should return the lower bound of the Nth coordinate of
the cube, while cube ~> 2*N should return the upper bound of the Nth
coordinate.  Then the meaning would be independent of the number of
coordinates in a particular cube.  For sure, it would be a user-visible
incompatible change, but I think there are not many users of this operator
yet.  Also, the current behavior of ~> seems quite useless.
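
A sketch of how the proposed numbering could be decoded (illustrative
only; cube and coord are as in the operator code above, DIM()/IS_POINT()
come from cubedata.h, and the out-of-range policy is an open question):

    /*
     * Proposed dimension-independent semantics: odd coordinate numbers
     * select lower bounds, even ones select upper bounds, regardless of
     * the cube's own dimensionality.
     *
     *   cube ~> (2*N - 1)  ->  lower bound of the Nth dimension
     *   cube ~> (2*N)      ->  upper bound of the Nth dimension
     */
    int     dim = (coord - 1) / 2;          /* 0-based dimension number */
    bool    want_upper = (coord % 2 == 0);  /* even coord => upper bound */

    if (dim >= DIM(cube))
        PG_RETURN_FLOAT8(0.0);              /* or whatever out-of-range policy */
    else if (IS_POINT(cube))
        PG_RETURN_FLOAT8(cube->x[dim]);
    else if (want_upper)
        PG_RETURN_FLOAT8(Max(cube->x[dim], cube->x[dim + DIM(cube)]));
    else
        PG_RETURN_FLOAT8(Min(cube->x[dim], cube->x[dim + DIM(cube)]));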

Any thoughts?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Implementing pg_receivewal --no-sync

2017-10-28 Thread Michael Paquier
On Fri, Oct 27, 2017 at 12:03 AM, Michael Paquier
 wrote:
> On Thu, Oct 26, 2017 at 10:46 PM, Robert Haas  wrote:
>> On Wed, Oct 25, 2017 at 10:03 PM, Michael Paquier
>>  wrote:
>>> This sentence is actually wrong, a feedback message is never sent with
>>> the feedback message.
>>
>> Eh?
>
> "A feedback message is never sent depending on the status interval".
>
>> I think this looks basically fine, though I'd omit the short option
>> for it.  There are only so many letters in the alphabet, so let's not
>> use them up for developer-convenience options.
>
> No objections to that.

Okay. Here is an updated patch incorporating those comments.
-- 
Michael


pg_receivewal_nosync_v3.patch
Description: Binary data



Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-10-28 Thread Pavel Stehule
2017-10-28 23:35 GMT+02:00 Alexander Korotkov :

> On Sat, Oct 28, 2017 at 3:46 PM, Pavel Stehule 
> wrote:
>
>> 2017-09-22 21:31 GMT+02:00 Pavel Stehule :
>>
>>>
>>>
>>> 2017-09-22 21:12 GMT+02:00 Peter Eisentraut <
>>> peter.eisentr...@2ndquadrant.com>:
>>>
 On 9/22/17 09:16, Pavel Stehule wrote:
 > Example: somebody set SORT_COLUMNS to schema_name value. This is
 > nonsense for \l command
 >
 > Now, I am thinking so more correct and practical design is based on
 > special mode, activated by variable
 >
 > PREFER_SIZE_SORT .. (off, asc, desc)
 >
 > This has sense for wide group of commands that can show size. And when
 > size is not visible, then this option is not active.

 Maybe this shouldn't be a variable at all.  It's not like you'll set
 this as a global preference.  You probably want it for one command only.
  So a per-command option might make more sense.

>>>
>>> Sure, I cannot know what users will do. But when I need to see the
>>> sizes of objects, I prefer the sort by size descending every time. If I
>>> need to find some object, I can use search in the pager. So in my case,
>>> these settings will be in psqlrc. In GoodData we used our own
>>> customization for years - the order by size was hardcoded and nobody
>>> reported any issue to me.
>>>
>>> Alexander proposed a per-command option, but the current syntax of psql
>>> commands doesn't allow simple parametrization. If it is to be user
>>> friendly, it should be short. From an implementation perspective, it
>>> should be simple to parse. It should be intuitive too - too many
>>> symbols together is not a good idea.
>>>
>>> Maybe some prefix design - but that is not a design for common people
>>> (although those people don't usually use psql)
>>>
>>> '\sort size \dt ?
>>>
>>> \dt:sort_by_size
>>> \dt+:sort_by_size ?
>>>
>>> I don't see any good design in this direction
>>>
>>>
>> I thought about Alexander's proposal, and I think it is probably best if
>> we respect the psql design. I implemented two command suffixes (supported
>> only where they make sense): "s" for sorted by size and "d" for descending.
>>
>> So a list of tables can be sorted with commands:
>>
>> \dt+sd (the order of the suffixes is not strict), so the command
>> \dtsd+ works too (as do \disd+ and \di+sd)
>>
>> These two chars are acceptable. The same principle is used for the \l
>> command:
>>
>> \lsd+ or \l+sd
>>
>> What do you think about it?
>>
>
> I think \lsd+ command would be another postgres meme :)
> BTW, are you going to provide an ability to sort by name, schema?
>

That makes sense only for tables - probably only \dtn, with "n" for name



> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-10-28 Thread Alexander Korotkov
On Sat, Oct 28, 2017 at 3:46 PM, Pavel Stehule 
wrote:

> 2017-09-22 21:31 GMT+02:00 Pavel Stehule :
>
>>
>>
>> 2017-09-22 21:12 GMT+02:00 Peter Eisentraut <
>> peter.eisentr...@2ndquadrant.com>:
>>
>>> On 9/22/17 09:16, Pavel Stehule wrote:
>>> > Example: somebody set SORT_COLUMNS to schema_name value. This is
>>> > nonsense for \l command
>>> >
>>> > Now, I am thinking so more correct and practical design is based on
>>> > special mode, activated by variable
>>> >
>>> > PREFER_SIZE_SORT .. (off, asc, desc)
>>> >
>>> > This has sense for wide group of commands that can show size. And when
>>> > size is not visible, then this option is not active.
>>>
>>> Maybe this shouldn't be a variable at all.  It's not like you'll set
>>> this as a global preference.  You probably want it for one command only.
>>>  So a per-command option might make more sense.
>>>
>>
>> Sure, I cannot know what users will do. But when I need to see the
>> sizes of objects, I prefer the sort by size descending every time. If I
>> need to find some object, I can use search in the pager. So in my case,
>> these settings will be in psqlrc. In GoodData we used our own
>> customization for years - the order by size was hardcoded and nobody
>> reported any issue to me.
>>
>> Alexander proposed a per-command option, but the current syntax of psql
>> commands doesn't allow simple parametrization. If it is to be user
>> friendly, it should be short. From an implementation perspective, it
>> should be simple to parse. It should be intuitive too - too many
>> symbols together is not a good idea.
>>
>> Maybe some prefix design - but that is not a design for common people
>> (although those people don't usually use psql)
>>
>> '\sort size \dt ?
>>
>> \dt:sort_by_size
>> \dt+:sort_by_size ?
>>
>> I don't see any good design in this direction
>>
>>
> I thought about Alexander's proposal, and I think it is probably best if
> we respect the psql design. I implemented two command suffixes (supported
> only where they make sense): "s" for sorted by size and "d" for descending.
>
> So a list of tables can be sorted with commands:
>
> \dt+sd (the order of the suffixes is not strict), so the command
> \dtsd+ works too (as do \disd+ and \di+sd)
>
> These two chars are acceptable. The same principle is used for the \l
> command:
>
> \lsd+ or \l+sd
>
> What do you think about it?
>

I think \lsd+ command would be another postgres meme :)
BTW, are you going to provide an ability to sort by name, schema?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Index only scan for cube and seg

2017-10-28 Thread Alexander Korotkov
On Fri, Oct 27, 2017 at 9:54 PM, Tom Lane  wrote:

> Robert Haas  writes:
> > On Thu, Oct 26, 2017 at 12:22 PM, Andrey Borodin 
> wrote:
> >> For cube there is new default opclass.
>
> > I seem to recall that changing the default opclass causes unsolvable
> > problems with upgrades.  You might want to check the archives for
> > previous discussions of this issue; unfortunately, I don't recall the
> > details off-hand.
>
> Quite aside from that, replacing the opclass with a new one creates
> user-visible headaches that I don't think are justified, i.e. having to
> reconstruct indexes in order to get the benefit.
>
> Maybe I'm missing something, but ISTM you could just drop the compress
> function and call it good.  This would mean that an IOS scan would
> sometimes hand back a toast-compressed value, but what's the problem
> with that?
>

+1,
I think in this case replacing the default opclass, or even duplicating
the opclass, isn't justified.

(The only reason for making a decompress function that just detoasts
> is that your other support functions then do not have to consider
> the possibility that they're handed a toast-compressed value.  I have
> not checked whether that really matters for cube.)
>

As far as I can see, the cube GiST code always uses the DatumGetNDBOX()
macro to transform a Datum to (NDBOX *):

#define DatumGetNDBOX(x) ((NDBOX *) PG_DETOAST_DATUM(x))

Thus, it should be safe to just remove both the compress and decompress
methods from the existing opclass.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] ALTER COLUMN TYPE vs. domain constraints

2017-10-28 Thread Michael Paquier
On Fri, Oct 27, 2017 at 11:15 AM, Tom Lane  wrote:
> We could consider back-patching the attached to cover this, but
> I'm not entirely sure it's worth the trouble, because I haven't
> thought of any non-silly use-cases in the absence of domains
> over composite.  Comments?

There are no real complaints about the current behavior, are there?
So patching only HEAD seems enough to me.

+comment on constraint c1 on domain dcomptype is 'random commentary';
[...]
+alter type comptype alter attribute r type bigint;
You have added a comment on the constraint to make sure that it survives
this ALTER TYPE. Querying pg_obj_description would make sure that the
comment on the constraint is still there.

+static void
+RebuildDomainConstraintComment(AlteredTableInfo *tab, int pass, Oid objid,
+  List *domname, char *conname)
There is much duplication with RebuildConstraintComment. Why not group
both under, say, RebuildObjectComment()? I would think about having
cmd->objtype and cmd->object passed as arguments, and then removing rel
and domname from the existing arguments.

[nit]
foreach(lcmd, subcmds)
-   ATExecCmd(wqueue, tab, rel, (AlterTableCmd *)
lfirst(lcmd), lockmode);
+   ATExecCmd(wqueue, tab, rel,
+ castNode(AlterTableCmd, lfirst(lcmd)),
+ lockmode);
This does not really belong in this patch... No objections to grouping things, though.
[/nit]
-- 
Michael




Re: [HACKERS] Current int & float overflow checking is slow.

2017-10-28 Thread Andres Freund
On 2017-10-24 15:28:17 -0400, Tom Lane wrote:
> Also, if I recall the old discussion properly, one concern was getting
> uniform behavior across different platforms.  I'm worried that if we do
> what Andres suggests, we'll get behavior that is not only different but
> platform-specific.  Now, to the extent that you believe that every modern
> platform implements edge-case IEEE float behavior the same way, that worry
> may be obsolete.  But I don't think I believe that.

Hm. Is the current code actually meaningfully less dependent on IEEE
float behaviour? Both with the current behaviour and with the
alternative of not ereporting, we rely on infinity op something resulting
in infinity.  Given that we're not preventing underflows, imprecise
results, or denormals from continuing to be used, I don't think we're
avoiding edge cases effectively at the moment.

I just spent the last hours digging through Intel's architecture
manual. And discovered way too much weird stuff :/.

There indeed doesn't really seem to be any decent way to implement the
overflow checks in an efficient manner. Clearing & testing the SSE
floating point control register, which contains the overflow bit, costs
~10 cycles each. As for the way gcc implements the isinf() check as a
bunch of compares and bit-twiddling with constants - I don't see how to
beat that.


Btw, looking at this code I noticed that the current error messages
aren't meaningful:

=# SELECT '-1e38'::float4  + '-3e38'::float4;
ERROR:  22003: value out of range: overflow


The current code gets slightly better if I put an unlikely() around just
the isinf(val) test in CHECKFLOATVAL.
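
For reference, that change amounts to something like the following,
sketched against the CHECKFLOATVAL() macro in
src/backend/utils/adt/float.c of this era (not the literal committed
text):

#define CHECKFLOATVAL(val, inf_is_valid, zero_is_valid)         \
do {                                                            \
    /* unlikely() keeps the error path off the hot path */      \
    if (unlikely(isinf(val)) && !(inf_is_valid))                \
        ereport(ERROR,                                          \
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),   \
                 errmsg("value out of range: overflow")));      \
                                                                \
    if ((val) == 0.0 && !(zero_is_valid))                       \
        ereport(ERROR,                                          \
                (errcode(ERRCODE_NUMERIC_VALUE_OUT_OF_RANGE),   \
                 errmsg("value out of range: underflow")));     \
} while (0)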

Greetings,

Andres Freund




Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-28 Thread Peter Geoghegan
On Sat, Oct 28, 2017 at 12:49 PM, Simon Riggs  wrote:
> Nothing I am proposing blocks later work.

Actually, many things will block future work if you go down that road.
You didn't respond to the specific points I raised, but that doesn't
mean that they're not real.

> Everything you say makes it clear that a fully generalized solution is
> going to be many years in the making, assuming we agree.

I think that it's formally impossible as long as you preserve the ON
CONFLICT guarantees, unless you somehow define the problems out of
existence. Those are guarantees which no other MERGE implementation
has ever made, and which the SQL standard says nothing about. And for
good reasons.

> "The extent to which an SQL-implementation may disallow independent
> changes that are not significant is implementation-defined”.
>
> So we get to choose. I recommend that we choose something practical.
> We're approaching the 10 year anniversary of my first serious attempt
> to do MERGE. I say that its time to move forwards with useful
> solutions, rather than wait another 10 years for the perfect one, even
> assuming it exists.

As far as I'm concerned, you're the one arguing for an unobtainable
solution over a good one, not me. I *don't* think you should solve the
problems that I raise -- you should instead implement MERGE without
any of the ON CONFLICT guarantees, just like everyone else has.
Building MERGE on top of the ON CONFLICT guarantees, and ultimately
arriving at something that is comparable to other implementations over
many releases might be okay if anyone had the slightest idea of what
that would look like. You haven't even _described the semantics_,
which you could do by addressing the specific points that I raised.

-- 
Peter Geoghegan




Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-28 Thread Peter Geoghegan
On Fri, Oct 27, 2017 at 3:00 PM, Serge Rielau  wrote:
>> What other systems *do* have this restriction? I've never seen one that did.
>
> Not clear what you are leading up to here.
> When I did MERGE in DB2 there was also no limitation:
> "Each row in the target can only be operated on once. A row in the target can 
> only be identified as MATCHED with one row in the result table of the 
> table-reference”
> What there was however was a significant amount of code I had to write and 
> test to enforce the above second sentence.

Then it seems that we were talking about two different things all along.

> Maybe in PG there is a trivial way to detect an expanding join and block it 
> at runtime.

There is for ON CONFLICT. See the cardinality violation logic within
ExecOnConflictUpdate(). (There are esoteric cases where this error can
be raised due to a wCTE that does an insert "from afar", which is
theoretically undesirable but not actually a problem.)

The MERGE implementation that I have in mind would probably do almost
the same thing, and make the "HeapTupleSelfUpdated" case within
ExecUpdate() raise an error when the caller happened to be a MERGE,
rather than following the historic UPDATE behavior. (The behavior is
to silently suppress a second or subsequent UPDATE attempt from the
same command, a behavior that Simon's mock MERGE documentation
references.)

> So the whole point I’m trying to make is that I haven’t seen the need for the 
> extra work I had to do once the feature appeared in the wild.

That seems pretty reasonable to me.

My whole point is that I think it's a mistake to do things like lock
rows ahead of evaluating any UPDATE predicate, in the style of ON
CONFLICT, in order to replicate the ON CONFLICT guarantees [1].

I'm arguing for implementation simplicity, too. Trying to implement
MERGE in a way that extends ON CONFLICT seems like a big mistake to
me, because ON CONFLICT updates rows on the basis of a would-be
duplicate violation, along with all the baggage that that carries.
This is actually enormously different to an equi-join that is fed by a
scan using an MVCC snapshot. The main difference is that there
actually is no MVCC snapshot in play in most cases [2]. If *no* row
with the PK value of 5 is visible to our MVCC snapshot, but an xact
committed having inserted such a row, that still counts as a CONFLICT
with READ COMMITTED.

[1] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_implementation
[2] 
https://www.postgresql.org/docs/devel/static/transaction-iso.html#xact-read-committed
-- 
Peter Geoghegan




Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-28 Thread Simon Riggs
On 28 October 2017 at 20:39, Peter Geoghegan  wrote:
> On Sat, Oct 28, 2017 at 3:10 AM, Simon Riggs  wrote:
>> SQL:2011 specifically states "The extent to which an
>> SQL-implementation may disallow independent changes that are not
>> significant is implementation-defined”, so in my reading the above
>> behaviour would make us fully spec compliant. Thank you to Peter for
>> providing the infrastructure on which this is now possible for PG11.
>>
>> Serge puts this very nicely by identifying two different use cases for MERGE.
>
> MERGE benefits from having a join that is more or less implemented in
> the same way as any other join. It can be a merge join, hash join, or
> nestloop join. ON CONFLICT doesn't work using a join.
>
> Should I to take it that you won't be supporting any of these
> alternative join algorithms? If not, then you'll have something that
> really isn't comparable to MERGE as implemented in Oracle, SQL Server,
> or DB2. They *all* do this.
>
> Would the user be able to omit WHEN NOT MATCHED/INSERT, as is the case
> with every existing MERGE implementation? If so, what actually happens
> under the hood when WHEN NOT MATCHED is omitted? For example, would
> you actually use a regular "UPDATE FROM" style join, as opposed to the
> ON CONFLICT infrastructure? And, if that is so, can you justify the
> semantic difference for rows that are updated in each scenario
> (omitted vs. not omitted) in READ COMMITTED mode? Note that this could
> be the difference between updating a row when *no* version is visible
> to our MVCC snapshot, as opposed to doing the EPQ stuff and updating
> the latest row version if possible. That's a huge, surprising
> difference. On top of all this, you risk live-lock if INSERT isn't a
> possible outcome (this is also why ON CONFLICT can never accept a
> predicate on its INSERT portion -- again, quite unlike MERGE).
>
> Why not just follow what other systems do? It's actually easier to go
> that way, and you get a better outcome. ON CONFLICT involves what you
> could call a sleight of hand, and I fear that you don't appreciate
> just how specialized the internal infrastructure is.
>
>> Now, I accept that you might also want a MERGE statement that
>> continues to work even if there is no unique constraint, but it would
>> need to have different properties to the above. I do not in any way
>> argue against adding that.
>
> Maybe you *should* be arguing against it, though, and arguing against
> ever supporting anything but equijoins, because these things will
> *become* impossible if you go down that road. By starting with the ON
> CONFLICT infrastructure, while framing no-unique-index-support as work
> for some unspecified future release, you're leaving it up to someone
> else to resolve the problems. Someone else must square the circle of
> mixing ON CONFLICT semantics with fully generalized MERGE semantics.
> But who?

Nothing I am proposing blocks later work.

Everything you say makes it clear that a fully generalized solution is
going to be many years in the making, assuming we agree.

"The extent to which an SQL-implementation may disallow independent
changes that are not significant is implementation-defined”.

So we get to choose. I recommend that we choose something practical.
We're approaching the 10 year anniversary of my first serious attempt
to do MERGE. I say that its time to move forwards with useful
solutions, rather than wait another 10 years for the perfect one, even
assuming it exists.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-28 Thread Peter Geoghegan
On Sat, Oct 28, 2017 at 3:10 AM, Simon Riggs  wrote:
> SQL:2011 specifically states "The extent to which an
> SQL-implementation may disallow independent changes that are not
> significant is implementation-defined”, so in my reading the above
> behaviour would make us fully spec compliant. Thank you to Peter for
> providing the infrastructure on which this is now possible for PG11.
>
> Serge puts this very nicely by identifying two different use cases for MERGE.

MERGE benefits from having a join that is more or less implemented in
the same way as any other join. It can be a merge join, hash join, or
nestloop join. ON CONFLICT doesn't work using a join.

Should I to take it that you won't be supporting any of these
alternative join algorithms? If not, then you'll have something that
really isn't comparable to MERGE as implemented in Oracle, SQL Server,
or DB2. They *all* do this.

Would the user be able to omit WHEN NOT MATCHED/INSERT, as is the case
with every existing MERGE implementation? If so, what actually happens
under the hood when WHEN NOT MATCHED is omitted? For example, would
you actually use a regular "UPDATE FROM" style join, as opposed to the
ON CONFLICT infrastructure? And, if that is so, can you justify the
semantic difference for rows that are updated in each scenario
(omitted vs. not omitted) in READ COMMITTED mode? Note that this could
be the difference between updating a row when *no* version is visible
to our MVCC snapshot, as opposed to doing the EPQ stuff and updating
the latest row version if possible. That's a huge, surprising
difference. On top of all this, you risk live-lock if INSERT isn't a
possible outcome (this is also why ON CONFLICT can never accept a
predicate on its INSERT portion -- again, quite unlike MERGE).

Why not just follow what other systems do? It's actually easier to go
that way, and you get a better outcome. ON CONFLICT involves what you
could call a sleight of hand, and I fear that you don't appreciate
just how specialized the internal infrastructure is.

> Now, I accept that you might also want a MERGE statement that
> continues to work even if there is no unique constraint, but it would
> need to have different properties to the above. I do not in any way
> argue against adding that.

Maybe you *should* be arguing against it, though, and arguing against
ever supporting anything but equijoins, because these things will
*become* impossible if you go down that road. By starting with the ON
CONFLICT infrastructure, while framing no-unique-index-support as work
for some unspecified future release, you're leaving it up to someone
else to resolve the problems. Someone else must square the circle of
mixing ON CONFLICT semantics with fully generalized MERGE semantics.
But who?

-- 
Peter Geoghegan




Re: [HACKERS] unsafe tuple releasing in get_default_partition_oid

2017-10-28 Thread Julien Rouhaud
On Sat, Oct 28, 2017 at 11:13 AM, Robert Haas  wrote:
> On Sat, Oct 28, 2017 at 10:03 AM, Julien Rouhaud  wrote:
>> I just noticed that get_default_partition_oid() tries to release the
>> tuple even if it isn't valid.
>> Trivial patch attached.
>
> Oops.  I wonder how that managed to survive testing.
>
> Committed.

Thanks!




Re: [HACKERS] proposal: schema variables

2017-10-28 Thread Pavel Stehule
Hi

2017-10-28 16:24 GMT+02:00 Chris Travers :

>
>
> On Thu, Oct 26, 2017 at 9:21 AM, Pavel Stehule 
> wrote:
>
>> Hi,
>>
>> I propose a new database object - a variable. A variable is a persistent
>> object that holds an unshared, session-based, non-transactional, in-memory
>> value of any type, like variables in other languages. The persistence is
>> required to make static checks possible, but it can be limited to a
>> session - variables can be temporary.
>>
>> My proposal is related to session variables from Sybase, MSSQL or MySQL
>> (based on the @ or @@ prefix), package variables from Oracle (access is
>> controlled by scope), and schema variables from DB2. Each design comes
>> from different sources and traditions and has its advantages and
>> disadvantages. The basis of my proposal is the use of schema variables as
>> session variables for stored procedures. It should help people who try to
>> port complex projects to PostgreSQL from other databases.
>>
>> The Sybase (T-SQL) design is good for interactive work, but it is weak
>> for usage in stored procedures - static checks are not possible, and it
>> is not possible to set access rights on variables.
>>
>> The ADA design (used by Oracle), based on scope, is great, but our
>> environment is not nested. And we should support PLs other than PL/pgSQL
>> more strongly.
>>
>> There are not many other possibilities - a variable that should be
>> accessed from different PLs and different procedures (over time) should
>> live somewhere above the PLs, and that is the schema.
>>
>> A variable can be created by a CREATE statement:
>>
>> CREATE VARIABLE public.myvar AS integer;
>> CREATE VARIABLE myschema.myvar AS mytype;
>>
>> CREATE [TEMP] VARIABLE [IF NOT EXISTS] name AS type
>>   [ DEFAULT expression ] [[NOT] NULL]
>>   [ ON TRANSACTION END { RESET | DROP } ]
>>   [ { VOLATILE | STABLE } ];
>>
>> It is dropped by the command DROP VARIABLE [IF EXISTS] varname.
>>
>> Access rights are controlled by the usual access rights - by the commands
>> GRANT/REVOKE. The possible rights are: READ, WRITE.
>>
>> Variables can be modified by the SQL command SET (this is taken from the
>> standard, and it is natural):
>>
>> SET varname = expression;
>>
>> Unfortunately we use the SET command for a different purpose. But I think
>> we can solve this with a few tricks. The first is moving our GUCs to the
>> pg_catalog schema. We can control the strictness of the SET command: in
>> one variant we can detect custom GUCs and allow them, in another we can
>> disallow custom GUCs and allow only schema variables. A new command LET
>> could be an alternative.
>>
>> Variables should be usable in queries implicitly (without JOIN):
>>
>> SELECT varname;
>>
>> The SEARCH_PATH is used when varname is looked up. Variables can be used
>> everywhere query parameters are allowed.
>>
>> I hope this proposal is good enough and simple.
>>
>> Comments, notes?
>>
>
>
> I have a question on this.  Since one can issue set commands on arbitrary
> settings (and later ALTER database/role/system on settings you have created
> in the current session) I am wondering how much overlap there is between a
> sort of extended GUC with custom settings and variables.
>
> Maybe it would be simpler to treat variables and GUC settings to be
> similar and see what can be done to extend GUC in this way?
>
> I mean if instead we allowed restricting SET to known settings then we
> could have a CREATE SETTING command which would behave like this and then
> use SET the same way across both.
>
> In essence I am wondering if this really needs to be as separate from GUC
> as you are proposing.
>
> If done this way then:
>
> 1.  You could issue grant or revoke on GUC settings, allowing some users
> but not others to set things like work_mem for their queries
> 2.  You could specify allowed types in custom settings.
> 3.  In a subsequent stage you might be able to SELECT ... INTO
> setting_name FROM ...;  allowing setting writes based on queries.
>
>
Creating the database objects and the necessary infrastructure is the
simplest task of this project. I'd be happier if there were zero
intersection, because variables and GUCs are designed for different
purposes. But due to the SET keyword, an intersection exists.

When I think about it, I have only one, but important, reason why I prefer
designing a new type of database object: GUCs are stack based, with
different default granularity - global, database, user, session, function.
This can be unwanted behavior for variables - it can be a source of
hard-to-detect bugs. I am afraid this behavior would be too messy for
usage as variables.

@1 I don't have a clear opinion on it - I am not sure rights alone are good
enough - probably some user limits could be more practical - but it can be
hard to choose the result when user limits and GUCs conflict
@2 With typed variables, custom GUCs are not necessary
@3 Why do you need it? It is possible with 

Re: [HACKERS] Parallel safety for extern params

2017-10-28 Thread Amit Kapila
On Sat, Oct 28, 2017 at 2:02 AM, Robert Haas  wrote:
> On Mon, Oct 16, 2017 at 12:59 PM, Amit Kapila  wrote:
> When I tried back-porting the patch to v10 I discovered that the
> plpgsql changes conflict heavily and that ripping them out all the way
> produces regression failures under force_parallel_mode.  I think I see
> why that's happening but it's not obvious how to me how to adapt
> b73f1b5c29e0ace5a281bc13ce09dea30e1b66de to the v10 code.  Do you want
> to have a crack at it or should we just leave this as a master-only
> fix?
>

I think we need to make changes in exec_simple_recheck_plan to make
the behavior similar to HEAD.  With the attached patch, all tests
passed with force_parallel_mode.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


fix_parallel_safety_for_extern_params_v2.patch
Description: Binary data



Re: [HACKERS] proposal: schema variables

2017-10-28 Thread Chris Travers
On Thu, Oct 26, 2017 at 9:21 AM, Pavel Stehule 
wrote:

> Hi,
>
> I propose a new database object - a variable. A variable is a persistent
> object that holds an unshared, session-based, non-transactional, in-memory
> value of any type, like variables in other languages. The persistence is
> required to make static checks possible, but it can be limited to a
> session - variables can be temporary.
>
> My proposal is related to session variables from Sybase, MSSQL or MySQL
> (based on the @ or @@ prefix), package variables from Oracle (access is
> controlled by scope), and schema variables from DB2. Each design comes
> from different sources and traditions and has its advantages and
> disadvantages. The basis of my proposal is the use of schema variables as
> session variables for stored procedures. It should help people who try to
> port complex projects to PostgreSQL from other databases.
>
> The Sybase (T-SQL) design is good for interactive work, but it is weak
> for usage in stored procedures - static checks are not possible, and it
> is not possible to set access rights on variables.
>
> The ADA design (used by Oracle), based on scope, is great, but our
> environment is not nested. And we should support PLs other than PL/pgSQL
> more strongly.
>
> There are not many other possibilities - a variable that should be
> accessed from different PLs and different procedures (over time) should
> live somewhere above the PLs, and that is the schema.
>
> A variable can be created by a CREATE statement:
>
> CREATE VARIABLE public.myvar AS integer;
> CREATE VARIABLE myschema.myvar AS mytype;
>
> CREATE [TEMP] VARIABLE [IF NOT EXISTS] name AS type
>   [ DEFAULT expression ] [[NOT] NULL]
>   [ ON TRANSACTION END { RESET | DROP } ]
>   [ { VOLATILE | STABLE } ];
>
> It is dropped by the command DROP VARIABLE [IF EXISTS] varname.
>
> Access rights are controlled by the usual access rights - by the commands
> GRANT/REVOKE. The possible rights are: READ, WRITE.
>
> Variables can be modified by the SQL command SET (this is taken from the
> standard, and it is natural):
>
> SET varname = expression;
>
> Unfortunately we use the SET command for a different purpose. But I think
> we can solve this with a few tricks. The first is moving our GUCs to the
> pg_catalog schema. We can control the strictness of the SET command: in
> one variant we can detect custom GUCs and allow them, in another we can
> disallow custom GUCs and allow only schema variables. A new command LET
> could be an alternative.
>
> Variables should be usable in queries implicitly (without JOIN):
>
> SELECT varname;
>
> The SEARCH_PATH is used when varname is looked up. Variables can be used
> everywhere query parameters are allowed.
>
> I hope this proposal is good enough and simple.
>
> Comments, notes?
>


I have a question on this.  Since one can issue set commands on arbitrary
settings (and later ALTER database/role/system on settings you have created
in the current session) I am wondering how much overlap there is between a
sort of extended GUC with custom settings and variables.

Maybe it would be simpler to treat variables and GUC settings to be similar
and see what can be done to extend GUC in this way?

I mean if instead we allowed restricting SET to known settings then we
could have a CREATE SETTING command which would behave like this and then
use SET the same way across both.

In essence I am wondering if this really needs to be as separate from GUC
as you are proposing.

If done this way then:

1.  You could issue grant or revoke on GUC settings, allowing some users
but not others to set things like work_mem for their queries
2.  You could specify allowed types in custom settings.
3.  In a subsequent stage you might be able to SELECT ... INTO
setting_name FROM ...;  allowing setting writes based on queries.



> regards
>
> Pavel
>
>
>


-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-10-28 Thread Pavel Stehule
Hi

2017-09-22 21:31 GMT+02:00 Pavel Stehule :

>
>
> 2017-09-22 21:12 GMT+02:00 Peter Eisentraut <peter.eisentr...@2ndquadrant.com>:
>
>> On 9/22/17 09:16, Pavel Stehule wrote:
>> > Example: somebody set SORT_COLUMNS to schema_name value. This is
>> > nonsense for \l command
>> >
>> > Now, I am thinking so more correct and practical design is based on
>> > special mode, activated by variable
>> >
>> > PREFER_SIZE_SORT .. (off, asc, desc)
>> >
>> > This has sense for wide group of commands that can show size. And when
>> > size is not visible, then this option is not active.
>>
>> Maybe this shouldn't be a variable at all.  It's not like you'll set
>> this as a global preference.  You probably want it for one command only.
>>  So a per-command option might make more sense.
>>
>
> Sure, I cannot know what users will do. But when I need to see the sizes
> of objects, I prefer the sort by size descending every time. If I need to
> find some object, I can use search in the pager. So in my case, these
> settings will be in psqlrc. In GoodData we used our own customization for
> years - the order by size was hardcoded and nobody reported any issue
> to me.
>
> Alexander proposed a per-command option, but the current syntax of psql
> commands doesn't allow simple parametrization. If it is to be user
> friendly, it should be short. From an implementation perspective, it
> should be simple to parse. It should be intuitive too - too many symbols
> together is not a good idea.
>
> Maybe some prefix design - but that is not a design for common people
> (although those people don't usually use psql)
>
> '\sort size \dt ?
>
> \dt:sort_by_size
> \dt+:sort_by_size ?
>
> I don't see any good design in this direction
>
>
I thought about Alexander's proposal, and I think it is probably best if
we respect the psql design. I implemented two command suffixes (supported
only where they make sense): "s" for sorted by size and "d" for descending.

So a list of tables can be sorted with commands:

\dt+sd (the order of the suffixes is not strict), so the command
\dtsd+ works too (as do \disd+ and \di+sd)

These two chars are acceptable. The same principle is used for the \l
command:

\lsd+ or \l+sd

What do you think about it?

Regards

Pavel



> Regards
>
> Pavel
>
>
>
>
>
>
>
>>
>> --
>> Peter Eisentraut  http://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
>
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 041b5e0c87..548b0d8d41 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -349,8 +349,9 @@ exec_command(const char *cmd,
 		status = exec_command_include(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "if") == 0)
 		status = exec_command_if(scan_state, cstack, query_buf);
-	else if (strcmp(cmd, "l") == 0 || strcmp(cmd, "list") == 0 ||
-			 strcmp(cmd, "l+") == 0 || strcmp(cmd, "list+") == 0)
+	else if (strcmp(cmd, "list") == 0 || strcmp(cmd, "list+") == 0 ||
+			 strcmp(cmd, "l") == 0 || strncmp(cmd, "l+", 2) == 0 ||
+			 strncmp(cmd, "ls", 2) == 0)
 		status = exec_command_list(scan_state, active_branch, cmd);
 	else if (strncmp(cmd, "lo_", 3) == 0)
 		status = exec_command_lo(scan_state, active_branch, cmd);
@@ -702,7 +703,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	{
 		char	   *pattern;
 		bool		show_verbose,
-					show_system;
+					show_system,
+					sort_size,
+					sort_desc;
 
 		/* We don't do SQLID reduction on the pattern yet */
 		pattern = psql_scan_slash_option(scan_state,
@@ -711,6 +714,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 		show_verbose = strchr(cmd, '+') ? true : false;
 		show_system = strchr(cmd, 'S') ? true : false;
 
+		sort_size = false;
+		sort_desc = false;
+
 		switch (cmd[1])
 		{
 			case '\0':
@@ -720,7 +726,8 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 					success = describeTableDetails(pattern, show_verbose, show_system);
 				else
 					/* standard listing of interesting things */
-					success = listTables("tvmsE", NULL, show_verbose, show_system);
+					success = listTables("tvmsE", NULL, show_verbose, show_system,
+										 false, false);
 				break;
 			case 'A':
 				success = describeAccessMethods(pattern, show_verbose);
@@ -789,12 +796,19 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 				success = describeTypes(pattern, show_verbose, show_system);
 				break;
 			case 't':
-			case 'v':
 			case 'm':
 			case 'i':
+				if (strlen(cmd) >= 2)
+				{
+					sort_size = strchr(&cmd[2], 's') ? true : false;
+					sort_desc = strchr(&cmd[2], 'd') ? true : false;
+				}
+
+			case 'v':
 			case 's':
 			case 'E':
-				success = listTables(&cmd[1], pattern, show_verbose, show_system);
+				success = listTables(&cmd[1], pattern, show_verbose, show_system,
+									 sort_size, sort_desc);
 				break;
 			case 'r':
 				if (cmd[2] == 'd' && 

[HACKERS] WIP: Restricting pg_rewind to data/wal dirs

2017-10-28 Thread Chris Travers
Hi;

There are still some cleanup bits needed here but I wanted to get feedback
on my general direction.

I hope to submit for commit fest soon if the general feedback is good.
Tests are passing (with adjustments intended for change of behaviour in one
test script).  I want to note that Crimson Thompson (also with Adjust) has
provided some valuable discussion on this code.

The Problem:

pg_rewind makes no real guarantees regarding the final state of non-data
files or directories.  It checks whether the timeline has incremented
(and therefore guarantees that, if successful, the data directories are on
the same timeline), but non-data files are clobbered if we rewind
and left intact if not.  These other files include postgresql.auto.conf,
replication slots, and can include log files.

Copying logs over to the new slave is something one probably never wants
to do (same with replication slots), and the behaviours here can mask
troubleshooting regarding why a particular master failed, cause WAL
segments to build up, change settings automatically, and produce other
undesirable behaviours.  Together these make pg_rewind very difficult to
use properly and push tasks to replication management tooling that the
management tools are not in a good position to handle correctly.

Designing the Fix:

Two proposed fixes were considered and one selected: either we whitelist
directories and rewind only those, or we allow shell globs to be provided
that could be used to exclude files.  The whitelisting solution was chosen
for a number of reasons.

When pg_rewind "succeeds" but not quite correctly the result is usually a
corrupted installation which requires a base backup to replace it anyway.
In a recovery situation, sometimes pressures occur which render human
judgment less effective, and therefore glob exclusion strikes me as
something which would probably do more harm than good, but maybe I don't
understand the use case (comments as to why some people look at the other
solution as preferable would be welcome).

In going with the whitelisting solution, we chose to include all
directories with WAL-related information.  This allows more predictable
interaction with other parts of the replication chain.  Consequently we
copy not only pg_wal and pg_xact but also commit timestamps and so forth.

The Solution:

The solution is a whitelist of specified directories, which are the only
ones synchronised.  The relevant part of this patch is:

+/*
+ * List of directories to synchronize:
+ * base data dirs (and tablespaces),
+ * wal/transaction data,
+ * and that is it.
+ *
+ * This array is null-terminated to make
+ * it easy to expand.
+ */
+
+const char *rewind_dirs[] = {
+	"base",
+	"global",
+	"pg_commit_ts",
+	"pg_logical",
+	"pg_multixact",
+	"pg_serial",
+	"pg_subtrans",
+	"pg_tblspc",
+	"pg_twophase",
+	"pg_wal",
+	"pg_xact",
+	NULL
+};


From there we iterate over this array for the directory-based approach in
copy_fetch.c and with one query per directory in libpq_fetch.c.  This also
means shifting from the basic PQexec interface to PQexecParams.  It would
be possible to move to binary formats too, but this was not done here in
order to simplify code review (that could be a separate independent patch
at a later time).
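
To sketch the libpq side (illustrative only - the query text and the
list_remote_dir() function are placeholders, not the actual patch
contents; conn is the connection libpq_fetch.c already holds):

	int			i;

	for (i = 0; rewind_dirs[i] != NULL; i++)
	{
		const char *paramValues[1] = {rewind_dirs[i]};
		PGresult   *res;

		res = PQexecParams(conn,
						   "SELECT path, size, isdir FROM list_remote_dir($1)",
						   1,		/* one parameter */
						   NULL,	/* let the server infer its type */
						   paramValues,
						   NULL,	/* text params need no lengths */
						   NULL,	/* all params in text format */
						   0);		/* text-format results */
		if (PQresultStatus(res) != PGRES_TUPLES_OK)
			pg_fatal("could not fetch file list for \"%s\": %s",
					 rewind_dirs[i], PQresultErrorMessage(res));

		/* ...feed each returned row into the filemap as before... */
		PQclear(res);
	}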

Testing Done:

The extra-files tests correctly exercise this change in behaviour.  The
tests have been modified, and the diagnostics on failure expanded.  The
other tests provide good coverage of whether pg_rewind does what it is
supposed to do.

Cleanup still required:

I accidentally left Carp::Always in the Perl module.  It will be removed.

I renamed one of the functions to have a more descriptive name but have
not yet removed the old function.


Feedback is very welcome.  pg_rewind is a very nice piece of software.  I
am hoping that these sorts of changes will help ensure that it is easier to
use and provides more predictable results.
-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


pg_rewind_restrict.patch
Description: Binary data



Re: [HACKERS] MERGE SQL Statement for PG11

2017-10-28 Thread Simon Riggs
On 28 October 2017 at 00:31, Michael Paquier  wrote:
> On Fri, Oct 27, 2017 at 9:00 AM, Robert Haas  wrote:
>> On Fri, Oct 27, 2017 at 4:41 PM, Simon Riggs  wrote:
>>> I didn't say it but my intention was to just throw an ERROR if no
>>> single unique index can be identified.
>>>
>>> It could be possible to still run MERGE in that situaton but we would
>>> need to take a full table lock at ShareRowExclusive. It's quite likely
>>> that such statements would throw duplicate update errors, so I
>>> wouldn't be aiming to do anything with that for PG11.
>>
>> Like Peter, I think taking such a strong lock for a DML statement
>> doesn't sound like a very desirable way forward.  It means, for
>> example, that you can only have one MERGE in progress on a table at
>> the same time, which is quite limiting.  It could easily be the case
>> that you have multiple MERGE statements running at once but they touch
>> disjoint groups of rows and therefore everything works.  I think the
>> code should be able to cope with concurrent changes, if nothing else
>> by throwing an ERROR, and then if the user wants to ensure that
>> doesn't happen by taking ShareRowExclusiveLock they can do that via an
>> explicit LOCK TABLE statement -- or else they can prevent concurrency
>> by any other means they see fit.
>
> +1, I would suspect users to run this query in parallel of the same
> table for multiple data sets.
>
> Peter has taken some time to explain me a bit his arguments today, and
> I agree that it does not sound much appealing to have constraint
> limitations for MERGE. Particularly using the existing ON CONFLICT
> structure gets the feeling of having twice a grammar for what's
> basically the same feature, with pretty much the same restrictions.
>
> By the way, this page sums up nicely the situation about many
> implementations of UPSERT taken for all systems:
> https://en.wikipedia.org/wiki/Merge_(SQL)

That Wikipedia article is badly out of date and regrettably does NOT
sum up the current situation nicely any more, since MERGE's definition
has changed in SQL:2011 relative to its introduction in SQL:2003.

I'm proposing a MERGE statement for PG11 that
i) takes a RowExclusiveLock on rows, so it can be run concurrently
ii) uses the ON CONFLICT infrastructure to do that
and so requires a unique constraint.

The above is useful behaviour that will be of great benefit to
PostgreSQL users. There are no anomalies remaining.

SQL:2011 specifically states "The extent to which an
SQL-implementation may disallow independent changes that are not
significant is implementation-defined”, so in my reading the above
behaviour would make us fully spec compliant. Thank you to Peter for
providing the infrastructure on which this is now possible for PG11.

Serge puts this very nicely by identifying two different use cases for MERGE.

Now, I accept that you might also want a MERGE statement that
continues to work even if there is no unique constraint, but it would
need to have different properties to the above. I do not in any way
argue against adding that. I also agree that adding RETURNING at a
later stage would be fine as well. I am proposing that those and any
other additional properties people come up with can be added in later
releases once we have the main functionality in core in PG11.

-- 
Simon Riggs                   http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Fix typo in blvacuum.c

2017-10-28 Thread Robert Haas
On Tue, Oct 17, 2017 at 3:48 AM, Seki, Eiji  wrote:
> I found a typo in comment of blvacuum.c, so attach the patch for it.

Thanks.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Parallel Append implementation

2017-10-28 Thread Robert Haas
On Thu, Oct 19, 2017 at 9:05 AM, Amit Khandekar  wrote:
>> + *ExecAppendEstimate
>> + *
>> + *estimates the space required to serialize Append node.
>>
>> Ugh, this is wrong, but I notice it follows various other
>> equally-wrong comments for other parallel-aware node types. I guess
>> I'll go fix that.  We are not serializing the Append node.
>
> I didn't clearly get this. Do you think it should be "space required to
> copy the Append node into the shared memory"?

No, because the Append node is *NOT* getting copied into shared
memory.  I have pushed a comment update to the existing functions; you
can use the same comment for this patch.
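
For reference, the updated wording on the existing parallel-aware nodes
reads roughly as follows (paraphrasing the comment style used for
functions like ExecSeqScanEstimate, not the literal committed text):

/* ----------------------------------------------------------------
 *		ExecAppendEstimate
 *
 *		Compute the amount of space we'll need in the parallel
 *		query DSM, and inform pcxt->estimator about our needs.
 * ----------------------------------------------------------------
 */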

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Partition-wise aggregation/grouping

2017-10-28 Thread Robert Haas
On Fri, Oct 27, 2017 at 1:01 PM, Jeevan Chalke
 wrote:
> 1. Added a separate patch for costing the Append node, as discussed
> up-front in the patch-set.
> 2. Since we now cost the Append node, we don't need the
> partition_wise_agg_cost_factor GUC, so that has been removed. The remaining
> patch is hence merged into the main implementation patch.
> 3. Updated rows in test-cases so that we will get partition-wise plans.

With 0006 applied, cost_merge_append() is now a little bit confused:

/*
 * Also charge a small amount (arbitrarily set equal to operator cost) per
 * extracted tuple.  We don't charge cpu_tuple_cost because a MergeAppend
 * node doesn't do qual-checking or projection, so it has less overhead
 * than most plan nodes.
 */
run_cost += cpu_operator_cost * tuples;

/* Add MergeAppend node overhead like we do it for the Append node */
run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;

The first comment says that we don't add cpu_tuple_cost, and the
second one then adds half of it anyway.

I think it's fine to have a #define for DEFAULT_APPEND_COST_FACTOR,
because as you say it's used twice, but I don't think that should be
exposed in cost.h; I'd make it private to costsize.c and rename it to
something like APPEND_CPU_COST_MULTIPLIER.  The word DEFAULT, in
particular, seems useless to me, since there's no provision for it to
be overridden by a different value.
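
To make that concrete, here is a sketch of how costsize.c could read after
the rename, with the two comments reconciled.  The 0.5 value merely restates
the half-of-cpu_tuple_cost charge already in the patch; treat the exact
number as the patch author's choice, not mine.

/*
 * Private to costsize.c; deliberately not exposed in cost.h, since there
 * is no provision for overriding it with a different value.
 */
#define APPEND_CPU_COST_MULTIPLIER 0.5  /* fraction of cpu_tuple_cost */

    /*
     * Also charge a small amount (arbitrarily set equal to operator cost)
     * per extracted tuple.  We don't charge a full cpu_tuple_cost because
     * a MergeAppend node doesn't do qual-checking or projection, so it has
     * less overhead than most plan nodes; the reduced charge below covers
     * the per-tuple bookkeeping it does perform.
     */
    run_cost += cpu_operator_cost * tuples;
    run_cost += cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER * tuples;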

What testing, if any, can we think about doing with this plan to make
sure it doesn't regress things?  For example, if we do a TPC-H run
with partitioned tables and partition-wise join enabled, will any
plans change with this patch?  Do they get faster or not?  Anyone have
other ideas for what to test?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Reading timeline from pg_control on replication slave

2017-10-28 Thread Andrey Borodin
Hi, Michael!

Thank you very much for these comments!

> On 28 Oct 2017, at 3:09, Michael Paquier wrote:
> 
> ThisTimeLineID is not something you can rely on for standby backends
> as it is not set during recovery. That's the reason behind
> pg_walfile_name disabled during recovery. There are three things
> popping on top of my mind that one could think about:
> 1) Backups cannot be completed when started on a standby in recovery
> and when stopped after the standby has been promoted, meaning that its
> timeline has changed.
> 2) After a standby has been promoted, by using pg_start_backup, you
> issue a checkpoint which makes sure that the control file gets flushed
> with the new information, so when pg_start_backup returns to the
> caller you should have the correct timeline number because the outer
> function gets evaluated last.
> 3) Backups taken from cascading standbys, where a direct parent has
> been promoted.
> 
> 1) and 2) are actually not problems per the restrictions I am giving
> above, but 3) is. If I recall correctly, when a streaming standby does
> a timeline jump, a restart point is not immediately generated, so you
> could have the timeline on the control file not updated to the latest
> timeline value, meaning that you could have the WAL file name you use
> here referring to a previous timeline and not the newest one.
> 
> In short, yes, what you are doing is definitely risky in my opinion,
> particularly for complex cascading setups.

We are using the TimeLineID from pg_control only to give a name to the
backup. A slightly stale timeline ID will not cause significant problems as
long as pg_control is picked up after backup finalization.

But from your words I see that the safest option is to check the timeline
from pg_control after start and after stop. If these timelines differ,
invalidate the backup entirely. That does not seem too harsh a condition for
invalidation, does it?
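
For what it's worth, a minimal frontend sketch of that start/stop check is
below, assuming the backup tool can read $PGDATA/global/pg_control directly.
mark_backup_invalid() is a hypothetical hook in the tool, and a real
implementation should also verify the ControlFileData CRC, which is elided
here.

#include "postgres_fe.h"
#include "catalog/pg_control.h"

#include <stdio.h>
#include <stdlib.h>

extern void mark_backup_invalid(void);  /* hypothetical backup-tool hook */

/* Read the last-checkpoint timeline recorded in pg_control. */
static TimeLineID
read_control_timeline(const char *pgdata)
{
    char            path[MAXPGPATH];
    ControlFileData control;
    FILE           *fp;

    snprintf(path, sizeof(path), "%s/global/pg_control", pgdata);
    fp = fopen(path, "rb");
    if (fp == NULL || fread(&control, sizeof(control), 1, fp) != 1)
    {
        fprintf(stderr, "could not read %s\n", path);
        exit(1);
    }
    fclose(fp);

    /* NOTE: a real tool should verify control.crc before trusting this. */
    return control.checkPointCopy.ThisTimeLineID;
}

/* In the backup driver: compare the timeline before and after the copy. */
void
finalize_backup(const char *pgdata, TimeLineID tli_at_start)
{
    if (read_control_timeline(pgdata) != tli_at_start)
        mark_backup_invalid();
}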

Best regards, Andrey Borodin.

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Adding table_constraint description in ALTER TABLE synopsis

2017-10-28 Thread Robert Haas
On Thu, Oct 26, 2017 at 3:23 PM, Lætitia Avrot  wrote:
> In the documentation, I've found that table_constraint is used in the ALTER
> TABLE synopsis but that the definition of table_constraint is missing, so I
> submitted bug #14873.
>
> I found the table_constraint definition in the CREATE TABLE synopsis and I
> just copied/pasted it on the ALTER TABLE synopsis.
>
> The patch should apply to master. I built and tested it successfully on my
> computer.
>
> There shouldn't be any platform-specific content.
>
> You will find enclosed my patch. I tried my best to follow instructions on
> how to submit a patch.

I'd say you did a good job.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Typos in src/backend/optimizer/README

2017-10-28 Thread Robert Haas
On Fri, Oct 27, 2017 at 9:34 AM, Etsuro Fujita
 wrote:
> This sentence in the section of Partition-wise joins in
> src/backend/optimizer/README should be fixed: "This technique of breaking
> down a join between partition tables into join between their partitions is
> called partition-wise join."
>
> (1) s/a join between partition tables/a join between partitioned tables/
> (2) s/join between their partitions/joins between their partitions/
>
> It might be okay to leave #2 as-is, but I'd like to propose changing it that
> way to make the meaning clear.

I think you are right.  I have committed the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] unsafe tuple releasing in get_default_partition_oid

2017-10-28 Thread Robert Haas
On Sat, Oct 28, 2017 at 10:03 AM, Julien Rouhaud  wrote:
> I just noticed that get_default_partition_oid() tries to release the
> tuple even if it isn't valid.
> Trivial patch attached.

Oops.  I wonder how that managed to survive testing.

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] unsafe tuple releasing in get_default_partition_oid

2017-10-28 Thread Julien Rouhaud
Hi,

I just noticed that get_default_partition_oid() tries to release the
tuple even if it isn't valid.
Trivial patch attached.
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 07fdf66c38..66ec214e02 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -2831,9 +2831,9 @@ get_default_partition_oid(Oid parentId)
 
         part_table_form = (Form_pg_partitioned_table) GETSTRUCT(tuple);
         defaultPartId = part_table_form->partdefid;
+        ReleaseSysCache(tuple);
     }
 
-    ReleaseSysCache(tuple);
     return defaultPartId;
 }
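
For context, the function after the fix reads roughly as follows
(reconstructed around the hunk above; the declarations are inferred, not
copied from the tree). The point is that ReleaseSysCache() may only be
called on a tuple that SearchSysCache1() actually returned as valid; calling
it on an invalid (NULL) tuple dereferences garbage.

Oid
get_default_partition_oid(Oid parentId)
{
    HeapTuple   tuple;
    Oid         defaultPartId = InvalidOid;

    tuple = SearchSysCache1(PARTRELID, ObjectIdGetDatum(parentId));

    if (HeapTupleIsValid(tuple))
    {
        Form_pg_partitioned_table part_table_form;

        part_table_form = (Form_pg_partitioned_table) GETSTRUCT(tuple);
        defaultPartId = part_table_form->partdefid;
        /* Release only a tuple the syscache actually returned. */
        ReleaseSysCache(tuple);
    }

    return defaultPartId;
}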
 

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers