Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Tsunakawa, Takayuki
From: pgsql-hackers-ow...@postgresql.org
> [mailto:pgsql-hackers-ow...@postgresql.org] On Behalf Of Vladimir
> Sitnikov
> Tsunakawa> PgJDBC has supported the feature with autosave parameter only
> Tsunakawa> recently
> 
> PgJDBC has the implementation for more than a year (REL9.4.1210, 2016-09-07,
> see https://github.com/pgjdbc/pgjdbc/pull/477 )

And I heard from someone in the PgJDBC community that the autosave parameter was 
not documented in the manual for a while, which I confirmed.  So 
statement-level rollback is newer to users, isn't it?


> The performance overhead for a "SELECT" statement (no columns, just select)
> over localhost is 36±4 us vs 38±3 us (the savepoint is pipelined along
> with the user-provided query). That is, the network overhead is close to
> negligible.

That's good news, because it also means that the overhead of creating a 
savepoint is negligible.



> As far as I understand, the main problem with savepoints is that they
> consume memory even when the same savepoint is reassigned again and again.
> In other words, "savepoint; insert; savepoint; insert; savepoint;
> insert; savepoint; insert; savepoint; insert;" would allocate xids and might
> blow up the backend's memory.
> I see no way a driver can work around that, so it would be great if the
> backend could release memory or provide a way to do so.

Doesn't PgJDBC execute RELEASE after each SQL statement?  That said, even with 
RELEASE, the server memory bloat is not solved.  The current server 
implementation allocates an 8KB memory chunk called CurTransactionContext for 
each subtransaction, and retains them until the end of the top-level transaction.  
That's another (separate) issue to address.
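
For reference, the per-statement pattern a driver emits at the SQL level is
roughly this sketch (table and data are illustrative; real drivers pipeline
these commands where they can):

    BEGIN;
    SAVEPOINT s1; INSERT INTO t VALUES (1); RELEASE SAVEPOINT s1;
    SAVEPOINT s1; INSERT INTO t VALUES (2); RELEASE SAVEPOINT s1;
    -- on an error, the driver issues ROLLBACK TO SAVEPOINT s1 instead and
    -- continues; note that each SAVEPOINT still consumes a subtransaction
    -- xid and a CurTransactionContext until COMMIT, as described above
    COMMIT;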

Regards
Takayuki Tsunakawa






Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Craig Ringer
On 2 November 2017 at 13:59, Vladimir Sitnikov
 wrote:

> The performance overhead for a "SELECT" statement (no columns, just select)
> over localhost is 36±4 us vs 38±3 us (the savepoint is pipelined along
> with the user-provided query). That is, the network overhead is close to
> negligible.

Yep. Not for psqlODBC or other libpq-based drivers that can't pipeline
queries though.

> In other words, "savepoint; insert; savepoint; insert; savepoint;
> insert; savepoint; insert; savepoint; insert;" would allocate xids and might
> blow up the backend's memory.

RELEASE SAVEPOINT, like psqlODBC does.

> Adding protocol messages would break pgbouncer and similar tools, so it makes
> sense to refrain from new messages unless absolutely required.

Yeah, it'd affect proxies, true. But it'd let us get rid of a lot of
very ugly log spam too. And unlike some of the prior protocol tweaks
I've been interested in, it'd be client-initiated so it should be
pretty safe.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Vladimir Sitnikov
Tsunakawa> PgJDBC has supported the feature with autosave parameter only
Tsunakawa> recently

PgJDBC has the implementation for more than a year (REL9.4.1210,
2016-09-07, see https://github.com/pgjdbc/pgjdbc/pull/477 )

Tsunakawa> The point raised in this thread was that that creates
Tsunakawa> too much network overhead, so a backend-based solution would be
Tsunakawa> preferable.
Tsunakawa> We haven't seen any numbers or other evidence to quantify that
Tsunakawa> claim, so maybe it's worth looking into that some more

The performance overhead for a "SELECT" statement (no columns, just select)
over localhost is 36±4 us vs 38±3 us (the savepoint is pipelined
along with the user-provided query). That is, the network overhead is close
to negligible.

As far as I understand, the main problem with savepoints is that they
consume memory even when the same savepoint is reassigned again and
again.
In other words, "savepoint; insert; savepoint; insert; savepoint;
insert; savepoint; insert; savepoint; insert;" would allocate xids and might
blow up the backend's memory.
I see no way a driver can work around that, so it would be great if the
backend could release memory or provide a way to do so.

Adding protocol messages would break pgbouncer and similar tools, so it
makes sense to refrain from new messages unless absolutely required.

Vladimir


Re: [HACKERS] Patch: add --if-exists to pg_recvlogical

2017-11-01 Thread Catalin Iacob
On Fri, Sep 29, 2017 at 3:06 PM, Peter Eisentraut
 wrote:
> On 9/22/17 15:31, Peter Eisentraut wrote:
>> I suggest you create a patch that focuses on the --create part.
>>
>> You can create a separate follow-on patch for the --start part if you
>> wish, but I think that that patch would be rejected.
>
> I have moved this entry to the next commit fest, awaiting your updated
> patch.

I'm looking for simple patches to review, so I'm doing some gardening. Per
Peter's message, I moved this to Waiting on Author.




Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Tsunakawa, Takayuki
From: Craig Ringer [mailto:cr...@2ndquadrant.com]
> The example often cited is some variant of
> 
> BEGIN;
> CREATE TABLE t2 AS SELECT * FROM t1;
> DROP TABLE t1;
> ALTER TABLE t2 RENAME TO t1;
> COMMIT;
> 
> Right now, we do the right thing here. With default statement-level rollback,
> you just dropped t1 and all your data. Oops.

That's a horrible example.  So I think the default behavior should be what it 
is now for existing PostgreSQL users.


> On a related note, psql's -v ON_ERROR_STOP=1 is a horrible and hard-to-discover
> UI, and one of the top FAQs on Stack Overflow is some variant of "I'm getting
> random and incomprehensible errors restoring a dump, wtf?". So I'd really
> love to make it the default, but we'd face similar issues where a SQL script
> that's currently correct instead produces dangerously wrong results with
> ON_ERROR_STOP=1 .

Yes.  And although unrelated, psql's FETCH_SIZE is also often invisible to 
users.  They report out-of-memory trouble when they run a SELECT on a large 
table in psql.



> What about if we add protocol-level savepoint support? Two new messages:
> 
> BeginUnnamedSavepoint
> 
> and
> 
> EndUnnamedSavepoint
> 
> where the latter does a rollback-to-last-unnamed-savepoint if the txn state
> is bad, or a release-last-unnamed-savepoint if the txn state is ok. That
> means the driver doesn't have to wait for the result of the statement. It
> knows the conn state and query outcome from our prior messages, and knows
> that as a result of this message any failed state has been rolled back.
> 
> This would, with appropriate libpq support, give people who want statement
> level error handling pretty much what they want. And we could expose it
> in psql too. No GUCs needed, no fun surprises for apps. psqlODBC could adopt
> it to replace its current slow and super-log-spammy statement rollback
> model.
> 
> Downside is that it needs support in each client driver.

Yes, I believe we should avoid the downside.  It's tough to develop and 
maintain a client driver, so we should minimize the burden with server-side 
support.

Regards
Takayuki Tsunakawa





Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Tsunakawa, Takayuki
From: Peter Eisentraut [mailto:peter.eisentr...@2ndquadrant.com]
> The difference is how error recovery works.  So this will necessarily be
> tied to how the client code or other surrounding code is structured or what
> the driver or framework is doing in the background to manage transactions.
> It would also be bad if client code was not prepared for this new behavior
> and reported the transaction as complete while some commands in the middle
> were omitted.
> 
> Drivers can already achieve this behavior, and do, by issuing savepoint
> commands internally.  The point raised in this thread was that that creates
> too much network overhead, so a backend-based solution would be preferable.
> We haven't seen any numbers or other evidence to quantify that claim, so
> maybe it's worth looking into that some more.
> 
> In principle, a backend-based solution that drivers just have to opt into
> would save a lot of duplication.  But the drivers that care or require it
> according to their standards presumably already implement this behavior
> in some other way, so it comes back to whether there is a performance or
> other efficiency gain here.
> 
> Another argument was that other SQL implementations have this behavior.
> This appears to be the case.  But as far as I can tell, it is also tied
> to their particular interfaces and the structure and flow control they
> provide.  So a client-side solution like psql already provides or something
> in the various drivers would work just fine here.
> 
> So my summary for the moment is that a GUC or similar run-time setting might
> be fine, with appropriate explanation and warnings.  But it's not clear
> whether it's worth it given the existing alternatives.

I can think of four reasons why the server-side support is necessary or 
desirable.

First, the server log could be filled with SAVEPOINT and RELEASE lines when you 
need to investigate performance or audit activity.

Second, the ease of use for those who migrate from other DBMSs.  With the 
server-side support, only the DBA needs to be aware of the configuration in 
postgresql.conf.  Other people don't need to be aware of the client-side 
parameter when they deploy applications.

Third, lack of server-side support causes trouble for driver developers.  In a 
recent discussion with the psqlODBC committer, he had some trouble improving or 
fixing the statement-rollback support.  Npgsql doesn't have statement 
rollback yet.  PgJDBC has supported the feature, with the autosave 
parameter, only recently.  Do the drivers for other languages like Python, Go, 
and JavaScript have the feature?  We should reduce the burden on driver 
developers.

Fourth, the runtime performance.  In a performance benchmark of one of our 
customers, where a batch application ran 1.5 or 5 million small SELECTs with 
primary key access, the execution time of the whole batch became shorter by 
more than 30% (IIRC) when the local connection was used instead of the remote 
TCP/IP one.  The communication overhead is not small.

Also, in the PostgreSQL documentation, the communication overhead is treated 
seriously as follows:


https://www.postgresql.org/docs/devel/static/plpgsql-overview.html#plpgsql-advantages

[Excerpt]
--
That means that your client application must send each query to the database 
server, wait for it to be processed, receive and process the results, do some 
computation, then send further queries to the server. All this incurs 
interprocess communication and will also incur network overhead if your client 
is on a different machine than the database server.

With PL/pgSQL you can group a block of computation and a series of queries 
inside the database server, thus having the power of a procedural language and 
the ease of use of SQL, but with considerable savings of client/server 
communication overhead.


• Extra round trips between client and server are eliminated


• Intermediate results that the client does not need do not have to be 
marshaled or transferred between server and client


• Multiple rounds of query parsing can be avoided


This can result in a considerable performance increase as compared to an 
application that does not use stored functions.
--


Craig reported the large communication overhead here:

PATCH: Batch/pipelining support for libpq
https://www.postgresql.org/message-id/CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=b4dm...@mail.gmail.com#CAMsr+YFUjJytRyV4J-16bEoiZyH=4nj+sQ7JP9ajwz=b4dm...@mail.gmail.com

Re: foreign table batch insert
https://www.postgresql.org/message-id/CAMsr+YFgDUiJ37DEfPRk8WDBuZ58psdAYJd8iNFSaGxtw=w...@mail.gmail.com

[Excerpt]
--
The time difference for 10k inserts on the local host over a unix socket
shows a solid improvement:

batch insert elapsed:  0.244293s
sequential insert elapsed: 0.375402s

... but over, say, a connection to a random AW

Re: [HACKERS] Account for cost and selectivity of HAVING quals

2017-11-01 Thread Ashutosh Bapat
On Wed, Nov 1, 2017 at 8:41 PM, Tom Lane  wrote:
> Ashutosh Bapat  writes:
>> On Wed, Nov 1, 2017 at 3:15 AM, Tom Lane  wrote:
>>> here's a patch to fix the planner so that eval costs and selectivity of
>>> HAVING quals are factored into the appropriate plan node numbers.
>>> ...
>>> + /* Add cost of qual, if any --- but we ignore its selectivity */
>
>> And maybe we should try to explain why we can ignore selectivity.
>> Similarly for the changes in create_minmaxagg_path().
>
> I'm sure you realize that's because the estimate is already just one
> row ... but sure, we can spell that out.
>

+1. That would be helpful.



-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company




Re: [HACKERS] How to implement a SP-GiST index as a extension module?

2017-11-01 Thread Connor Wolf
Ok, more questions.

I've been studying the implementation Alexander Korotkov sent, and I'm not
seeing how to map
some of the components onto the changes in the SP-GiST system that occurred
between PostgreSQL 9.2 and 9.3.

The changes at that point seem to have been to change xxx_inner_consistent
and xxx_leaf_consistent from taking
an input containing a single Datum pointing to the query condition to a list
of ScanKeys, each of which encapsulates one filter condition.

The issue here is that for VP trees (or the BK tree I want to eventually
implement), the filter condition requires two parameters:
 - The maximum allowed distance from the target value
 - The actual target value.

The ScanKeys struct appears to only be able to contain a condition type
and a single parameter, such as "less than *x*", "above *y*",
and so forth. Looking at the existing spgkdtreeproc.c and
spgquadtreeproc.c, their mechanism for passing more complex conditions
through to the xxx_consistent functions appears to be to encapsulate the
entire condition in a custom type. For example,
their mechanism for querying if something is within a certain envelope is
done by having the envelope be described
by the "BOX" type.

As such:
Will compound queries as I describe above basically require a custom type?
My (admittedly naive) expectation
is that the eventual query for this index will look something like "SELECT
* FROM example_table WHERE indexed_column <=> target_value < 4;",
with "<=>" being the operator for the relevant distance calculation
(hamming, for the BK tree, numeric for the VP-tree).
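
If so, a minimal sketch of that approach might look like the following,
mirroring how the quad-tree opclass packs a whole envelope into a single
BOX argument (the type, function, and operator names here are purely
illustrative, not an existing API):

    -- composite type carrying both parameters of the condition
    CREATE TYPE hamming_query AS (target bigint, max_dist integer);

    -- popcount of the XOR as the hamming distance of two 64-bit hashes
    CREATE FUNCTION hamming_within(val bigint, q hamming_query)
    RETURNS boolean AS $$
        SELECT length(replace((val # q.target)::bit(64)::text, '0', ''))
               <= q.max_dist;
    $$ LANGUAGE SQL IMMUTABLE;

    CREATE OPERATOR <~> (
        LEFTARG = bigint, RIGHTARG = hamming_query,
        PROCEDURE = hamming_within
    );

    -- a single ScanKey now carries both the target and the threshold:
    SELECT * FROM example_table
    WHERE indexed_column <~> ROW(1234567890, 4)::hamming_query;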

The existing VP-tree code appears to not support multiple operators
whatsoever, probably because it was very preliminary.

Thanks!
Connor

On Mon, Oct 30, 2017 at 7:04 PM, Connor Wolf <
conn...@imaginaryindustries.com> wrote:

> I was mostly unclear on how I'd go about attaching the extension functions
> to the relevant indexing mechanism. From the looks of the vptree.tar.gz
> file (which is really, *really* helpful, incidentally!), it's done via a
> custom operator class, which then gets passed to the actual index creation
> mechanism when you're declaring the index.
>
> I think I had looked at that at one point, but it's been a while. In my
> case, I'm using discrete-cosine-transform based perceptual hashes for
> searching. They are nice and compact (64-bits per hash), while still
> producing good search results. I have a dataset of ~36 million images, and
> it does searches in < 50 milliseconds with a hamming distance of 4, while
> touching ~0.25% of the tree (And occupying ~18 GB of ram).
>
> My BK tree is up on github here
> ,
> if anyone needs something like that (BSD licensed, pretty well tested).
> There's also a python wrapper for it.
>
> I'll probably not have time to poke about until this weekend, but thanks!
> Connor
>
>
>
> On Mon, Oct 30, 2017 at 4:50 AM, Alexander Korotkov <
> a.korot...@postgrespro.ru> wrote:
>
>> Hi!
>>
>> On Sun, Oct 29, 2017 at 12:07 PM, Connor Wolf <
>> w...@imaginaryindustries.com> wrote:
>>
>>> I'm looking at implementing a custom indexing scheme, and I've been
>>> having trouble understanding the proper approach.
>>>
>>> Basically, I need a BK tree, which is a tree-structure useful for
>>> indexing arbitrary discrete metric-spaces (in my case, I'm interested in
>>> indexing across the hamming edit-distance of perceptual hashes, for fuzzy
>>> image searching). I'm *pretty* sure a SP-GiST index is the correct
>>> index type, as my tree is intrinsically unbalanced.
>>>
>>
>> Yes, SP-GiST is an appropriate index type for a BK tree.  I'm pretty sure a
>> BK tree could be implemented as an SP-GiST opclass.
>> The only thing worrying me is the selection of pivot values for nodes.  SP-GiST
>> builds the index by inserting tuples one by one.  The first pivot value for the
>> root node in SP-GiST would be created once the first leaf page overflows.  Thus,
>> you would have to select this pivot value based on the very small fraction of
>> the dataset seen at the beginning.
>> As far as I know, a BK tree is most efficient when the root pivot value is
>> selected after looking at the whole dataset, and then hierarchically for the
>> subtrees.
>>
>> BTW, did you try my extension for searching similar images?  It's quite
>> primitive, but works for some cases.
>> https://github.com/postgrespro/imgsmlr
>> https://wiki.postgresql.org/images/4/43/Pgcon_2013_similar_images.pdf
>>
>> I have a functional stand-alone implementation of a BK-Tree, and it works
>>> very well, but the complexity of managing what is basically a external
>>> index for my database has reached the point where it's significantly
>>> problematic, and it seems to be it should be moved into the database.
>>>
>>
>> Sure, moving this index to the database is right decision.
>>
>> Anyways, looking at the contents of postgres/src/backend/access/spgist,
>>> it looks pretty straightforward in terms of the actual C implementation,

Re: [HACKERS] Explicit relation name in VACUUM VERBOSE log

2017-11-01 Thread Masahiko Sawada
On Mon, Oct 2, 2017 at 4:33 PM, Daniel Gustafsson  wrote:
>> On 29 Aug 2017, at 17:21, Robert Haas  wrote:
>>
>> On Tue, Aug 22, 2017 at 2:23 AM, Simon Riggs  wrote:
>>> Yes, we can. I'm not sure why you would do this only for VACUUM
>>> though? I see many messages in various places that need same treatment
>>
>> I'm skeptical about the idea of doing this too generally.
>>
>> rhaas=> select * from foo;
>> ERROR:  permission denied for relation foo
>>
>> Do we really want to start saying public.foo in all such error
>> messages?  To me, that's occasionally helpful but more often just
>> useless chatter.
>>
>> One problem with making this behavior optional is that we'd then need
>> two separate translatable strings in every case, one saying "table %s
>> has problems" and the other saying "table %s.%s has problems".  Maybe
>> we could avoid that via some clever trick, but that's how we're doing
>> it in some existing cases.
>>
>> I have a feeling we're making a small patch with a narrow goal into a
>> giant patch for which everyone has a different goal.
>
> Based on the concerns raised in this thread which are left to address or
> resolve, and the fact that the patch has been without an update during the CF,
> I’m marking this Returned with Feedback.  Please re-submit a new version of the
> patch to a future commitfest.
>

After more thought, I'd like to focus this work only on log messages
related to VACUUM. Making all log messages show explicit names might not
be a good solution, so we can leave the more generic solution for other
log messages. But especially for the VACUUM VERBOSE log, since the log
messages can be very long when the table has multiple indexes, this
patch would be useful for users.
Since the previous patch conflicts with current HEAD, I've attached the
updated patch.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


vacuum_more_explicit_relname_v2.patch
Description: Binary data



[HACKERS] ucs_wcwidth vintage

2017-11-01 Thread Thomas Munro
Hi hackers,

src/backend/utils/mb/wchar.c contains a ~16 year old wcwidth
implementation that originally arrived in commit df4cba68, but the
upstream code[1] apparently continued evolving and there have been
more Unicode revisions since.  It probably doesn't matter much: the
observation made by Zr40 in the #postgresql IRC channel that led me
to guess that this code might be responsible is that emojis screw up
psql's formatting, since current terminal emulators recognise them as
double-width but PostgreSQL doesn't.  Still, it's interesting that we
have artefacts deriving from various different frozen versions of the
Unicode standard in the source tree, and that might affect some proper
languages.
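
For context, the heart of that upstream implementation is a binary search
over sorted code-point intervals; a minimal sketch of the lookup (the real
tables in wchar.c are much larger, and frozen at an old Unicode revision):

    struct interval { int first; int last; };

    /* binary search: is ucs contained in any interval of the table? */
    static int
    bisearch(int ucs, const struct interval *table, int max)
    {
        int min = 0;

        if (ucs < table[0].first || ucs > table[max].last)
            return 0;
        while (max >= min)
        {
            int mid = (min + max) / 2;

            if (ucs > table[mid].last)
                min = mid + 1;
            else if (ucs < table[mid].first)
                max = mid - 1;
            else
                return 1;
        }
        return 0;
    }

Updating to current Unicode would mean regenerating the interval tables
(e.g. adding the emoji ranges to the wide-character table), not changing
the lookup itself.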

🤔

[1] http://www.cl.cam.ac.uk/~mgk25/ucs/wcwidth.c

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] list of credits for release notes

2017-11-01 Thread Peter Eisentraut
On 10/2/17 18:34, Bruce Momjian wrote:
> My smaller question is how will this list be generated in PG 11?  From
> the commit log when the release notes are created, or some other method?

I don't think it should be done at the same time as the release notes.

Specifically, we have recently put an emphasis on having the release
notes ready for beta, and the natural flow of things should be that
there are relatively few changes to the substance of the release
notes once beta gets rolling.  On the other hand, I have found that a
significant number of new contributors appear in commit messages during
beta, which also makes some sense.  So making the contributor list
fairly late and then not changing it much is more efficient.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Try to fix endless loop in ecpg with informix mode

2017-11-01 Thread 高增琦
Diff from the head:
(the following if statement then decides true or false)

```
diff --git a/src/interfaces/ecpg/ecpglib/data.c
b/src/interfaces/ecpg/ecpglib/data.c
index 5375934..1621e7b 100644
--- a/src/interfaces/ecpg/ecpglib/data.c
+++ b/src/interfaces/ecpg/ecpglib/data.c
@@ -57,8 +57,7 @@ garbage_left(enum ARRAY_TYPE isarray, char **scan_length,
enum COMPAT_MODE compa
 /* skip invalid characters */
 do {
 (*scan_length)++;
-} while (**scan_length != ' ' && **scan_length != '\0' &&
isdigit(**scan_length));
-return false;
+} while (isdigit(**scan_length));
 }

 if (**scan_length != ' ' && **scan_length != '\0')

```

2017-11-02 11:07 GMT+08:00 高增琦 :

> Thanks for the commit.
>
> I am afraid the changes may split "7.a" into "7" and "a", and then error out
> with "invalid input syntax for type int" for "a".
>
> How about the changes below? (the following if statement then decides true or false)
>
> ```
> -} while (**scan_length != ' ' && **scan_length != '\0');
> -return false;
> +} while (isdigit(**scan_length));
> ```
>
> 2017-11-01 20:35 GMT+08:00 Michael Meskes :
>
>> > Any comments?
>>
>> Sorry, I've been working through the backlog of three weeks of
>> traveling.
>>
>> > > I tried some tests with ecpg informix mode.
>> > > When trying to store float data into a integer var, I got endless
>> > > loop.
>> > >
>> > > The reason is:
>> > > In informix mode, ecpg can accept
>> > > string form of float number when processing query result.
>> > > During checking the string form of float number, it seems
>> > > that ecpg forgot to skip characters after '.'.
>> > > Then outer loop will never stop because it hopes to see '\0'.
>> > >
>> > > The first patch will reproduce the problem in ecpg's regress test.
>> > > The second patch tries to fix it in simple way.
>>
>> Thanks for spotting and fixing. I changed your patch slightly and made
>> it check if the rest of the data is indeed digits, or else it would
>> accept something like "7.hello" as "7".
>>
>> Committed.
>>
>> Michael
>> --
>> Michael Meskes
>> Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
>> Meskes at (Debian|Postgresql) dot Org
>> Jabber: michael at xmpp dot meskes dot org
>> VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
>>
>
>
>
> --
> GaoZengqi
> pgf...@gmail.com
> zengqi...@gmail.com
>



-- 
GaoZengqi
pgf...@gmail.com
zengqi...@gmail.com


Re: [HACKERS] Try to fix endless loop in ecpg with informix mode

2017-11-01 Thread 高增琦
Thanks for the commit.

I am afraid the changes may split "7.a" into "7" and "a", and then error out
with "invalid input syntax for type int" for "a".

How about the changes below? (the following if statement then decides true or false)

```
-} while (**scan_length != ' ' && **scan_length != '\0');
-return false;
+} while (isdigit(**scan_length));
```

2017-11-01 20:35 GMT+08:00 Michael Meskes :

> > Any comments?
>
> Sorry, I've been working through the backlog of three weeks of
> traveling.
>
> > > I tried some tests with ecpg informix mode.
> > > When trying to store float data into a integer var, I got endless
> > > loop.
> > >
> > > The reason is:
> > > In informix mode, ecpg can accept
> > > string form of float number when processing query result.
> > > During checking the string form of float number, it seems
> > > that ecpg forgot to skip characters after '.'.
> > > Then outer loop will never stop because it hopes to see '\0'.
> > >
> > > The first patch will reproduce the problem in ecpg's regress test.
> > > The second patch tries to fix it in simple way.
>
> Thanks for spotting and fixing. I changed your patch slightly and made
> it check if the rest of the data is indeed digits, or else it would
> accept something like "7.hello" as "7".
>
> Committed.
>
> Michael
> --
> Michael Meskes
> Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
> Meskes at (Debian|Postgresql) dot Org
> Jabber: michael at xmpp dot meskes dot org
> VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
>



-- 
GaoZengqi
pgf...@gmail.com
zengqi...@gmail.com


Re: [HACKERS] MERGE SQL Statement for PG11

2017-11-01 Thread Craig Ringer
On 1 November 2017 at 01:55, Peter Geoghegan  wrote:

> The problem here is: Iff the first statement uses ON CONFLICT
> infrastructure, doesn't the absence of WHEN NOT MATCHED imply
> different semantics for the remaining updates and deletes in the
> second version of the query? You've removed what seems like a neat
> adjunct to the MERGE, but it actually changes everything else too when
> using READ COMMITTED.

Would these concerns be alleviated by adding some kind of Pg-specific
decoration that constrained concurrency-safe MERGEs?

So your first statement would be

 MERGE CONCURRENTLY ...

and when you removed the WHEN NOT MATCHED clause it'd ERROR because
that's no longer able to be done with the same concurrency-safe
semantics?

I don't know if this would be helpful TBH, or if it would negate
Simon's compatibility goals. Just another idea.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] WIP: long transactions on hot standby feedback replica / proof of concept

2017-11-01 Thread Robert Haas
On Tue, Oct 31, 2017 at 2:47 PM, Alexander Korotkov
 wrote:
> However, from the user's perspective, the current behavior of
> hot_standby_feedback is just broken, because it both increases bloat and
> doesn't guarantee that a read-only query on the standby won't be cancelled
> because of vacuum.  Therefore, we should be looking for a solution: if one
> approach isn't good enough, then we should look for another approach.
>
> I can propose the following alternative approach: teach read-only queries on hot
> standby to tolerate concurrent relation truncation.  Then, when a
> non-existent heap page is accessed on hot standby, we can know that it was
> deleted by concurrent truncation and should be assumed to be empty.  Any
> thoughts?

Sounds like it might break MVCC?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Craig Ringer
On 2 November 2017 at 09:33, Peter Eisentraut
 wrote:

> If you turned the autocommit setting off, then this code would
> effectively silently do nothing, and that is obviously quite bad.

Right.

The example often cited is some variant of

BEGIN;
CREATE TABLE t2 AS SELECT * FROM t1;
DROP TABLE t1;
ALTER TABLE t2 RENAME TO t1;
COMMIT;

Right now, we do the right thing here. With default statement-level
rollback, you just dropped t1 and all your data. Oops.


On a related note, psql's -v ON_ERROR_STOP=1 is a horrible and
hard-to-discover UI, and one of the top FAQs on Stack Overflow is some variant
of "I'm getting random and incomprehensible errors restoring a dump,
wtf?". So I'd really love to make it the default, but we'd face
similar issues where a SQL script that's currently correct instead
produces dangerously wrong results with ON_ERROR_STOP=1 .

> In principle, a backend-based solution that drivers just have to opt
> into would save a lot of duplication.  But the drivers that care or
> require it according to their standards presumably already implement
> this behavior in some other way, so it comes back to whether there is a
> performance or other efficiency gain here.

There definitely would be over SQL-level savepoints. They're horrible
for performance, especially since libpq can't yet pipeline work so you
need three round-trips for each successful statement: SAVEPOINT,
statement, RELEASE SAVEPOINT. It produces massive log spam too.

What about if we add protocol-level savepoint support? Two new messages:

BeginUnnamedSavepoint

and

EndUnnamedSavepoint

where the latter does a rollback-to-last-unnamed-savepoint if the txn
state is bad, or a release-last-unnamed-savepoint if the txn state is
ok. That means the driver doesn't have to wait for the result of the
statement. It knows the conn state and query outcome from our prior
messages, and knows that as a result of this message any failed state
has been rolled back.
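
At the SQL level the pair would be roughly equivalent to this sketch (the
point being that the protocol does it in a single message, with no extra
round trips; the savepoint name is illustrative, the real thing would be
anonymous):

    -- BeginUnnamedSavepoint, before each statement:
    SAVEPOINT unnamed;
    -- EndUnnamedSavepoint, after the statement, if the txn state is bad:
    ROLLBACK TO SAVEPOINT unnamed; RELEASE SAVEPOINT unnamed;
    -- or, if the txn state is ok:
    RELEASE SAVEPOINT unnamed;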

This would, with appropriate libpq support, give people who want
statement level error handling pretty much what they want. And we
could expose it in psql too. No GUCs needed, no fun surprises for
apps. psqlODBC could adopt it to replace its current slow and
super-log-spammy statement rollback model.

Because we'd know it was a special savepoint used for statement level
rollback we might still have some optimisation opportunities.

Downside is that it needs support in each client driver.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Dynamic result sets from procedures

2017-11-01 Thread Robert Haas
On Wed, Nov 1, 2017 at 2:38 AM, Peter Eisentraut
 wrote:
> So this is what it can do:
>
> CREATE PROCEDURE pdrstest1()
> LANGUAGE SQL
> AS $$
> DECLARE c1 CURSOR WITH RETURN FOR SELECT * FROM cp_test2;
> DECLARE c2 CURSOR WITH RETURN FOR SELECT * FROM cp_test3;
> $$;
>
> CALL pdrstest1();
>
> and that returns those two result sets to the client.

That seems like it is at least arguably a wire protocol break.  Today,
if you send a string containing only one command, you will only get
one answer.

I'm not saying that makes this change utterly unacceptable or anything
-- but I wonder how much application code it will break, and whether
any steps need to be taken to reduce breakage.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] SQL/JSON in PostgreSQL

2017-11-01 Thread Peter Eisentraut
Could someone clarify the status of this patch set?  It has been in
"Waiting" mode since the previous CF, with no new patch, just a few
questions from the author.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Enhancements to passwordcheck

2017-11-01 Thread Bossart, Nathan
On 11/1/17, 9:33 PM, "Peter Eisentraut"  
wrote:
> On 9/25/17 14:04, Bossart, Nathan wrote:
>> I'd like to use this thread to gauge community interest in adding this
>> functionality to this module.  If there is interest, I'll add it to the next
>> commitfest.
>
> This thread has no patch, which is not really the intended use of the
> commit fest, so I'm closing the commit fest entry for now.

Sorry about that.  I'll reopen it when I have a patch to share.

Nathan




Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-11-01 Thread Robert Haas
On Thu, Nov 2, 2017 at 8:01 AM, Craig Ringer  wrote:
> That forces materialization, and I'm guessing part of Tomas's goal
> here is to prevent the need to materialize into a temp table /
> tuplestore / etc.

I get that, but if you're running a query like "SELECT * FROM
bigtable", you don't need parallel query in the first place, because a
single backend is quite capable of sending back the rows as fast as a
client can read them.  If you're running a query like "SELECT * FROM
bigtable WHERE " then that's a good use
case for parallel query, but then materializing it isn't that bad
because the result set is a lot smaller than the original table.

I am not disputing the idea that there are *some* cases where parallel
query is useful and materialization is still undesirable, of course.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Enhancements to passwordcheck

2017-11-01 Thread Peter Eisentraut
On 9/25/17 14:04, Bossart, Nathan wrote:
> I'd like to use this thread to gauge community interest in adding this
> functionality to this module.  If there is interest, I'll add it to the next
> commitfest.

This thread has no patch, which is not really the intended use of the
commit fest, so I'm closing the commit fest entry for now.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-11-01 Thread Craig Ringer
On 2 November 2017 at 10:01, Robert Haas  wrote:

> I think that still leaves a fair number of scenarios to consider, and
> the error handling by itself seems pretty thorny.  Plus it's kind of a
> weird mode and, like Craig, I'm not really sure what it gets you.
> Maybe if somebody has the use case where this would help, they should
> just do:
>
> CREATE TEMP TABLE x AS SELECT * FROM t2 WHERE ...;
> DECLARE x CURSOR FOR SELECT * FROM x;

That forces materialization, and I'm guessing part of Tomas's goal
here is to prevent the need to materialize into a temp table /
tuplestore / etc.

It's not clear to me why an unbounded portal fetch, using the TCP
socket windows and buffers for flow control, isn't sufficient.

Tomas, can you explain the use case a bit more?


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services




Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-01 Thread Robert Haas
On Wed, Nov 1, 2017 at 6:20 PM, Jeevan Chalke
 wrote:
> Yep.
> But as David reported earlier, if we remove the first part, i.e., adding
> cpu_operator_cost per tuple, Merge Append will be preferred over an Append
> node, unlike before. And thus I thought it better to have both, but I'm not
> so sure. Should we remove that part altogether, or add both in a single
> statement with updated comments?

I was only suggesting that you update the comments.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] pg_basebackup fails on Windows when using tablespace mapping

2017-11-01 Thread Peter Eisentraut
On 11/1/17 10:29, Michael Paquier wrote:
> On Wed, Nov 1, 2017 at 2:24 PM, Peter Eisentraut
>  wrote:
>> Committed to master.  I suppose this should be backpatched?
> 
> Thanks! Yes this should be back-patched.

OK, done for 10, 9.6, and 9.5.

The tablespace mapping feature started in 9.4 (has it been that long
already?), but the canonicalization was only added in 9.5.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-11-01 Thread Robert Haas
On Wed, Nov 1, 2017 at 3:47 AM, Tomas Vondra
 wrote:
> But maybe there's a simpler option - what if we only allow fetches from
> the PARALLEL cursor while the cursor is open? That is, this would work:
>
> BEGIN;
> ...
> DECLARE x PARALLEL CURSOR FOR SELECT * FROM t2 WHERE ...;
> FETCH 1000 FROM x;
> FETCH 1000 FROM x;
> FETCH 1000 FROM x;
> CLOSE x;
> ...
> COMMIT;
>
> but adding any other command between the OPEN/CLOSE commands would fail.
> That should close all the holes with parallel-unsafe stuff, right?

I think that still leaves a fair number of scenarios to consider, and
the error handling by itself seems pretty thorny.  Plus it's kind of a
weird mode and, like Craig, I'm not really sure what it gets you.
Maybe if somebody has the use case where this would help, they should
just do:

CREATE TEMP TABLE x AS SELECT * FROM t2 WHERE ...;
DECLARE x CURSOR FOR SELECT * FROM x;

Since e9baa5e9fa147e00a2466ab2c40eb99c8a700824, that ought to work.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] PATCH: enabling parallel execution for cursors explicitly (experimental)

2017-11-01 Thread Robert Haas
On Wed, Nov 1, 2017 at 7:49 AM, Craig Ringer  wrote:
> If the client wants to fetch in chunks it can use a portal and limited
> size fetches. That shouldn't (?) be parallel-unsafe, since nothing
> else can happen in the middle anyway.

I believe sending a limited-size fetch forces serial execution
currently.  If it's true that nothing else can happen in the middle
then we could relax that, but I don't see why that should be true?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: [HACKERS] Statement-level rollback

2017-11-01 Thread Peter Eisentraut
On 10/31/17 13:47, MauMau wrote:
> I'm very sorry I couldn't reply to your kind offer.  I rebased the
> patch and will add it to CF 2017/11.  I hope I will complete the patch
> in this CF.

I've been thinking about this a little bit.  Many are worried about
repeating the mistakes of the autocommit feature, so it's worth
comparing that.

The problem with the autocommit setting, or at least the one I remember,
is that code is currently written expecting that

connect
exec SQL statement
disconnect

will succeed in executing and committing the SQL statement, unless an
error is reported.

If you turned the autocommit setting off, then this code would
effectively silently do nothing, and that is obviously quite bad.  So
the autocommit setting would break a large proportion of all code out
there, and was thus not really usable, and hence it was removed.
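
To illustrate, a sketch of typical code written for the default behavior
(table name illustrative):

    connect
    INSERT INTO audit_log VALUES (...);
    disconnect

With the autocommit setting turned off, the INSERT starts an implicit
transaction that is never committed, so disconnecting silently rolls the
row back.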

The proposed statement-level rollback feature works in a slightly
different context.  It does not change when or how a transaction or
transaction block begins and ends.  It only changes what happens inside
explicit transaction blocks.  Considering code like

START TRANSACTION;
SQL1;
SQL2;
SQL3;
COMMIT;

currently an error would cause all subsequent commands to fail.  Under
statement-level rollback, a failed command would effectively be ignored
and the transaction would continue until COMMIT.

Therefore, a successful transaction block would always work the same way
under either setting.
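
Concretely, under statement-level rollback the server would treat the block
as if the client had issued per-statement savepoints, along these lines
(sketch):

    START TRANSACTION;
    SAVEPOINT s; SQL1; RELEASE SAVEPOINT s;
    SAVEPOINT s; SQL2;  -- fails: ROLLBACK TO SAVEPOINT s, then continue
    SAVEPOINT s; SQL3; RELEASE SAVEPOINT s;
    COMMIT;  -- SQL1 and SQL3 are committed; SQL2's effects are not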

The difference is how error recovery works.  So this will necessarily be
tied to how the client code or other surrounding code is structured or
what the driver or framework is doing in the background to manage
transactions.  It would also be bad if client code was not prepared for
this new behavior and reported the transaction as complete while some
commands in the middle were omitted.

Drivers can already achieve this behavior, and do, by issuing
savepoint commands internally.  The point raised in this thread was that
that creates too much network overhead, so a backend-based solution
would be preferable.  We haven't seen any numbers or other evidence to
quantify that claim, so maybe it's worth looking into that some more.

In principle, a backend-based solution that drivers just have to opt
into would save a lot of duplication.  But the drivers that care or
require it according to their standards presumably already implement
this behavior in some other way, so it comes back to whether there is a
performance or other efficiency gain here.

Another argument was that other SQL implementations have this behavior.
This appears to be the case.  But as far as I can tell, it is also tied
to their particular interfaces and the structure and flow control they
provide.  So a client-side solution like psql already provides or
something in the various drivers would work just fine here.

So my summary for the moment is that a GUC or similar run-time setting
might be fine, with appropriate explanation and warnings.  But it's not
clear whether it's worth it given the existing alternatives.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: schema variables

2017-11-01 Thread Gilles Darold
Le 01/11/2017 à 05:15, Pavel Stehule a écrit :
>
>
> 2017-10-31 22:28 GMT+01:00 srielau:
>
> Pavel,
>
> There is no
> DECLARE TEMP CURSOR
> or
> DECLARE TEMP variable in PLpgSQL
> and
>
>
> Sure, DECLARE TEMP makes no sense; I was talking about the similarity between
> DECLARE and CREATE TEMP
>
>
> CREATE TEMP TABLE has a different meaning from what I understand you
> envision for variables.
>
> But maybe I'm mistaken. Your original post did not describe the entire
> syntax:
> CREATE [TEMP] VARIABLE [IF NOT EXISTS] name AS type
>   [ DEFAULT expression ] [[NOT] NULL]
>   [ ON TRANSACTION END { RESET | DROP } ]
>   [ { VOLATILE | STABLE } ];
>
> Especially the TEMP is not spelled out and how its presence affects or
> doesn't ON TRANSACTION END.
> So may be if you elaborate I understand where you are coming from.
>
>
> TEMP has the same functionality (and implementation) as our temp tables
> - so at session end the temp variables are destroyed, but they can also be
> tied to a transaction.

Oh OK, I understand. Thanks for the clarification.

-- 
Gilles Darold
Consultant PostgreSQL
http://dalibo.com - http://dalibo.org



Re: [HACKERS] index-only count(*) for indexes supporting bitmap scans

2017-11-01 Thread Tom Lane
Alexander Kuzmenkov  writes:
> On 04.09.2017 15:17, Alexey Chernyshov wrote:
>> make check-world fails on contrib/postgres_fdw because of Subquery Scan on 
>> ... Probably the query plan was changed.

> Thanks for testing! This is the same problem as the one in 
> 'select_distinct' I mentioned before. I changed the test, the updated 
> patch is attached.

I've pushed the executor part of this, but mostly not the planner part,
because I didn't think the latter was anywhere near ready for prime
time: the regression test changes it was causing were entirely bogus.

You had basically two categories of plan changes showing up in the
regression tests.  One was insertion of Subquery Scan nodes where
they hadn't been before; that was because the use_physical_tlist
change broke the optimization that removed no-op Subquery Scans.
I fixed that by narrowing the use_physical_tlist change to apply
only to BitmapHeapPath nodes, which is the only case where there
would be any benefit anyway.  The remaining plan diffs after making
that change all amounted to replacing regular index-only scan plans
with bitmap scans, which seems to me to be silly on its face: if we
can use an IOS then it is unlikely that a bitmap scan will be better.
Those changes indicate that the cost adjustment you'd inserted in
cost_bitmap_heap_scan was way too optimistic, which is hardly
surprising.  I think a realistic adjustment would have to account
for all or most of these factors:

* Whether the index AM is ever going to return recheck = false.
The planner has no way to know that at present, but since there are
lots of cases where it simply won't ever happen, I think assuming
that it always will is not acceptable.  Perhaps we could extend the
AM API so that we could find out whether recheck would happen always,
never, or sometimes.  (Doing better than "sometimes" might be too hard,
but I think most opclasses are going to be "always" or "never" anyway.)

* Whether the bitmap will become lossy, causing us to have to make
rechecks anyway.  We could probably estimate that pretty well based
on comparing the initial number-of-pages estimate to work_mem.

* Whether the plan will need to fetch heap tuples to make filter-qual
checks.  In principle the planner ought to know that.  In practice,
right now it doesn't bother to figure out whether the qual will be
empty until createplan.c time, because that's rather a significant
amount of work and it's undesirable to expend it for paths we might
not end up using.  We might be able to approximate it better than
we do now, though.  It's a live problem for regular indexscan costing
as well as bitmap scans, IIRC.

* What fraction of the table is actually all-visible.  You'd effectively
hard-wired that at 50%, but it's easy to do better --- the regular
IOS code does

if (indexonly)
pages_fetched = ceil(pages_fetched * (1.0 - baserel->allvisfrac));

and it would be appropriate to do the same here if we conclude that
the other gating conditions apply.

Without a patch that deals more realistically with these concerns,
I think we're better off not changing the cost estimate at all.

As far as the executor side goes, I made several cosmetic changes
and one not so cosmetic one: I changed the decision about whether
to prefetch so that it looks at whether the potential prefetch
page is all-visible.  This still relies on the same assumption you
had that the recheck flag will stay the same from one page to the
next, but at least it's not assuming that the all-visible state
will stay the same.

I'm going to mark the CF entry closed, but if you feel like working
on the cost estimate business, feel free to submit another patch
for that.

regards, tom lane




Re: [HACKERS] Custom compression methods

2017-11-01 Thread Peter Eisentraut
On 9/12/17 10:55, Ildus Kurbangaliev wrote:
>> The patch also includes custom compression method for tsvector which
>> is used in tests.
>>
>> [1]
>> https://www.postgresql.org/message-id/CAPpHfdsdTA5uZeq6MNXL5ZRuNx%2BSig4ykWzWEAfkC6ZKMDy6%3DQ%40mail.gmail.com
> Attached rebased version of the patch. Added support of pg_dump, the
> code was simplified, and a separate cache for compression options was
> added.

I would like to see some more examples of how this would be used, so we
can see how it should all fit together.

So far, it's not clear to me that we need a compression method as a
standalone top-level object.  It would make sense, perhaps, to have a
compression function attached to a type, so a type can provide a
compression function that is suitable for its specific storage.

The proposal here is very general: You can use any of the eligible
compression methods for any attribute.  That seems very complicated to
manage.  Any attribute could be compressed using either a choice of
general compression methods or a type-specific compression method, or
perhaps another type-specific compression method.  That's a lot.  Is
this about packing certain types better, or trying out different
compression algorithms, or about changing the TOAST thresholds, and so on?

Ideally, we would like something that just works, with minimal
configuration and nudging.  Let's see a list of problems to be solved
and then we can discuss what the right set of primitives might be to
address them.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] proposal: schema variables

2017-11-01 Thread Pavel Stehule
2017-11-01 19:03 GMT+01:00 Mark Dilger :

>
> > Comments, notes?
>
> How would variables behave on transaction rollback?
>
> CREATE TEMP VARIABLE myvar;
> SET myvar := 1;
> BEGIN;
> SET myvar := 2;
> COMMIT;
> BEGIN;
> SET myvar := 3;
> ROLLBACK;
> SELECT myvar;
>
> How would variables behave when modified in a procedure
> that aborts rather than returning cleanly?
>
>
The result is 3

When you create a variable like you did, there is no relation between the
variable's content and transactions. Almost everywhere, session (package,
schema) variables are non-transactional. That could be changed, but with a
negative impact on performance, so I propose a relatively simple solution:
reset to the default value on rollback when the variable was changed in the
transaction. But that is not the default behavior.

Variables are variables like you know from PL/pgSQL, but the holder is not
a plpgsql function. The holder is a schema in this case. The variable
(metadata) is permanent. The content of the variable is session-based and
non-transactional.

Regards

Pavel


> mark
>


Re: [HACKERS] [PATCH] Pageinspect - add functions on GIN and GiST indexes from gevel

2017-11-01 Thread Peter Eisentraut
On 9/13/17 03:45, Alexey Chernyshov wrote:
> On Sat, 9 Sep 2017 13:53:35 +0530
> Ashutosh Sharma  wrote:
> 
>>
>> Finally, I got some time to look into this patch and, surprisingly, I
>> didn't find any function returning information at the page level; instead,
>> all the SQL functions are returning information at the index level.
>> Therefore, I too feel that it should either be integrated with
>> pgstattuple, which could be a better match as Tomas also mentioned, or
>> a new contrib module could be added. I think adding a new contrib
>> module looks like the better option. The reason being, it doesn't simply
>> include functions for showing index-level statistics (e.g.
>> index size, number of rows, values, etc.) like pgstattuple does, but
>> also displays information contained in a page, e.g. the object
>> stored in the page, its tid, etc. So, more or less, I would say
>> that it's a mixture of the pageinspect and pgstattuple modules, and
>> therefore I feel adding it as a new extension would be a better
>> choice. Thoughts?
> 
> Thank you for your interest, I will add a new contrib module named,
> say, indexstat. I think we can add statistics on other indexes in the
> future.

A few thoughts on this current patch:

The patch no longer compiles, because of changes in the way the tuple
descriptors are accessed (I think).

I agree with the conclusion so far that this should be a new extension,
not being a good fit for an existing extension.

This kind of thing doesn't look like good design:

+
+test=# SELECT spgist_stats('spgist_idx');
+   spgist_stats
+--
+ totalPages:21   +
+ deletedPages:  0+
+ innerPages:3+
+ leafPages: 18   +
+ emptyPages:1+
+ usedSpace: 121.27 kbytes+
+ freeSpace: 46.07 kbytes +
+ fillRatio: 72.47%   +
+ leafTuples:3669 +
+ innerTuples:   20   +
+ innerAllTheSame:   0+
+ leafPlaceholders:  569  +
+ innerPlaceholders: 0+
+ leafRedirects: 0+
+ innerRedirects:0+
+

This function should return a row with columns for each of these pieces
of information.
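
That is, something along these lines (a sketch of the suggested shape only;
column names and types are illustrative):

    CREATE FUNCTION spgist_stats(rel regclass,
        OUT total_pages int, OUT deleted_pages int, OUT inner_pages int,
        OUT leaf_pages int, OUT empty_pages int,
        OUT used_space bigint, OUT free_space bigint, OUT fill_ratio float8,
        OUT leaf_tuples bigint, OUT inner_tuples bigint)
    RETURNS record
    AS 'MODULE_PATHNAME' LANGUAGE C STRICT;

    -- callers can then pick out individual statistics:
    SELECT leaf_pages, fill_ratio FROM spgist_stats('spgist_idx');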

So to summarize, I think there is interest in this functionality, but
the patch needs some work.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] Add some const decorations to prototypes

2017-11-01 Thread Mark Dilger

> On Oct 31, 2017, at 7:46 AM, Peter Eisentraut 
>  wrote:
> 
> Here is a patch that adds const decorations to many char * arguments in
> functions.  It should have no impact otherwise; there are very few code
> changes caused by it.  Some functions have a strtol()-like behavior
> where they take in a const char * and return a pointer into that as
> another argument.  In those cases, I added a cast or two.
> 
> Generally, I find these const decorations useful as easy function
> documentation.

+1

I submitted something similar a while back and got into a back and forth
discussion with Tom about it.  Anyone interested could take a look at

https://www.postgresql.org/message-id/ACF3A030-E3D5-4E68-B744-184E11DE68F3%40gmail.com






Re: [HACKERS] [PATCH] Add two-arg for of current_setting(NAME, FALLBACK)

2017-11-01 Thread David Christensen

> On Nov 1, 2017, at 1:02 PM, Nico Williams  wrote:
> 
> On Thu, Mar 19, 2015 at 05:41:02PM -0500, David Christensen wrote:
>> The two-arg form of the current_setting() function will allow a
>> fallback value to be returned instead of throwing an error when an
>> unknown GUC is provided.  This would come in most useful when using
>> custom GUCs; e.g.:
> 
> There already _is_ a two-argument form of current_setting() that yours
> somewhat conflicts with:
> 
>   current_setting(setting_name [, missing_ok ])
> 
> https://www.postgresql.org/docs/current/static/functions-admin.html


Apologies; I have no idea how this email got re-sent, but it is quite old and a 
mistake.

David
--
David Christensen
End Point Corporation
da...@endpoint.com
785-727-1171





signature.asc
Description: Message signed with OpenPGP


Re: [HACKERS] MERGE SQL Statement for PG11

2017-11-01 Thread Peter Geoghegan
On Wed, Nov 1, 2017 at 10:19 AM, Simon Riggs  wrote:
>> The problem here is: Iff the first statement uses ON CONFLICT
>> infrastructure, doesn't the absence of WHEN NOT MATCHED imply
>> different semantics for the remaining updates and deletes in the
>> second version of the query?
>
> Not according to the SQL Standard, no. I have no plans for such
> differences to exist.
>
> Spec says: If we hit HeapTupleSelfUpdated then we throw an ERROR.

Your documentation said that the MERGE was driven by a speculative
insertion (BTW, I don't think that this internal implementation detail
should be referenced in user-facing docs). I inferred that that could
not always be true, since there won't always be an INSERT/WHEN NOT
MATCHED case, assuming that you allow that at all (I now gather that you
will).

>> You've removed what seems like a neat
>> adjunct to the MERGE, but it actually changes everything else too when
>> using READ COMMITTED. Isn't that pretty surprising?
>
> I think you're presuming things I haven't said and don't mean, so
> we're both surprised.

You're right -- I'm surmising what I think might be true, because I
don't have the information available to know one way or the other. As
far as this issue with using speculative insertions in one context but
not in another goes, I still don't really know where you stand. I can
still only surmise that you must want both implementations, and will use
one or the other as circumstances dictate (to avoid dup violations in
the style of ON CONFLICT where that's possible).

This seems true because you now say that it will be possible to omit
WHEN NOT MATCHED, and yet there is no such thing as a speculative
insertion without the insertion. You haven't said that that conclusion
is true yourself, but it's the only conclusion that I can draw based on
what you have said.

> I think we need some way of expressing the problems clearly.

It's certainly hard to talk about these problems. I know this from
experience.

> "a general purpose solution is one that more or
> less works like an UPDATE FROM, with an outer join, whose ModifyTable
> node is capable of insert, update, or delete (and accepts quals for
> MATCHED and NOT matched cases, etc). You could still get duplicate
> violations due to concurrent activity in READ COMMITTED mode".
>
> Surely the whole point of this is to avoid duplicate violations due to
> concurrent activity?

Now we're getting somewhere.

I *don't* think that that's the whole point of MERGE. No other MERGE
implementation does that, or claims to do that. The SQL standard says
nothing about this. Heikki found this to be acceptable when working on
the GSoC MERGE implementation that went nowhere.

My position is that we ought to let MERGE be MERGE, and let ON CONFLICT
be ON CONFLICT.

In Postgres, you can avoid duplicate violations with MERGE by using a
higher isolation level (these days, those are turned into a
serialization error at higher isolation levels when no duplicate is
visible to the xact's snapshot).  MERGE isn't and shouldn't be special
when it comes to concurrency.
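Concretely, using the mock syntax from Simon's docs (MERGE is not in core
yet, so this is only a sketch):

    BEGIN ISOLATION LEVEL SERIALIZABLE;
    MERGE INTO wines w
    USING wine_stock_changes s
    ON s.winename = w.winename
    WHEN MATCHED THEN
      UPDATE SET stock = w.stock + s.stock_delta
    WHEN NOT MATCHED THEN
      INSERT VALUES (s.winename, s.stock_delta);
    COMMIT;
    -- a concurrent insert of the same winename surfaces as SQLSTATE 40001
    -- (serialization_failure), and the application retries the transaction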

> I'm not seeing how either design sketch rigorously avoids live locks,
> but those are fairly unlikely and easy to detect and abort.

My MERGE semantics (which really are not mine at all) avoid live
lock/lock starvation by simply never retrying anything without making
forward progress. MERGE doesn't take any special interest in
concurrency, just like any other DML statement that isn't INSERT with ON
CONFLICT.

ON CONFLICT would have live locks if it didn't always have the choice of
inserting [1]. In many ways, the syntax of INSERT ON CONFLICT DO UPDATE
is restricted in exactly the way it needs to be in order to function
correctly. It wasn't an accident that it didn't end up being UPDATE ...
ON NOUPDATE DO INSERT, or something like that, which Robert proposed at
one point.

ON CONFLICT plays by its own rules to a certain extent, because that's
what you need in order to get the desired guarantees in READ COMMITTED
mode [2]. This is the main reason why it was as painful a project as it
was. Further generalizing that seems fraught with difficulties. It seems
logically impossible to generalize it in a way where you don't end up
with two behaviors masquerading as one.

[1] https://wiki.postgresql.org/wiki/UPSERT#Theoretical_lock_starvation_hazards
[2] https://wiki.postgresql.org/wiki/UPSERT#Goals_for_implementation
--
Peter Geoghegan


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: schema variables

2017-11-01 Thread Mark Dilger

> Comments, notes?

How would variables behave on transaction rollback?

CREATE TEMP VARIABLE myvar;
SET myvar := 1;
BEGIN;
SET myvar := 2;
COMMIT;
BEGIN;
SET myvar := 3;
ROLLBACK;
SELECT myvar;

How would variables behave when modified in a procedure
that aborts rather than returning cleanly?


mark


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add two-arg form of current_setting(NAME, FALLBACK)

2017-11-01 Thread Nico Williams
On Thu, Mar 19, 2015 at 05:41:02PM -0500, David Christensen wrote:
> The two-arg form of the current_setting() function will allow a
> fallback value to be returned instead of throwing an error when an
> unknown GUC is provided.  This would come in most useful when using
> custom GUCs; e.g.:

There already _is_ a two-argument form of current_setting() that yours
somewhat conflicts with:

   current_setting(setting_name [, missing_ok ])

https://www.postgresql.org/docs/current/static/functions-admin.html

I often use

  coalesce(current_setting(setting_name, true), default_value_here)

as an implementation of current_setting() with a default value.
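For example, with a custom GUC that may never have been set:

    -- current_setting(..., true) yields NULL rather than an error,
    -- so coalesce() supplies the default
    SELECT coalesce(current_setting('foo.bar', true), 'default');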

You could treat booleans as the second argument as a missing_ok argument
instead of a default value, since _currently_ current_setting() only
returns TEXT.

But if ever GUCs are allowed to have values of other types, then your
two-argument current_setting() will conflict with boolean GUCs.

There are several ways to prevent such a future conflict.  Here are
some:

 - make a two-argument current_setting() for boolean GUCs treat the
   second argument as a default value (since there are no such GUCs
   today, this won't break anything)

 - use a new function name

 - use a three-argument current_setting()

The third option seems very lame to me.  So I'd argue for either of the
other two.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add two-arg form of current_setting(NAME, FALLBACK)

2017-11-01 Thread David G. Johnston
On Thu, Mar 19, 2015 at 3:41 PM, David Christensen wrote:

> The two-arg form of the current_setting() function will allow a
> fallback value to be returned instead of throwing an error when an
> unknown GUC is provided.  This would come in most useful when using
> custom GUCs; e.g.:
>
>   -- returns current setting of foo.bar, or 'default' if not set
>   SELECT current_setting('foo.bar', 'default')
>

This doesn't actually change the GUC in the system, right?  Do we want a
side-effect version of this?

There is already a two-arg form where the second argument is a boolean -
there need to be tests that ensure proper behavior and function lookup
resolution.

No docs.

David J.


Re: [HACKERS] ALTER COLUMN TYPE vs. domain constraints

2017-11-01 Thread Tom Lane
Michael Paquier  writes:
> On Fri, Oct 27, 2017 at 11:15 AM, Tom Lane  wrote:
>> We could consider back-patching the attached to cover this, but
>> I'm not entirely sure it's worth the trouble, because I haven't
>> thought of any non-silly use-cases in the absence of domains
>> over composite.  Comments?

> There are no real complaints about the current behavior, are there?
> So only patching HEAD seems enough to me.

Yeah, we can leave it till someone does complain.

> You have added a comment on the constraint to make sure that it
> survives this kind of ALTER TYPE. Querying obj_description() would
> confirm that the comment on the constraint is still there.

Done.

> +RebuildDomainConstraintComment(AlteredTableInfo *tab, int pass, Oid objid,
> +  List *domname, char *conname)
> There is much duplication with RebuildConstraintComment. Why not
> grouping both under say RebuildObjectComment()?

True.  I'd originally expected the code to differ more, but we can
merge these easily enough.

> I would think about
> having cmd->objtype and cmd->object passed as arguments, and then
> remove rel and domname from the existing arguments.

Doing it like that requires the callers to do work (to prepare the object
ID data structure) that's wasted if there's no comment, which most often
there wouldn't be, I suspect.  Seems better to just pass the info the
caller does have and let this function do the rest.

Pushed, thanks for reviewing!

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Add two-arg form of current_setting(NAME, FALLBACK)

2017-11-01 Thread Pavel Stehule
Hi

better to send it as attachment

Regards

Pavel

2015-03-19 23:41 GMT+01:00 David Christensen :

> The two-arg form of the current_setting() function will allow a
> fallback value to be returned instead of throwing an error when an
> unknown GUC is provided.  This would come in most useful when using
> custom GUCs; e.g.:
>
>   -- errors out if the 'foo.bar' setting is unset
>   SELECT current_setting('foo.bar');
>
>   -- returns current setting of foo.bar, or 'default' if not set
>   SELECT current_setting('foo.bar', 'default')
>
> This would save you having to wrap the use of the function in an
> exception block just to catch and utilize a default setting value
> within a function.
> [patch body snipped -- it is identical to the original posting of the
> patch, which appears below]

[HACKERS] [PATCH] Add two-arg form of current_setting(NAME, FALLBACK)

2017-11-01 Thread David Christensen
The two-arg form of the current_setting() function will allow a
fallback value to be returned instead of throwing an error when an
unknown GUC is provided.  This would come in most useful when using
custom GUCs; e.g.:

  -- errors out if the 'foo.bar' setting is unset
  SELECT current_setting('foo.bar');

  -- returns current setting of foo.bar, or 'default' if not set
  SELECT current_setting('foo.bar', 'default')

This would save you having to wrap the use of the function in an
exception block just to catch and utilize a default setting value
within a function.
---
 src/backend/utils/misc/guc.c  | 50 ---
 src/include/catalog/pg_proc.h |  2 ++
 src/include/utils/builtins.h  |  1 +
 src/include/utils/guc.h   |  1 +
 src/test/regress/expected/guc.out | 19 +++
 src/test/regress/sql/guc.sql  | 12 ++
 6 files changed, 82 insertions(+), 3 deletions(-)

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 26275bd..002a926 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -7703,13 +7703,32 @@ ShowAllGUCConfig(DestReceiver *dest)
 char *
 GetConfigOptionByName(const char *name, const char **varname)
 {
+   return GetConfigOptionByNameFallback(name, NULL, varname);
+}
+
+/*
+ * Return GUC variable value by name; optionally return canonical form of
+ * name.  If the GUC is not found, return default_value (when non-NULL)
+ * instead of raising an error.  Return value is palloc'd.
+ */
+char *
+GetConfigOptionByNameFallback(const char *name, const char *default_value, const char **varname)
+{
struct config_generic *record;
 
record = find_option(name, false, ERROR);
if (record == NULL)
-   ereport(ERROR,
-   (errcode(ERRCODE_UNDEFINED_OBJECT),
-  errmsg("unrecognized configuration parameter 
\"%s\"", name)));
+   {
+   if (default_value) {
+   return pstrdup(default_value);
+   }
+   else
+   {
+   ereport(ERROR,
+   (errcode(ERRCODE_UNDEFINED_OBJECT),
+  errmsg("unrecognized configuration parameter 
\"%s\"", name)));
+   }
+   }
if ((record->flags & GUC_SUPERUSER_ONLY) && !superuser())
ereport(ERROR,
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
@@ -8008,6 +8027,31 @@ show_config_by_name(PG_FUNCTION_ARGS)
 }
 
 /*
+ * show_config_by_name_fallback - equiv to SHOW X command but implemented as
+ * a function.  If X does not exist, return a fallback datum instead of erroring.
+ */
+Datum
+show_config_by_name_fallback(PG_FUNCTION_ARGS)
+{
+   char   *varname;
+   char   *varfallback;
+   char   *varval;
+   
+   /* Get the GUC variable name */
+   varname = TextDatumGetCString(PG_GETARG_DATUM(0));
+
+   /* Get the fallback value */
+   varfallback = TextDatumGetCString(PG_GETARG_DATUM(1));
+   
+   /* Get the value */
+   varval = GetConfigOptionByNameFallback(varname, varfallback, NULL);
+
+   /* Convert to text */
+   PG_RETURN_TEXT_P(cstring_to_text(varval));
+}
+
+
+/*
  * show_all_settings - equiv to SHOW ALL command but implemented as
  * a Table Function.
  */
diff --git a/src/include/catalog/pg_proc.h b/src/include/catalog/pg_proc.h
index 6a757f3..71efed2 100644
--- a/src/include/catalog/pg_proc.h
+++ b/src/include/catalog/pg_proc.h
@@ -3025,6 +3025,8 @@ DESCR("convert bitstring to int8");
 
 DATA(insert OID = 2077 (  current_setting  PGNSP PGUID 12 1 0 0 0 f f f f t f s 1 0 25 "25" _null_ _null_ _null_ _null_ show_config_by_name _null_ _null_ _null_ ));
 DESCR("SHOW X as a function");
+DATA(insert OID = 3280 (  current_setting  PGNSP PGUID 12 1 0 0 0 f f f f t f s 2 0 25 "25 25" _null_ _null_ _null_ _null_ show_config_by_name_fallback _null_ _null_ _null_ ));
+DESCR("SHOW X as a function");
 DATA(insert OID = 2078 (  set_config   PGNSP PGUID 12 1 0 0 0 f f f f f f v 3 0 25 "25 25 16" _null_ _null_ _null_ _null_ set_config_by_name _null_ _null_ _null_ ));
 DESCR("SET X as a function");
 DATA(insert OID = 2084 (  pg_show_all_settings PGNSP PGUID 12 1 1000 0 0 f f f f t t s 0 0 2249 "" "{25,25,25,25,25,25,25,25,25,25,25,1009,25,25,25,23}" "{o,o,o,o,o,o,o,o,o,o,o,o,o,o,o,o}" "{name,setting,unit,category,short_desc,extra_desc,context,vartype,source,min_val,max_val,enumvals,boot_val,reset_val,sourcefile,sourceline}" _null_ show_all_settings _null_ _null_ _null_ ));
diff --git a/src/include/utils/builtins.h b/src/include/utils/builtins.h
index bc4517d..e3d9fbb 100644
--- a/src/include/utils/builtins.h
+++ b/src/include/utils/builtins.h
@@ -1088,6 +1088,7 @@ extern Datum quote_nullable(PG_FUNCTION_ARGS);
 
 /* guc.c */
 extern Datum show_config_by_name(PG_FUNCTION_ARGS);
+extern Datum show_config_by_name_fallback(PG_FUNCTION_ARGS);

Re: [HACKERS] MERGE SQL Statement for PG11

2017-11-01 Thread Simon Riggs
On 31 October 2017 at 18:55, Peter Geoghegan  wrote:
> On Tue, Oct 31, 2017 at 2:25 AM, Simon Riggs  wrote:
>> If there are challenges ahead, its reasonable to ask for test cases
>> for that now especially if you think you know what they already are.
>> Imagine we go forwards 2 months - if you dislike my patch when it
>> exists you will submit a test case showing the fault. Why not save us
>> all the trouble and describe that now? Test Driven Development.
>
> I already have, on several occasions now. But if you're absolutely
> insistent on my constructing the test case in terms of a real SQL
> statement, then that's what I'll do.
>
> Consider this MERGE statement, from your mock documentation:
>
> MERGE INTO wines w
> USING wine_stock_changes s
> ON s.winename = w.winename
> WHEN NOT MATCHED AND s.stock_delta > 0 THEN
>   INSERT VALUES(s.winename, s.stock_delta)
> WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
>   UPDATE SET stock = w.stock + s.stock_delta
> ELSE
>   DELETE;
>
> Suppose we remove the WHEN NOT MATCHED case, leaving us with:
>
> MERGE INTO wines w
> USING wine_stock_changes s
> ON s.winename = w.winename
> WHEN MATCHED AND w.stock + s.stock_delta > 0 THEN
>   UPDATE SET stock = w.stock + s.stock_delta
> ELSE
>   DELETE;
>
> We now have a MERGE that will not INSERT, but will continue to UPDATE
> and DELETE.

Agreed

> (It's implied that your syntax cannot do this at all,
> because you propose use the ON CONFLICT infrastructure, but I think we
> have to imagine a world in which that restriction either never existed
> or has subsequently been lifted.)
>
> The problem here is: Iff the first statement uses ON CONFLICT
> infrastructure, doesn't the absence of WHEN NOT MATCHED imply
> different semantics for the remaining updates and deletes in the
> second version of the query?

Not according to the SQL Standard, no. I have no plans for such
differences to exist.

Spec says: If we hit HeapTupleSelfUpdated then we throw an ERROR.

> You've removed what seems like a neat
> adjunct to the MERGE, but it actually changes everything else too when
> using READ COMMITTED. Isn't that pretty surprising?

I think you're presuming things I haven't said and don't mean, so
we're both surprised.

> If you're not
> clear on what I mean, see my previous remarks on EPQ, live lock, and
> what a CONFLICT could be in READ COMMITTED mode. Concurrent activity
> at READ COMMITTED mode can be expected to significantly alter the
> outcome here.

And I still have questions about what exactly you mean, but at least
this post is going in the right direction and I'm encouraged. Thank
you,

I think we need some way of expressing the problems clearly.

> That is rather frustrating.

Guess so.

>> You've said its possible another way. Show that assertion is actually
>> true. We're all listening, me especially, for the technical details.
>
> My proposal, if you want to call it that, has the merit of actually
> being how MERGE works in every other system. Both Robert and Stephen
> seem to be almost sold on what I suggest, so I think that I've
> probably already explained my position quite well.

The only info I have is

"a general purpose solution is one that more or
less works like an UPDATE FROM, with an outer join, whose ModifyTable
node is capable of insert, update, or delete (and accepts quals for
MATCHED and NOT matched cases, etc). You could still get duplicate
violations due to concurrent activity in READ COMMITTED mode".

Surely the whole point of this is to avoid duplicate violations due to
concurrent activity?

I'm not seeing how either design sketch rigorously avoids live locks,
but those are fairly unlikely and easy to detect and abort.

Thank you for a constructive email, we are on the way to somewhere good.

I have more to add, but wanted to get back to you soonish.

-- 
Simon Riggs    http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Mapping MERGE onto CTEs (Re: MERGE SQL Statement for PG11)

2017-11-01 Thread Alvaro Herrera
Nico Williams wrote:

> As an aside, I'd like to be able to control which CTEs are view-like and
> which are table-like.  In SQLite3, for example, they are all view-like,
> and the optimizer will act accordingly, whereas in PG they are all
> table-like, and thus optimizer barriers.

There was a short and easy-to-grasp (OK, maybe not) discussion on the
topic of CTEs acting differently.  I think the consensus is that CTEs
that are read-only and use only immutable functions may be considered
for inlining.
https://www.postgresql.org/message-id/5351711493487...@web53g.yandex.ru

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Mapping MERGE onto CTEs (Re: MERGE SQL Statement for PG11)

2017-11-01 Thread Nico Williams
Is it possible to map MERGE onto a query with CTEs that does the
various DMLs, with all but the last RETURNING?  Here's a sketch:

WITH matched_rows AS (
    SELECT * FROM <target> t WHERE <join condition>
     ),
     updated_rows AS (
    UPDATE <target> t
    SET ...
    WHERE ... AND t IN (SELECT j FROM matched_rows j)
    RETURNING t
     ),
     inserted_rows AS (
    INSERT INTO <target> t
    SELECT ...
    WHERE ... AND t NOT IN (SELECT j FROM matched_rows j)
    RETURNING t
     )
DELETE FROM <target> t
WHERE ...;

Now, one issue is that in PG CTEs are basically like temp tables, and
also like optimizer barriers, so this construction is not online, and if
matched_rows is very large, that would be a problem.

As an aside, I'd like to be able to control which CTEs are view-like and
which are table-like.  In SQLite3, for example, they are all view-like,
and the optimizer will act accordingly, whereas in PG they are all
table-like, and thus optimizer barriers.

Nico
-- 


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Re: PANIC: invalid index offnum: 186 when processing BRIN indexes in VACUUM

2017-11-01 Thread Alvaro Herrera
Alvaro Herrera wrote:
> Tomas Vondra wrote:
> 
> > FWIW I can reproduce this on 9.5, and I don't even need to run the
> > UPDATE part. That is, INSERT + VACUUM running concurrently is enough to
> > produce broken BRIN indexes :-(
> 
> Hmm, I'm pretty sure we stress-tested brin in pretty much the same way.
> But I see this misbehavior too.  Looking ...

Turns out that this is related to concurrent growth of the table while
the summarization process is scanning -- new pages have appeared at the
end of the table after the end point was determined.  It would be a
pain to determine the number of blocks for each range, so I'm looking
for a simple way to fix it without imposing too much overhead.

-- 
Álvaro Herrera    https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Account for cost and selectivity of HAVING quals

2017-11-01 Thread Tom Lane
Ashutosh Bapat  writes:
> On Wed, Nov 1, 2017 at 3:15 AM, Tom Lane  wrote:
>> here's a patch to fix the planner so that eval costs and selectivity of
>> HAVING quals are factored into the appropriate plan node numbers.
>> ...
>> + /* Add cost of qual, if any --- but we ignore its selectivity */

> And maybe we should try to explain why we can ignore selectivity.
> Similarly for the changes in create_minmaxagg_path().

I'm sure you realize that's because the estimate is already just one
row ... but sure, we can spell that out.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] [PATCH] Document the order of changing certain settings when using hot-standby servers

2017-11-01 Thread Peter Eisentraut
On 9/1/17 13:00, Robert Haas wrote:
> Now you're proposing to add:
> 
> If you want to increase these values you
> should do so on all standby servers first, before applying the changes to
> the primary. If you instead want to decrease these values you should do so
> on the primary first, before applying the changes to all standby servers.
> 
> But that's just the obvious logical consequence of the existing statement.
> 
> If we're going to add this text, I'd move it one sentence earlier and
> stick "Therefore, " at the beginning.

Committed incorporating that suggestion.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Walsender timeouts and large transactions

2017-11-01 Thread Petr Jelinek
Hi,

sorry for the delay, but I didn't have much time in the past weeks to
follow this thread.

On 02/10/17 05:44, Kyotaro HORIGUCHI wrote:
> Hello Sokolov.
> 
> At Fri, 29 Sep 2017 15:19:23 +0300, Sokolov Yura 
>  wrote in 
> 
>> I don't want to make the test last so long and generate so much data.
>> That is why I used such small timeouts for the tests.
> 
> I understand your point, but still *I* think such a short timeout is
> outside what the design expects. (But it can be set.)
> 
> Does anyone have opinions on this?

Yes, it's not practically useful to have such small wal_sender_timeout
given that the main purpose of that is to detect network issues.

> 
>> The test fails if there is a "short quit" after
>> `!pq_is_send_pending()`,
>> so I doubt your patch will pass the test.
> 
> That is because I think the test "should" fail, since the timeout is
> out of the expected range. I (and perhaps also Petr) am thinking that
> the problem is just that a large transaction causes a timeout even
> with an ordinary timeout setting. My test case is based on that
> assumption.
> 
> Your test is for a timeout during replication-startup with
> extremely short timeout. This may be a different problem to
> discuss, but perhaps better to be solved together.
> 
> I'd like to have opinions from others on this point.

I think it's a different problem, and because of what I wrote above it
does not seem to me like something we should spend our time trying to fix.
> 
> Uggh! I misunderstood there. It waits for the socket to become
> writable, so the sleep is wrong and WaitLatchOrSocket is right.
> 
> After all, I put +1 for Petr's latest patch. Sorry for my
> carelessness.
> 

Great, I attached version 3 which is just rebased on current master and
also does not move the GetCurrentTimestamp() call because the concern
about skewing the timestamp during config reload (and also network flush
as I realized later) is valid.

It's rather hard to test all the scenarios that this patch fixes in
automated fashion without generating a lot of wal or adding sleeps to
the processing. That's why I didn't produce usable TAP test.

Since it seems like some of my reasoning is unclear I will try to
describe it once more.

The main problem we have is that unless we call the
ProcessRepliesIfAny() before the wal_sender_timeout expires we'll die
because of timeout eventually. The current coding will skip that call if
there is a long transaction being processed (if network is fast enough).
This is what the first part (first 2 hunks) of the patch solves. There
is also issue that while this is happening the walsender ignores signals
so it's impossible to stop it which is why I added the
CHECK_FOR_INTERRUPTS to the fast path.
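In code terms, the reworked fast-path gate looks roughly like this (a
paraphrase of the patch below, not the literal hunk):

    /* Fast path only if nothing is pending and we are nowhere near the
     * timeout; and even then, no longer deaf to signals. */
    if (!pq_is_send_pending() &&
        !TimestampDifferenceExceeds(last_reply_timestamp,
                                    GetCurrentTimestamp(),
                                    wal_sender_timeout / 2))
    {
        CHECK_FOR_INTERRUPTS();
        return;
    }
    /* otherwise fall through to ProcessRepliesIfAny(),
     * WalSndKeepaliveIfNecessary(), and the flush logic */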

The second problem is that if we solved just the first one, then if
downstream (and again network) is fast enough, the ProcessRepliesIfAny()
will not do anything useful because downstream is not going to send any
response while the network buffer contains any data. This is caused by
the fact that we normally code the receiver side to receive until there
is something and only send reply when there is a "pause" in the stream.
To get around this problem we also need to make sure that we send
WalSndKeepaliveIfNecessary() periodically and that will not happen on
fast network unless we do the second part of the patch (3rd hunk), ie,
move the pq_is_send_pending() after the keepalive handling.

This code is specific to the logical decoding walsender interface, so it
only happens for logical decoding/replication (which means it should be
backported all the way to 9.4). The physical walsender does not use this
fast path and is not affected.

These two issues happen quite normally in the wild, as all we need is a
big data load in a single transaction, or an update of a large part of a
table, or something similar, for this to happen with default settings.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services
From df2d5b09a74cb31537e2bb74895a8e31febce5f8 Mon Sep 17 00:00:00 2001
From: Petr Jelinek 
Date: Tue, 31 Oct 2017 14:00:37 +0100
Subject: [PATCH] Fix walsender timeouts when decoding large transaction

The logical slots have a fast code path for sending data, in order to not
impose too high a per-message overhead. The fast path skips checks for
interrupts and timeouts. However, the existing coding failed to consider
the fact that a transaction with a large number of changes may take a
very long time to be processed and sent to the client. This causes the
walsender to ignore interrupts for a potentially long time and, more
importantly, will cause the walsender to be killed due to timeout at the
end of such a transaction.

This commit changes the fast path to also check for interrupts, and only
allows taking the fast path when the last keepalive check happened less
than half of the walsender timeout ago; otherwise the slower code path
is taken.
---
 src/backend/replication/walsender.c | 31 +++
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/src/backend/replication/walsender.c b/src/backend/replication/walsender.c

Re: [HACKERS] Commit fest 2017-11

2017-11-01 Thread Michael Paquier
On Wed, Nov 1, 2017 at 9:04 AM, Michael Paquier wrote:
> Anybody willing to take the hat of the commit fest manager? If nobody,
> I am fine to take the hat as default choice this time.

And now it is open. Let the fest begin.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup fails on Windows when using tablespace mapping

2017-11-01 Thread Michael Paquier
On Wed, Nov 1, 2017 at 2:24 PM, Peter Eisentraut wrote:
> Committed to master.  I suppose this should be backpatched?

Thanks! Yes this should be back-patched.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_basebackup fails on Windows when using tablespace mapping

2017-11-01 Thread Peter Eisentraut
On 9/10/17 00:39, Michael Paquier wrote:
>> Okay. I have once again reviewed your patch and tested it on Windows
>> as well as Linux. The patch LGTM. I am now marking it as Ready For
>> Committer. Thanks.
> 
> Thanks for the review, Ashutosh.

Committed to master.  I suppose this should be backpatched?

(I changed the strcpy() to strlcpy() for better karma.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] strange relcache.c debug message

2017-11-01 Thread Tom Lane
Alvaro Herrera  writes:
> While messing with BRIN bugs, I noticed this debug message in the server
> log:
> 2017-11-01 12:33:24.042 CET [361429] DEBUG:  rehashing catalog cache id 14 
> for pg_opclass; 17 tups, 8 buckets en carácter 194
> notice that at the end it says "at character 194".  I suppose that's
> because of some leftover errposition() value, but why is it being
> reported in this message?

IIRC, there are places in the parser that set up an error callback that
will attach an errposition to any message at all that gets reported within
a particular chunk of code.  It's meant to catch something like "name not
found" but could produce this too.

regards, tom lane


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] proposal: extend shm_mq to support more use cases

2017-11-01 Thread Craig Ringer
On 1 November 2017 at 21:24, Ildus Kurbangaliev wrote:
> Hello! Apparently the current version of shm_mq supports only one sender
> and one receiver. I think it could be very useful to add possibility to
> change senders and receivers. It could be achieved by adding methods
> that remove sender or receiver for mq.

That'd put the complexity penalty on existing shm_mq users. You'd have
to find a way to make it work cheaply for existing 1:1 lightweight
uses.

I'm using shm_mq pretty extensively now, mostly in one-to-many two-way
patterns between a persistent endpoint and a pool of transient peers.
It's not simple to get right; there are a lot of traps introduced by
the queue's design for one-shot contexts where everything is created
together and destroyed together.

I think it'd make a lot of sense to make this easier, but I'd be
inclined to do it by layering on top of shm_mq rather than extending
it.

One thing that would make such patterns much easier would be a way to
safely reset a queue for re-use in a race-free way that didn't need
the use of any external synchronisation primitives. So once the
initial queues are set up, a client can attach, exchange messages on a
queue pair, and detach. And the persistent endpoint process can reset
the queues for re-use. Attempting to attach to a busy queue, or one
recently detached from, should fail gracefully.

Right now it's pretty much necessary to have a per-queue spinlock or
similar that you take before you overwrite a queue and re-create it.
And you need is-in-use flags for both persistent side and transient
side peers. That way the persistent side knows for sure there's no
transient side still attached to the queue when it overwrites it, and
so the transient side knows not to attempt to attach to a queue that's
already been used and detached by someone else because that Assert()s.
And you need the persistent side in-use flag so the transient side
knows the queue exists and is ready to be attached to, and it doesn't
try to attach to a queue that's in the process of being overwritten.
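Concretely, the guard pattern ends up looking something like this (all
names invented for illustration; just a sketch of the bookkeeping, not
our actual code):

    typedef struct QueueSlot
    {
        slock_t     mutex;          /* protects the two flags below */
        bool        server_in_use;
        bool        client_in_use;
        char        mq_space[8192]; /* the shm_mq itself lives here */
    } QueueSlot;

    /* Persistent side: claim and re-create the queue, but only once we
     * know no transient peer can still be attached to the old one. */
    static shm_mq *
    slot_reset(QueueSlot *slot)
    {
        bool        busy;
        shm_mq     *mq;

        SpinLockAcquire(&slot->mutex);
        busy = slot->client_in_use;
        if (!busy)
            slot->server_in_use = true;
        SpinLockRelease(&slot->mutex);

        if (busy)
            return NULL;            /* caller must wait and retry */

        mq = shm_mq_create(slot->mq_space, sizeof(slot->mq_space));
        shm_mq_set_receiver(mq, MyProc);
        return mq;
    }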

The issues I've had are:

* When one side detaches it can't tell if the other side is still
attached, detached, or has never attached. So you need extra
bookkeeping to find out "once I've detached, is it possible the other
peer could've attached and not yet detached" and to ensure that no peer
can be attached when you zero and overwrite a queue.

* Attempting to set the receive/send proc and attach to an
already-attached queue Assert()s; there's no "try and fail
gracefully". So you need extra bookkeeping to record "this queue slot
has been used up, detached from, and needs to be reset". You can't
just try a queue until you find a free one, or retry your queue slot
until it's reset, or whatever.

* Receive/send proc is tracked by PGPROC not pid, and those slots are
re-used much faster and more predictably. So failures where you
attempt to set yourself as sender/receiver for a queue that's already
had a sender/receiver set can produce confusing errors.


I'd like to:

* Zero the PGPROC* for the sender when it detaches, and the receiver
when it detaches

* When the queue is marked as peer-detached but the local PGPROC is
still attached, have a function that takes the queue spinlock and
resets the queue to not-detached state with the local proc still
attached, ready for re-use by attaching a new peer.

* (At least optionally) return failure code when attempting to set
sender or receiver on a detached queue, not assert. So you can wait
and try again later when the queue is reset and ready for re-use, or
scan to find another free queue slot to attach to.

If there's a need to keep the sender and receiver PGPROC after detach
we could instead make the detached bool into a flag of
RECEIVER_DETACHED|SENDER_DETACHED. However, this adds
read-modify-write cycles to what's otherwise a simple set, so the
spinlock must be taken on detach or an atomic must be used. So I'd rather
just blindly clear the PGPROC pointer for whichever side is detaching.

These changes would make using shm_mq persistently MUCH easier,
without imposing significant cost on existing users. And it'd make it
way simpler to build a layer on top for a 1:m 2-way comms system like
Ildus is talking about.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] PostgreSQL 10 parenthesized single-column updates can produce errors

2017-11-01 Thread Rob McColl
Between 9.6.5 and 10, the handling of parenthesized single-column UPDATE
statements changed. In 9.6.5, they were treated identically to
unparenthesized single-column UPDATEs. In 10, they are treated as
multiple-column updates.  This results in the following being valid in
Postgres 9.6.5, but an error in Postgres 10:

CREATE TABLE test (id INT PRIMARY KEY, data INT);
INSERT INTO test VALUES (1, 1);
UPDATE test SET (data) = (2) WHERE id = 1;

In 10 and the current master, this produces the error:

errmsg("source for a multiple-column UPDATE item must be a sub-SELECT or
ROW() expression")

I believe that this is not an intended change or behavior, but is instead
an unintentional side effect of 906bfcad7ba7cb3863fe0e2a7810be8e3cd84fbd
Improve handling of "UPDATE ... SET (column_list) = row_constructor". (
https://github.com/postgres/postgres/commit/906bfcad7ba7cb3863fe0e2a7810be
8e3cd84fbd).

This is a small patch to the grammar that restores the previous behavior by
adding a rule to the set_clause rule and modifying its final rule to only
match lists of more than one element.  I'm not sure if there are more
elegant or preferred ways to address this.

Compiled and tested on Ubuntu 17.04 Linux 4.10.0-33-generic x86_64.

Regression test added under the update test to cover the parenthesized
single-column case.

I see no reason this would affect performance.

Thanks,
-rob

-- 
Rob McColl
@robmccoll
r...@robmccoll.com
205.422.0909


[HACKERS] proposal: extend shm_mq to support more use cases

2017-11-01 Thread Ildus Kurbangaliev
Hello! Apparently the current version of shm_mq supports only one sender
and one receiver. I think it could be very useful to add the possibility
of changing senders and receivers. It could be achieved by adding methods
that remove the sender or receiver from an mq.

One actual use case could be an extension that creates several
background workers that backends can communicate with. At
startup the extension would create two mqs for each bgworker (one for
input data, one for output). When a backend wants to work with a
worker it connects to one mq as sender, and to the other as receiver.
After working with the bgworker it disconnects from the mqs and the
worker becomes free for another backend. So the workers can act like a
cache, or keep some long connections with other services, and so on.

-- 
---
Ildus Kurbangaliev
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Partition-wise aggregation/grouping

2017-11-01 Thread Jeevan Chalke
On Sat, Oct 28, 2017 at 3:07 PM, Robert Haas  wrote:

> On Fri, Oct 27, 2017 at 1:01 PM, Jeevan Chalke
>  wrote:
> > 1. Added separate patch for costing Append node as discussed up-front in
> the
> > patch-set.
> > 2. Since we now cost Append node, we don't need
> > partition_wise_agg_cost_factor
> > GUC. So removed that. The remaining patch hence merged into main
> > implementation
> > patch.
> > 3. Updated rows in test-cases so that we will get partition-wise plans.
>
> With 0006 applied, cost_merge_append() is now a little bit confused:
>
> /*
>  * Also charge a small amount (arbitrarily set equal to operator cost)
> per
>  * extracted tuple.  We don't charge cpu_tuple_cost because a
> MergeAppend
>  * node doesn't do qual-checking or projection, so it has less overhead
>  * than most plan nodes.
>  */
> run_cost += cpu_operator_cost * tuples;
>
> /* Add MergeAppend node overhead like we do it for the Append node */
> run_cost += cpu_tuple_cost * DEFAULT_APPEND_COST_FACTOR * tuples;
>
> The first comment says that we don't add cpu_tuple_cost, and the
> second one then adds half of it anyway.
>

Yep.
But as David reported earlier, if we remove the first part, i.e. adding
cpu_operator_cost per tuple, Merge Append will be preferred over an Append
node, unlike before. Thus I thought it better to have both, but I'm not so
sure. Should we remove that part altogether, or add both in a single
statement with updated comments?
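If we went the single-statement route, I imagine something like this
(just a sketch, using the APPEND_CPU_COST_MULTIPLIER name you suggest):

    /*
     * Charge one cpu_operator_cost per extracted tuple, plus the generic
     * Append per-tuple overhead; still no full cpu_tuple_cost, since
     * MergeAppend does no qual-checking or projection.
     */
    run_cost += (cpu_operator_cost +
                 cpu_tuple_cost * APPEND_CPU_COST_MULTIPLIER) * tuples;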


> I think it's fine to have a #define for DEFAULT_APPEND_COST_FACTOR,
> because as you say it's used twice, but I don't think that should be
> exposed in cost.h; I'd make it private to costsize.c and rename it to
> something like APPEND_CPU_COST_MULTIPLIER.  The word DEFAULT, in
> particular, seems useless to me, since there's no provision for it to
> be overridden by a different value.
>

Agree. Will make that change.


>
> What testing, if any, can we think about doing with this plan to make
> sure it doesn't regress things?  For example, if we do a TPC-H run
> with partitioned tables and partition-wise join enabled, will any
> plans change with this patch?


I have tried doing this on my local development machine. For a 1GB
database (TPC-H scale factor 1), I see no plan change with or without
this patch.

I have tried with scale factor 10, but the queries do not execute well
due to space and memory constraints. Can someone try that out?


>   Do they get faster or not?  Anyone have
> other ideas for what to test?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>


-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: [HACKERS] Oracle to PostGre

2017-11-01 Thread Chris Travers
As a brief note, this is probably not the best list for this.  You would do
better to ask questions like this on -general where you have more
application developers and so forth.  This is more of an SQL question so
asking people who are hacking the codebase may not be the best way to get
it answered.

Also, it is Postgres or PostgreSQL.  People will assume you are totally new
if you call it Postgre.

On Wed, Nov 1, 2017 at 12:55 PM, Brahmam Eswar wrote:

> Hi,
>
> App is moving to Postgre from Oracle.  After migrating, the stored
> procedure is throwing an error with a collection type.
>
> *Oracle :*
>
> create or replace PROCEDURE"PROC1"
>  (
>
>  , REQ_CURR_CODE IN VARCHAR2
>  , IS_VALID OUT VARCHAR2
>  , ERROR_MSG OUT VARCHAR2
>  ) AS
>
>
>TYPE INV_LINES_RT IS RECORD(
>  VENDOR_NUM AP.CREATURE_TXN_LINE_ITEMS.VENDOR_NUM%TYPE,
>  VENDOR_SITE_CODE AP.CREATURE_TXN_LINE_ITEMS.VENDOR_SITE_CODE%TYPE,
>  INVOICE_NUM AP.CREATURE_TXN_LINE_ITEMS.INVOICE_NUM%TYPE,
>  TXN_CNT NUMBER
>);
>TYPE INV_LINES_T IS TABLE OF INV_LINES_RT;
>L_INV_LINES INV_LINES_T;
>IS_MULTI_VENDOR FINO_APRVL_BUS_CHANN_DEFAULTS.MULTI_VENDOR_REQ_ALLOWED%TYPE;
>BUS_CHANNEL_RECORD FINO_APRVL_BUS_CHANN_DEFAULTS%ROWTYPE;
>CAL_APRVL_AMT_BY_TOTAL_AMT FINO_APRVL_BUS_CHANN_DEFAULTS.CAL_APR_AMT_BY_INV%TYPE;
>
>
> Postgre :
>
> create or replace FUNCTION AP.VALIDATE_CRTR_LINE_ITEMS
> (
> REQ_CURR_CODE IN VARCHAR,
> IS_VALID OUT VARCHAR,
> ERROR_MSG OUT VARCHAR
> ) AS $$
>
> DECLARE
>
> INV_LINES_T ap.validate_crtr_line_items$inv_lines_rt ARRAY;
> L_INV_LINES INV_LINES_T%TYPE;
> L_INV_LINES$temporary_record ap.validate_crtr_line_items$inv_lines_rt;
> IS_MULTI_VENDOR AP.FINO_APRVL_BUS_CHANN_DEFAULTS.MULTI_VENDOR_REQ_ALLOWED%TYPE;
> BUS_CHANNEL_RECORD ap.fino_aprvl_bus_chann_defaults%ROWTYPE;
> CAL_APRVL_AMT_BY_TOTAL_AMT AP.FINO_APRVL_BUS_CHANN_DEFAULTS.CAL_APR_AMT_BY_INV%TYPE;
>
>
> but it's throwing an error as : 0 SQLState: 42P01 Message: ERROR:
> relation "l_inv_lines" does not exist
> --
> Thanks & Regards,
> Brahmeswara Rao J.
>

When you ask on -general, please include the query which is actually
causing the problem.  My guess is that either you didn't declare the type
properly or there is some other error in your function, but the information
provided is not sufficient to answer it.
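If it helps: the usual PostgreSQL translation of an Oracle RECORD plus
TABLE OF pair is a schema-level composite type and an array variable --
a minimal sketch, assuming the ap schema exists and guessing at the
column types:

    CREATE TYPE ap.inv_lines_rt AS (
        vendor_num       varchar,
        vendor_site_code varchar,
        invoice_num      varchar,
        txn_cnt          numeric
    );

    CREATE OR REPLACE FUNCTION ap.demo() RETURNS void AS $$
    DECLARE
        l_inv_lines ap.inv_lines_rt[];  -- plays the role of TABLE OF ...
    BEGIN
        l_inv_lines := array_append(l_inv_lines,
            ROW('V1', 'S1', 'INV-1', 1)::ap.inv_lines_rt);
    END;
    $$ LANGUAGE plpgsql;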

Best of luck asking on -general.

-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Account for cost and selectivity of HAVING quals

2017-11-01 Thread Ashutosh Bapat
On Wed, Nov 1, 2017 at 3:15 AM, Tom Lane  wrote:
> Pursuant to the discussion at
> https://www.postgresql.org/message-id/20171029112420.8920b5f...@mx.zeyos.com
> here's a patch to fix the planner so that eval costs and selectivity of
> HAVING quals are factored into the appropriate plan node numbers.
> Perhaps unsurprisingly, this doesn't change the results of any
> existing regression tests.
>
> It's slightly annoying that this approach will result in calculating
> the eval costs and selectivity several times, once for each aggregation
> plan type we consider.  I thought about inserting RestrictInfo nodes
> into the havingQual so that those numbers could be cached, but that turned
> out to break various code that doesn't expect to see such nodes there.
> I'm not sure it's worth going to the trouble of fixing that; in the big
> scheme of things, the redundant calculations likely don't cost much, since
> we aren't going to have relevant statistics.
>
> Comments?  If anyone wants to do a real review of this, I'm happy to stick
> it into the upcoming CF; but without an expression of interest, I'll just
> push it.  I don't think there's anything terribly controversial here.
>

I am not able to see how the following hunk is related to $subject:
*** create_result_path(PlannerInfo *root, Re
*** 1374,1379 
--- 1374,1380 
  pathnode->path.startup_cost = target->cost.startup;
  pathnode->path.total_cost = target->cost.startup +
  cpu_tuple_cost + target->cost.per_tuple;
+ /* Add cost of qual, if any --- but we ignore its selectivity */
  if (resconstantqual)
  {
  QualCostqual_cost;

And maybe we should try to explain why we can ignore selectivity.
Similarly for the changes in create_minmaxagg_path().

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Try to fix endless loop in ecpg with informix mode

2017-11-01 Thread Michael Meskes
> Any comments?

Sorry, I've been working through the backlog of three weeks of
traveling.

> > I tried some tests with ecpg's informix mode.
> > When trying to store float data into an integer var, I got an endless
> > loop.
> > 
> > The reason is:
> > In informix mode, ecpg can accept the string form of a float number
> > when processing a query result.
> > While checking the string form of the float number, it seems
> > that ecpg forgot to skip the characters after '.'.
> > Then the outer loop will never stop, because it expects to see '\0'.
> > 
> > The first patch will reproduce the problem in ecpg's regress test.
> > The second patch tries to fix it in simple way.

Thanks for spotting and fixing. I changed your patch slightly and made
it check that the rest of the data is indeed digits; otherwise it would
have accepted something like "7.hello" as "7".
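For the archives, the shape of the fix is roughly this (illustrative C,
not the committed diff; isdigit() is from <ctype.h>):

    if (*str == '.')
    {
        str++;
        /* consume the fractional digits so the outer loop can
         * eventually reach the terminating '\0' */
        while (isdigit((unsigned char) *str))
            str++;
    }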

Committed.

Michael
-- 
Michael Meskes
Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org)
Meskes at (Debian|Postgresql) dot Org
Jabber: michael at xmpp dot meskes dot org
VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Try to fix endless loop in ecpg with informix mode

2017-11-01 Thread Julien Rouhaud
On Wed, Nov 1, 2017 at 12:22 PM, 高增琦  wrote:
> Any comments?
>


Hi,

You should register these patches for the next commitfest at
https://commitfest.postgresql.org/15/. As Michael pointed out earlier,
this commitfest will start soon so you should add your patches
quickly.

Regards.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Oracle to PostGre

2017-11-01 Thread Brahmam Eswar
Hi,

App is moving to Postgre from Oracle.  After migrating, the stored
procedure is throwing an error with a collection type.

*Oracle :*

create or replace PROCEDURE"PROC1"
 (

 , REQ_CURR_CODE IN VARCHAR2
 , IS_VALID OUT VARCHAR2
 , ERROR_MSG OUT VARCHAR2
 ) AS


   TYPE INV_LINES_RT IS RECORD(
 VENDOR_NUM AP.CREATURE_TXN_LINE_ITEMS.VENDOR_NUM%TYPE,
 VENDOR_SITE_CODE AP.CREATURE_TXN_LINE_ITEMS.VENDOR_SITE_CODE%TYPE,
 INVOICE_NUM AP.CREATURE_TXN_LINE_ITEMS.INVOICE_NUM%TYPE,
 TXN_CNT NUMBER
   );
   TYPE INV_LINES_T IS TABLE OF INV_LINES_RT;
   L_INV_LINES INV_LINES_T;
   IS_MULTI_VENDOR FINO_APRVL_BUS_CHANN_DEFAULTS.MULTI_VENDOR_REQ_ALLOWED%TYPE;
   BUS_CHANNEL_RECORD FINO_APRVL_BUS_CHANN_DEFAULTS%ROWTYPE;
   CAL_APRVL_AMT_BY_TOTAL_AMT FINO_APRVL_BUS_CHANN_DEFAULTS.CAL_APR_AMT_BY_INV%TYPE;


Postgre :

create or replace FUNCTION AP.VALIDATE_CRTR_LINE_ITEMS
(
REQ_CURR_CODE IN VARCHAR,
IS_VALID OUT VARCHAR,
ERROR_MSG OUT VARCHAR
) AS $$

DECLARE

INV_LINES_T ap.validate_crtr_line_items$inv_lines_rt ARRAY;
L_INV_LINES INV_LINES_T%TYPE;
L_INV_LINES$temporary_record ap.validate_crtr_line_items$inv_lines_rt;
IS_MULTI_VENDOR AP.FINO_APRVL_BUS_CHANN_DEFAULTS.MULTI_VENDOR_REQ_ALLOWED%TYPE;
BUS_CHANNEL_RECORD ap.fino_aprvl_bus_chann_defaults%ROWTYPE;
CAL_APRVL_AMT_BY_TOTAL_AMT AP.FINO_APRVL_BUS_CHANN_DEFAULTS.CAL_APR_AMT_BY_INV%TYPE;


but it's throwing an error as : 0 SQLState: 42P01 Message: ERROR: relation
"l_inv_lines" does not exist
-- 
Thanks & Regards,
Brahmeswara Rao J.


Re: [HACKERS] Re: proposal - psql: possibility to specify sort for describe commands, when size is printed

2017-11-01 Thread Pavel Stehule
Hi


> In general, this approach looks good to me.
> Regarding the current state of the patch, I'd like to see the new
> options documented. Also, it would be better to replace "bool sort_size"
> with an enum, assuming there could be other sorting orders in the future.
>

I am sending an updated patch with some basic docs.

Regards

Pavel

>
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index e520cdf3ba..7d816fe701 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -1303,10 +1303,10 @@ testdb=>
 
   
 \dE[S+] [ pattern ]
-\di[S+] [ pattern ]
-\dm[S+] [ pattern ]
+\di[Ssd+] [ pattern ]
+\dm[Ssd+] [ pattern ]
 \ds[S+] [ pattern ]
-\dt[S+] [ pattern ]
+\dt[Ssd+] [ pattern ]
 \dv[S+] [ pattern ]
 
 
@@ -1328,6 +1328,13 @@ testdb=>
 pattern or the S modifier to include system
 objects.
 
+
+
+When the command contains s, the result is
+sorted by size. When the command contains d, the
+result is in descending order. \dtsd+ shows the list
+of tables sorted by size in descending order.
+
 
   
 
@@ -2253,7 +2260,7 @@ SELECT
 
 
   
-\l[+] or \list[+] [ pattern ]
+\l[sd+] or \list[+] [ pattern ]
 
 
 List the databases in the server and show their names, owners,
@@ -2265,6 +2272,12 @@ SELECT
 (Size information is only available for databases that the current
 user can connect to.)
 
+
+
+If s is used in the command name, the list is
+sorted by size. When d is used there, the result
+is in descending order.
+
 
   
 
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 041b5e0c87..aae88b08b4 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -349,8 +349,9 @@ exec_command(const char *cmd,
 		status = exec_command_include(scan_state, active_branch, cmd);
 	else if (strcmp(cmd, "if") == 0)
 		status = exec_command_if(scan_state, cstack, query_buf);
-	else if (strcmp(cmd, "l") == 0 || strcmp(cmd, "list") == 0 ||
-			 strcmp(cmd, "l+") == 0 || strcmp(cmd, "list+") == 0)
+	else if (strcmp(cmd, "list") == 0 || strcmp(cmd, "list+") == 0 ||
+			 strcmp(cmd, "l") == 0 || strncmp(cmd, "l+", 2) == 0 ||
+			 strncmp(cmd, "ls", 2) == 0)
 		status = exec_command_list(scan_state, active_branch, cmd);
 	else if (strncmp(cmd, "lo_", 3) == 0)
 		status = exec_command_lo(scan_state, active_branch, cmd);
@@ -702,7 +703,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	{
 		char	   *pattern;
 		bool		show_verbose,
-	show_system;
+	show_system,
+	sort_desc;
+		sortby_type	sortby;
 
 		/* We don't do SQLID reduction on the pattern yet */
 		pattern = psql_scan_slash_option(scan_state,
@@ -711,6 +714,9 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 		show_verbose = strchr(cmd, '+') ? true : false;
 		show_system = strchr(cmd, 'S') ? true : false;
 
+		sortby = SORTBY_SCHEMA_NAME;
+		sort_desc = false;
+
 		switch (cmd[1])
 		{
 			case '\0':
@@ -720,7 +726,8 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	success = describeTableDetails(pattern, show_verbose, show_system);
 else
 	/* standard listing of interesting things */
-	success = listTables("tvmsE", NULL, show_verbose, show_system);
+	success = listTables("tvmsE", NULL, show_verbose, show_system,
+		 false, false);
 break;
 			case 'A':
 success = describeAccessMethods(pattern, show_verbose);
@@ -789,12 +796,20 @@ exec_command_d(PsqlScanState scan_state, bool active_branch, const char *cmd)
 success = describeTypes(pattern, show_verbose, show_system);
 break;
 			case 't':
-			case 'v':
 			case 'm':
 			case 'i':
+if (strlen(cmd) >= 2)
+{
+	if (strchr(&cmd[2], 's') != NULL)
+		sortby = SORTBY_SIZE;
+	sort_desc = strchr(&cmd[2], 'd') ? true : false;
+}
+
+			case 'v':
 			case 's':
 			case 'E':
-success = listTables(&cmd[1], pattern, show_verbose, show_system);
+success = listTables(&cmd[1], pattern, show_verbose, show_system,
+	 sortby, sort_desc);
 break;
 			case 'r':
 if (cmd[2] == 'd' && cmd[3] == 's')
@@ -1655,13 +1670,17 @@ exec_command_list(PsqlScanState scan_state, bool active_branch, const char *cmd)
 	{
 		char	   *pattern;
 		bool		show_verbose;
+		bool		sort_desc;
+		sortby_type	sortby;
 
 		pattern = psql_scan_slash_option(scan_state,
 		 OT_NORMAL, NULL, true);
 
 		show_verbose = strchr(cmd, '+') ? true : false;
+		sortby = strchr(cmd, 's') != NULL ? SORTBY_SIZE : SORTBY_NAME;
+		sort_desc = strchr(cmd, 'd') ? true : false;
 
-		success = listAllDbs(pattern, show_verbose);
+		success = listAllDbs(pattern, show_verbose, sortby, sort_desc);

[HACKERS] strange relcache.c debug message

2017-11-01 Thread Alvaro Herrera
While messing with BRIN bugs, I noticed this debug message in the server
log:

2017-11-01 12:33:24.042 CET [361429] DEBUG:  rehashing catalog cache id 14 for 
pg_opclass; 17 tups, 8 buckets en carácter 194

notice that at the end it says "at character 194".  I suppose that's
because of some leftover errposition() value, but why is it being
reported in this message?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Try to fix endless loop in ecpg with informix mode

2017-11-01 Thread 高增琦
Any comments?

2017-10-26 16:03 GMT+08:00 高增琦 :

> Hi,
>
> I tried some tests with ecpg informix mode.
> When trying to store float data into an integer var, I got an endless loop.
>
> The reason is:
> In informix mode, ecpg can accept the
> string form of a float number when processing a query result.
> While checking the string form of the float number, it seems
> that ecpg forgot to skip the characters after '.',
> so the outer loop never stops because it expects to see '\0'.
>
> The first patch will reproduce the problem in ecpg's regress test.
> The second patch tries to fix it in simple way.
>
> --
> GaoZengqi
> pgf...@gmail.com
> zengqi...@gmail.com
>
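
A minimal sketch of the missing skip described above -- hypothetical C, not
the actual ecpg source:

    #include <ctype.h>

    /*
     * Accept "123" or "123.45" for an integer target: the fractional
     * digits must be consumed too, otherwise the caller's loop never
     * reaches the terminating '\0' and spins forever.
     */
    static const char *
    skip_numeric(const char *p)
    {
        while (isdigit((unsigned char) *p))
            p++;                        /* integer part */
        if (*p == '.')
        {
            p++;
            while (isdigit((unsigned char) *p))
                p++;                    /* fractional part: the step
                                         * reportedly missing in
                                         * informix mode */
        }
        return p;                       /* now at '\0', or at trailing junk */
    }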



-- 
GaoZengqi
pgf...@gmail.com
zengqi...@gmail.com


Re: [HACKERS] Dynamic result sets from procedures

2017-11-01 Thread Pavel Stehule
2017-10-31 22:08 GMT+01:00 Peter Eisentraut <
peter.eisentr...@2ndquadrant.com>:

> This patch is more of a demo of what could be done, not my primary
> focus, but if there is interest and some assistance, maybe we can make
> something out of it.  This patch also goes on top of "SQL procedures"
> version 1.
>
> The purpose is to return multiple result sets from a procedure.  This
> is, I think, a common request when coming from MS SQL and DB2.  MS SQL
> has a completely different procedure syntax, but this proposal is
> compatible with DB2, which as usual was the model for the SQL standard.
> So this is what it can do:
>
> CREATE PROCEDURE pdrstest1()
> LANGUAGE SQL
> AS $$
> DECLARE c1 CURSOR WITH RETURN FOR SELECT * FROM cp_test2;
> DECLARE c2 CURSOR WITH RETURN FOR SELECT * FROM cp_test3;
> $$;
>
> CALL pdrstest1();
>
> and that returns those two result sets to the client.
>
> That's all it does for now.  Things get more complex when you consider
> nested calls.  The SQL standard describes additional facilities how an
> outer procedure can accept a called procedure's result sets, or not.  In
> the thread on transaction control, I mentioned that we might need some
> kind of procedure call stack.  Something like that would be needed here
> as well.  There are also probably some namespacing issues around the
> cursors that need more investigation.
>
> A more mundane issue is how we get psql to print multiple result sets.
> I have included here a patch that does that, and you can see that new
> result sets start popping up in the regression tests already.  There is
> also one error that needs further investigation.
>
> We need to think about how the \timing option should work in such
> scenarios.  Right now it does
>
> start timer
> run query
> fetch result
> stop timer
> print result
>
> If we had multiple result sets, the most natural flow would be
>
> start timer
> run query
> while result sets
> fetch result
> print result
> stop timer
> print time
>
> but that would include the printing time in the total time, which the
> current code explicitly does not.  We could also temporarily save the
> result sets, like
>
> start timer
> run query
> while result sets
> fetch result
> stop timer
> foreach result set
> print result
>
> but that would have a lot more overhead, potentially.
>
> Thoughts?
>

Does the total time make sense in this case?

Shouldn't the total time be related to each fetched result?

Regards

Pavel


> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers
>
>
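
As a client-side illustration of the proposal: a libpq consumer could drain
the multiple result sets roughly like this (a sketch assuming the proposed
server behaviour of one PGresult per returned cursor; the PQsendQuery /
PQgetResult loop itself is standard libpq):

    #include <stdio.h>
    #include <libpq-fe.h>

    static void
    print_all_result_sets(PGconn *conn)
    {
        PGresult   *res;

        if (!PQsendQuery(conn, "CALL pdrstest1()"))
        {
            fprintf(stderr, "%s", PQerrorMessage(conn));
            return;
        }
        /* PQgetResult() hands back one PGresult per result set, then NULL */
        while ((res = PQgetResult(conn)) != NULL)
        {
            if (PQresultStatus(res) == PGRES_TUPLES_OK)
                printf("result set with %d rows\n", PQntuples(res));
            else if (PQresultStatus(res) != PGRES_COMMAND_OK)
                fprintf(stderr, "%s", PQresultErrorMessage(res));
            PQclear(res);
        }
    }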


Re: [HACKERS] Adding column_constraint description in ALTER TABLE synopsis

2017-11-01 Thread Lætitia Avrot
Hi all,

Thanks Stephen for the suggestion. I wasn't thinking globally enough. I was
planning to look at it today, but Amit was faster. So thanks, Amit, too!

Have a nice day (UGT),

Lætitia

2017-11-01 1:35 GMT+01:00 Amit Langote :

> On 2017/10/31 21:31, Stephen Frost wrote:
> > * Lætitia Avrot (laetitia.av...@gmail.com) wrote:
> >> As Amit Langote pointed out, the column_constraint definition is missing
> >> whereas it is used in ALTER TABLE synopsis. It can be easily found in
> the
> >> CREATE TABLE synopsis, but it's not very user friendly.
> >
> > Thanks, this looks pretty reasonable, but did you happen to look for any
> > other keywords in CREATE TABLE that should really be in ALTER TABLE
> > also?
> >
> > I'm specifically looking at, at least, partition_bound_spec.  Maybe you
> > could propose an updated patch which addresses that also, and any other
> > cases you find?
>
> Ah, yes.  I remember having left out partition_bound_spec simply because I
> thought that was how it was supposed to be done, seeing that neither
> column_constraint nor table_constraint was expanded in the ALTER TABLE
> synopsis.
>
> It seems that there are indeed a couple of other things that need to be
> brought over to ALTER TABLE synopsis including partition_bound_spec.
> 9f295c08f877 [1] added table_constraint, but missed adding the description
> of index_parameters and exclude_element which are referenced therein.
>
> Attached find updated version of the Lætitia's patch.
>
> Thanks,
> Amit
>
> [1] https://git.postgresql.org/gitweb/?p=postgresql.git;a=commit;h=9f295c
>
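
(For reference, the fragment being carried over into the ALTER TABLE
synopsis is roughly the following, per the PostgreSQL 10 CREATE TABLE page;
quoted from memory here, so treat it as a sketch rather than the patch text:)

    and partition_bound_spec is:

    IN ( { numeric_literal | string_literal | NULL } [, ...] ) |
    FROM ( { numeric_literal | string_literal | UNBOUNDED } [, ...] )
      TO ( { numeric_literal | string_literal | UNBOUNDED } [, ...] )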


[HACKERS] Commit fest 2017-11

2017-11-01 Thread Michael Paquier
Hi all,

At the moment of writing this email, it is 9PM AoE (Anywhere on Earth)
31st of October. This means that the next commit fest will begin in 3
hours, and that any hackers willing to register patches for this
commit fest have roughly three hours to do so (plus/minus N hours).

The current score is as follows:
Needs review: 125.
Waiting on Author: 22.
Ready for Committer: 39.
This represents a total of 186 patches still pending for review, for
the second largest commit fest ever.

Anybody willing to take the hat of commit fest manager?  If nobody
volunteers, I am fine taking the hat as the default choice this time.

Thanks,
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] WIP: Restricting pg_rewind to data/wal dirs

2017-11-01 Thread Chris Travers
Attached is the patch as submitted for commitfest.

Please note, I am not averse to adding an additional --Include-path
directive if that avoids backwards-compatibility problems.  However, the
patch is complex enough that I would really prefer review of the rest of it
to start first.  This doesn't strike me as perfectly trivial, and I think it
is easier to review the pieces separately.  I have already started on the
--Include-path directive and could probably get it attached to a later
version of the patch very soon.

I would also like to address a couple of important points here:

1.  I think default restrictions plus additional paths is the best, safest
way forward.  Excluding via shell globs doesn't solve the "I need this
particular config file" case very well, particularly if we want to support
this outside of an internal environment.  Shell globs also tend to be overly
inclusive, so if you exclude based on them, you run the risk that your
rewind is corrupt because it was overly exclusive.

2.  I would propose that any additional paths needed be specified using an
--Include-path directive.  This could be specified multiple times and could
point to a file or directory which would be added to the paths rewound.  I
have a patch for this mostly done, but I am concerned that these sorts of
changes result in a combination of changes that are easier to review
separately than together.  So if needed, I can add it here or in a separate
patch following (a hypothetical invocation is sketched after this list).

3.  I think it would be a mistake to tie backup solutions in non-replicated
environments to replication use cases, and vice versa.  Things like
replication slots (used for streaming backups) have different
considerations in different environments.  Rather than build the same
infrastructure first, I think it is better to support different use cases
well and then build common infrastructure to support the different cases.
I am not against building common infrastructure for pg_rewind and
pg_basebackup.  I am very much against having the core guarantees be
exactly the same.
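
A hypothetical invocation of the proposed directive (the --Include-path flag
and its repetition semantics are from point 2 above; nothing here is
implemented yet):

    pg_rewind --target-pgdata=/var/lib/postgresql/10/replica \
              --source-server='host=primary dbname=postgres' \
              --Include-path=postgresql.auto.conf \
              --Include-path=conf.d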

Best Wishes,
Chris Travers

On Sat, Oct 28, 2017 at 1:22 PM, Chris Travers 
wrote:

> Hi;
>
> There are still some cleanup bits needed here but I wanted to get feedback
> on my general direction.
>
> I hope to submit for commit fest soon if the general feedback is good.
> Tests are passing (with adjustments intended for change of behaviour in one
> test script).  I want to note that Crimson Thompson (also with Adjust) has
> provided some valuable discussion on this code.
>
> The Problem:
>
> pg_rewind makes no real guarantees regarding the final state of non-data
> files or directories.  It checks to see if the timeline has incremented
> (and therefore guarantees that if successful the data directories are on
> the same timeline), but non-data files are clobbered if we rewind
> and left intact if not.  These other files include postgresql.auto.conf,
> replication slots, and can include log files.
>
> Copying logs over to the new slave is something one probably never wants
> to do (same with replication slots), and the behaviours here can mask
> troubleshooting regarding why a particular master failed, cause WAL
> segments to build up, change settings automatically, and produce other
> undesirable behaviours.  Together these make pg_rewind very difficult to
> use properly and push tasks onto replication management tooling that the
> management tools are not in a good position to handle correctly.
>
> Designing the Fix:
>
> Two proposed fixes were considered and one selected: the first was to
> whitelist directories and rewind only those; the other was to allow shell
> globs to be provided that could be used to exclude files.  The whitelisting
> solution was chosen for a number of reasons.
>
> When pg_rewind "succeeds" but not quite correctly, the result is usually a
> corrupted installation which requires a base backup to replace it anyway.
> In a recovery situation, sometimes pressures occur which render human
> judgment less effective, and therefore glob exclusion strikes me as
> something which would probably do more harm than good, but maybe I don't
> understand the use case (comments as to why some people see the other
> solution as preferable would be welcome).
>
> In going with the whitelisting solution, we chose to include all
> directories with WAL-related information.  This allows more predictable
> interaction with other parts of the replication chain.  Consequently we
> copy not only pg_wal and pg_xact but also commit timestamps and so forth.
>
> The Solution:
>
> The solution is a specified whitelist of directories, which are the only
> ones synchronised.  The relevant part of this patch is:
>
> +/* List of directories to synchronize:
> + * base data dirs (and tablespaces)
> + * wal/transaction data
> + * and that is it.
> + *
> + * This array is null-terminated to make
> + * it easy to expand
> + */
> +
> +const char *rewind_dirs[] = {
> +"base",
> +"global",
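
Since the array is null-terminated, extending it is a one-line change, and
walking it is the usual idiom (process_directory() is a stand-in name, not
part of the patch):

    const char **dir;

    for (dir = rewind_dirs; *dir != NULL; dir++)
        process_directory(*dir);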

Re: [HACKERS] Patch: restrict pg_rewind to whitelisted directories

2017-11-01 Thread Chris Travers
On Tue, Oct 31, 2017 at 1:38 PM, Robert Haas  wrote:

> On Mon, Oct 30, 2017 at 6:44 PM, Chris Travers 
> wrote:
> > The attached patch is cleaned up and filed for the commit fest this next
> > month:
>
> It's generally better to post the patch on the same message as the
> discussion thread, or at least link back to the discussion thread from
> the new one.  Otherwise people may have trouble understanding the
> history.
>

Fair point.  I was mostly concerned about the WIP marker there.  This is my
first time going through this process.

I will then retire this thread and attach the patch to the other one.

>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Best Regards,
Chris Travers
Database Administrator

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: [HACKERS] Account for cost and selectivity of HAVING quals

2017-11-01 Thread Tels
Hello David,

On Tue, October 31, 2017 7:54 pm, David G. Johnston wrote:
> On Tue, Oct 31, 2017 at 4:31 PM, Tels 
> wrote:
>
>>
>> That looks odd to me, it first uses output_tuples in a formula, then
>> overwrites the value with a new value. Should these lines be swapped?
>>
>
> IIUC it is correct: the additional total_cost comes from processing every
> output group to check whether it is qualified - since every group is
> checked, the incoming output_tuples from the prior grouping is used.  The
> side-effect of the effort is that the number of output_tuples has now been
> reduced to only those matching the qual - and so it now must take on a new
> value to represent this.

Ah, makes sense. Learned something new today.

Maybe it's worth adding a comment, or would everybody besides me
understand it easily by looking at the code? :)
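
In code terms, the ordering David describes is roughly this, in
planner-style C (a sketch of the pattern, not the committed costsize.c
lines):

    /* charge the HAVING qual against every incoming group... */
    startup_cost += qual_cost.startup;
    total_cost += qual_cost.startup + qual_cost.per_tuple * output_tuples;
    /* ...and only then shrink the row count to the qualifying groups */
    output_tuples = clamp_row_est(output_tuples * qual_selectivity);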

Thank you,

Tels


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers