Re: [HACKERS] New partitioning - some feedback

2017-07-15 Thread Mark Kirkwood

On 16/07/17 05:24, David Fetter wrote:


On Fri, Jul 14, 2017 at 09:49:25PM -0500, Robert Haas wrote:

On Mon, Jul 10, 2017 at 5:46 PM, David Fetter  wrote:

With utmost respect, it's less messy than adding '!' to the already
way too random and mysterious syntax of psql's \ commands.  What
should '\det!' mean?  What about '\dT!'?

Since \det lists foreign tables, \det! would list foreign tables even
if they are partitions.  Plain \det would show only the ones that are
not partitions.

\dT! wouldn't be meaningful, since \dT lists data types and data types
can't be partitions.  If you're trying to conjure up a rule that every
\d command must accept the same set of modifiers, a quick
look at the output of \? and a little experimentation will quickly
show you that neither S nor + apply to all command types, so I see no
reason why that would need to be true for a new modifier either.

TBH, I think we should just leave this well enough alone.  We're
post-beta2 now, there's no clear consensus on what to do here, and
there will be very little opportunity for users to give us feedback if
we stick a change into an August beta3 before a September final
release.

I think a new modifier would be too rushed at this stage, but there's
no reason to throw out the progress on \d vs \dt.




+1

And similarly, there seemed to be a reasonably clear push to label the 
'partitions' as such.


regards

Mark




Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-15 Thread Jeff Janes
On Fri, Jul 14, 2017 at 7:48 AM, Vladimir Borodin  wrote:

>
> On 14 July 2017, at 01:33, Stephen Frost wrote:
>
> What would be really nice for such cases is support for Kerberos and
> delegated Kerberos credentials.  Having pgpool support that would remove
> the need to deal with passwords at all.
>
>
> Since nearly all systems with some kind of load nowadays use connection
> poolers (pgpool-II or pgbouncer) between applications and postgres, it is a
> pretty big pain to re-implement all the authentication methods supported by
> postgres in such poolers. Kerberos is cool, but it's not the only thing that
> should be supported by FDWs or connection poolers. For example, many users
> would want to have support for LDAP and SCRAM.
>

For the postgres_fdw, LDAP and SCRAM just work.  In the case of SCRAM (and
MD5), it would be nice if you could store something other than the
plain-text password, but that is a different matter.  If other FDWs connect
to something which can do LDAP or SCRAM, I don't see why those FDWs would
have any difficulty, either.


> And every time there are changes in postgres auth methods, exactly the
> same work (or even worse) has to be done in many (at least two) other
> products widely used by people.
>

That doesn't happen all that often.


>
> It seems that postgres should either provide a connection pooling feature
> in core
>

That would be nice, but since pgpool and pgbouncer co-exist with each
other, I see no reason to think they wouldn't continue to exist even if
there were an in-core pooler.

Cheers,

Jeff


Re: [HACKERS] Pluggable storage

2017-07-15 Thread Peter Geoghegan
On Sat, Jul 15, 2017 at 3:36 PM, Alexander Korotkov
 wrote:
> I think that pruning and vacuum are artifacts not only of the current heap
> format, but also of the current index AM API.  And that is the more
> significant circumstance, given that we're going to preserve compatibility
> of new storage engines with the current index AMs.  Our current index AM API
> assumes that we can delete from an index only in a bulk manner.  The payload
> attached to an index key is a TID, not an arbitrary piece of data.  And that
> payload can't be updated.

I agree that this is a big set of problems. This is where I think we
can get the most benefit.

One nice thing about having a fully logical TID is that you don't have
to bloat indexes just because a HOT update was not possible. You bloat
something else instead, certainly, but that can be optimized for
garbage collection. Not all bloat is equal. MVCC based on UNDO can be
very fast because UNDO is very well optimized for garbage collection,
and so can be bloated with no long term consequences, and minor short
term consequences. Still, it isn't fair to blame the Postgres VACUUM
design for the fact that Postgres bloats indexes so easily. Nothing
stops us from adding a new layer of indirection, so that bloating an
index degrades into something that is not even as bad as bloating the
heap [1]. We may just have a data structure problem, which is not
nearly the same thing as a fundamental design problem in the storage
layer.

This approach could pay off soon if we start with unique indexes,
where there is a "queue" of row versions that can be pruned with simple
logic, temporal locality helps a lot, only zero or one versions can be
visible to your MVCC snapshot, and so on. This might require only minimal
revisions to the index AM API, to help nbtree. We could later improve
this so that you bloat UNDO instead of bloating a heap-like structure,
both for indexes and for the actual heap. That seems less urgent.

To repeat myself, for emphasis: *Not all bloat is equal*. Index bloat
makes the way a B-Tree's keyspace is split up far too granular, leaving
pages sparsely packed, a problem that is more or less *irreversible* by
VACUUM or any garbage collection process [2]. That's how B-Trees work
-- they're optimized for concurrency, not for garbage collection.

>> InnoDB isn't much like the PostgreSQL heap, and
>> neither is SQL Server, IIUC.  If we're creating a heap format that can
>> only be different in trivial ways from what we have now, and anything
>> that really changes the paradigm is off-limits, well, that's not
>> really interesting enough to justify the work of creating a heap
>> storage API.
>
>
> My concern is that we probably can't do anything that really changes the
> paradigm while preserving compatibility with the index AM API.  If you don't
> agree with that, it would be good to provide some examples.  It seems
> unlikely to me that we're going to get something like an InnoDB or SQL Server
> table with our current index AM API.  InnoDB utilizes index-organized tables
> where primary and secondary indexes are versioned independently.  SQL Server
> utilizes a flat data structure similar to our heap, but its MVCC
> implementation also seems very different.

I strongly agree. I simply don't understand how you can adopt UNDO for
MVCC, and yet expect to get a benefit commensurate with the effort
without also implementing "retail index tuple deletion" first.
Pursuing UNDO this way has the same problem that WARM likely has -- it
doesn't really help with the worst case, where users get big,
unpleasant surprises. Postgres is probably the only major database
system that doesn't support retail index tuple deletion. It's a basic
thing that has nothing to do with MVCC. Really, what do we have to
lose?

The biggest weakness of the current design is IMV how it fails to
prevent index bloat in the first place, but avoiding bloating index
leaf pages in the first place doesn't seem incompatible with how
VACUUM works. Or at least, let's not assume that it is. We should
avoid throwing the baby out with the bathwater.

> I think in general there are two ways of dealing with our index AM API
> limitations.  One of them is to extend the index AM API.  At a minimum, we
> would need a method for deleting an individual index tuple (for sure, we
> already have kill_prior_tuple, but it's just a hint for now).

kill_prior_tuple can work well, but, like HOT, it works
inconsistently, in a way that is hard to predict.

> Also, it would be nice to
> have an arbitrary payload in index tuples instead of a TID, and a method to
> update that payload.  But that would be quite a big amount of work.
> Alternatively, we could allow pluggable storages to have their own index
> AMs, which would move that work to the pluggable storage side.

I agree with Robert that being able to store an arbitrary payload as a
TID is probably not going to ever work very well. However, I don't
think that that's a reason to give up on the underlying idea: creating
a new layer of indirection.

Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-15 Thread Julien Rouhaud
On 15/07/2017 17:22, Tom Lane wrote:
> Julien Rouhaud  writes:
>> Since b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755 (Andres in Cc), if you
>> write queries which result in infinite recursion (or just too many
>> nested function calls), execution ends with a segfault instead of the
>> intended "exhausted max_stack_depth" error:
> 
> Yes.  We discussed this before the patch went in [1].

Ah, I totally forgot about it, sorry.

>  I wanted to put
> a stack depth check in ExecProcNode(), and still do.  Andres demurred,
> claiming that that was too much overhead, but didn't really provide a
> credible alternative.  The thread drifted off without any resolution,
> but clearly we need to do something before 10.0 final.
> 

I wanted to add an open item but I see you already did, thanks!

>> Please find attached a trivial patch to fix this.  I'm not sure
>> ExecMakeTableFunctionResult() is the best or only place that needs to
>> check the stack depth.
> 
> I don't think that that's adequate at all.
> 
> I experimented with a variant that doesn't go through
> ExecMakeTableFunctionResult:
> 
> [...]
> This manages not to crash, but I think the reason it does not is purely
> accidental: "SELECT so()" has a nontrivial targetlist so we end up running
> ExecBuildProjectionInfo on that, meaning that a fresh expression
> compilation happens at every nesting depth, and there are
> check_stack_depth calls in expression compilation.  Surely that's
> something we'd try to improve someday.  Or it could accidentally get
> broken by unrelated changes in the way plpgsql sets up queries to be
> executed.
> 
> I still think that we really need to add a check in ExecProcNode().
> Even if there's an argument to be made that every recursion would
> somewhere go through ExecMakeTableFunctionResult, very large/complex
> queries could result in substantial stack getting chewed up before
> we get to that --- and we don't have an infinite amount of stack slop.

I was already pretty sceptical about checking the depth only in
ExecMakeTableFunctionResult() rather than in ExecProcNode(), and this
has finished convincing me, so I definitely agree.

-- 
Julien Rouhaud
http://dalibo.com - http://dalibo.org




Re: [HACKERS] More flexible LDAP auth search filters?

2017-07-15 Thread Thomas Munro
On Fri, Jul 14, 2017 at 11:04 PM, Magnus Hagander  wrote:
> On Thu, Jul 13, 2017 at 9:31 AM, Thomas Munro
>  wrote:
>> A post on planet.postgresql.org today reminded me that a colleague had
>> asked me to post this POC patch here for discussion.  It allows custom
>> filters with ldapsearchprefix and ldapsearchsuffix.  Another approach
>> might be to take a filter pattern with "%USERNAME%" or whatever in it.
>> There's an existing precedent for the prefix and suffix approach, but
>> on the other hand a pattern approach would allow filters where the
>> username is inserted more than once.
>
>
> Do we even need prefix/suffix? If we just make it "ldapsearchpattern", then
> you could have something like:
>
> ldapsearchattribute="uid"
> ldapsearchfilter="|(memberof=cn=Paris DBA Team)(memberof=cn=Tokyo DBA Team)"
>
> We could then always do substitution of the kind:
> (&(<attribute>=<username>)(<filter>))
>
> which would in this case give:
> (&(uid=mha)(|(memberof=cn=Paris DBA Team)(memberof=cn=Tokyo DBA Team)))
>
>
> Basically we'd always AND together the username lookup with the additional
> filter.

Ok, so we have 3 ideas put forward:

1.  Wrap the username with ldapsearchprefix and ldapsearchsuffix to build
the filter (as implemented by the POC patch)
2.  Optionally AND ldapsearchfilter with the existing
ldapsearchattribute-based filter (Magnus's proposal)
3.  Pattern-based ldapsearchfilter, so that %USER% is replaced with the
username (my other suggestion)

The main argument for approach 1 is that it follows the style of the
bind-only mode.

With idea 2, I wonder if there are some more general kinds of things
that people might want to do that wouldn't be possible because the
filter has to include (attribute=user)... perhaps something involving a
substring or other transformation functions (but I'm no LDAP expert, so
that may not make sense).

Idea 3 would allow "(|(foo=%USER%)(bar=%USER%))", though I don't know
whether any such multiple-mention filters would ever make sense in a
sane LDAP configuration.
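
For concreteness, here is roughly what ideas 1 and 3 could look like as
pg_hba.conf lines.  This is a sketch only: the option names are this
thread's proposals rather than released syntax, and the server and DNs
are made up:

# Idea 1: filter built as ldapsearchprefix + username + ldapsearchsuffix
host all all 0.0.0.0/0 ldap ldapserver=ldap.example.net ldapbasedn="dc=example,dc=net" ldapsearchprefix="(&(uid=" ldapsearchsuffix=")(memberof=cn=dbusers,dc=example,dc=net))"

# Idea 3: a single pattern in which every %USER% is replaced by the username
host all all 0.0.0.0/0 ldap ldapserver=ldap.example.net ldapbasedn="dc=example,dc=net" ldapsearchfilter="(|(uid=%USER%)(mail=%USER%))"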

Any other views from LDAP-users?

-- 
Thomas Munro
http://www.enterprisedb.com




Re: [HACKERS] Pluggable storage

2017-07-15 Thread Alexander Korotkov
On Sat, Jul 15, 2017 at 5:14 AM, Robert Haas  wrote:

> On Thu, Jun 22, 2017 at 9:30 AM, Alexander Korotkov
>  wrote:
> > If #1 is only about changing tuple and page formats, then it could be much
> > simpler than the patch upthread?  We can implement "page format access
> > methods" with routines for insertion, update, pruning and deletion of
> > tuples *in a particular page*.  There is no need to redefine high-level
> > logic for scanning the heap, doing updates and so on...
>
> That assumes that every tuple format does those things in the same
> way, which I suspect is not likely to be the case.  I think that
> pruning and vacuum are artifacts of the current heap format, and they
> may be nonexistent or take some altogether different form in some
> other storage engine.


I think that pruning and vacuum are artifacts not only of the current heap
format, but also of the current index AM API.  And that is the more
significant circumstance, given that we're going to preserve compatibility
of new storage engines with the current index AMs.  Our current index AM
API assumes that we can delete from an index only in a bulk manner.  The
payload attached to an index key is a TID, not an arbitrary piece of data.
And that payload can't be updated.

> InnoDB isn't much like the PostgreSQL heap, and
> neither is SQL Server, IIUC.  If we're creating a heap format that can
> only be different in trivial ways from what we have now, and anything
> that really changes the paradigm is off-limits, well, that's not
> really interesting enough to justify the work of creating a heap
> storage API.


My concern is that we probably can't do anything that really changes the
paradigm while preserving compatibility with the index AM API.  If you
don't agree with that, it would be good to provide some examples.  It
seems unlikely to me that we're going to get something like an InnoDB or
SQL Server table with our current index AM API.  InnoDB utilizes
index-organized tables, where primary and secondary indexes are versioned
independently.  SQL Server utilizes a flat data structure similar to our
heap, but its MVCC implementation also seems very different.

I think in general there are two ways of dealing with our index AM API
limitations.  One of them is to extend the index AM API.  At a minimum, we
would need a method for deleting an individual index tuple (for sure, we
already have kill_prior_tuple, but it's just a hint for now).  Also, it
would be nice to have an arbitrary payload in index tuples instead of a
TID, and a method to update that payload.  But that would be quite a big
amount of work.  Alternatively, we could allow pluggable storages to have
their own index AMs, which would move that work to the pluggable storage
side.
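
To illustrate the first option, a retail-deletion method could look
something like the sketch below.  The name and signature are invented here
purely for illustration -- this is not a worked-out proposal:

/*
 * Hypothetical addition to IndexAmRoutine (cf. src/include/access/amapi.h).
 * Where ambulkdelete must scan the entire index, this callback would locate
 * and remove the single index entry matching the given key values and heap
 * TID, returning true if such an entry was found.
 */
typedef bool (*amdeletetuple_function) (Relation indexRelation,
                                        Datum *values,
                                        bool *isnull,
                                        ItemPointer heap_tid);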

What we want to avoid is a storage API that is as invasive as something
paradigm-changing, yet actually allows only trivial changes to the heap
format.  A mechanical replacement of heap methods with storage API methods
could lead us there.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


[HACKERS] Something for the TODO list: deprecating abstime and friends

2017-07-15 Thread Tom Lane
The types abstime, reltime, and tinterval need to go away, or be
reimplemented, sometime well before 2038 when they will overflow.
It's not too soon to start having a plan for that, especially seeing
that it seems to take a decade or more for us to actually get rid
of anything we've deprecated.

Right offhand, I don't think there is any functionality in these
types that isn't handled as well or better by timestamptz, interval,
and tstzrange respectively.  And they're basically undocumented
except for a sort-of deprecation notice just above section 8.5.1.
So my inclination is to remove them rather than try to upgrade them
in any way.  However, we'd have to do something about:

* The legacy system views pg_shadow and pg_user have abstime columns.
Experimentation suggests that we could convert those to timestamptz(0)
and the output format wouldn't change, so maybe that's a good enough
fix there.

* contrib/spi/timetravel depends on abstime columns to represent what
would nowadays be better done as a tstzrange.  I'd have thought we
could maybe toss that example module overboard, but we just today got
a question about its usage, so I'm afraid it's not quite dead yet.
What shall we do with that?

While it's too late in the v10 cycle to do anything very meaningful
about this now, I am tempted to strengthen the deprecation notice's
wording from "might disappear" to "will disappear".  And it's not good
that the documentation of contrib/spi/timetravel contains nothing
whatever pointing out the lack of future-proof-ness of abstime.

Thoughts?

regards, tom lane




Re: [HACKERS] More race conditions in logical replication

2017-07-15 Thread Alvaro Herrera
After going over this a bunch more times, I found other problems.  For
example, I noticed that if I create a temporary slot in one session,
another session rightly goes to sleep if it tries to drop that slot.
But the slot goes away when the first session disconnects, so I'd
expect the sleeping session to get a signal and wake up, but that
doesn't happen.

I'm going to continue to look at this and report back Tuesday 18th.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] New partitioning - some feedback

2017-07-15 Thread David Fetter
On Fri, Jul 14, 2017 at 09:49:25PM -0500, Robert Haas wrote:
> On Mon, Jul 10, 2017 at 5:46 PM, David Fetter  wrote:
> > With utmost respect, it's less messy than adding '!' to the already
> > way too random and mysterious syntax of psql's \ commands.  What
> > should '\det!' mean?  What about '\dT!'?
> 
> Since \det lists foreign tables, \det! would list foreign tables even
> if they are partitions.  Plain \det would show only the ones that are
> not partitions.
> 
> \dT! wouldn't be meaningful, since \dT lists data types and data types
> can't be partitions.  If you're trying to conjure up a rule that every
> \d command must accept the same set of modifiers, a quick
> look at the output of \? and a little experimentation will quickly
> show you that neither S nor + apply to all command types, so I see no
> reason why that would need to be true for a new modifier either.
> 
> TBH, I think we should just leave this well enough alone.  We're
> post-beta2 now, there's no clear consensus on what to do here, and
> there will be very little opportunity for users to give us feedback if
> we stick a change into an August beta3 before a September final
> release.

I think a new modifier would be too rushed at this stage, but there's
no reason to throw out the progress on \d vs \dt.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-15 Thread Tom Lane
I wrote:
> I still think that we really need to add a check in ExecProcNode().

Actually ... to what extent could a check in ExecInitNode() substitute
for that?  Or do we need both?  How about ExecEndNode() and ExecReScan()?

You could argue that the latter two tree traversals are unlikely either to
consume more stack than ExecInitNode() or to be called from a more deeply
nested point.  So maybe they're okay.  But I'm not sure I believe that
initialization stack needs always exceed execution stack needs, though
sometimes they might.

regards, tom lane




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-15 Thread Tom Lane
Julien Rouhaud  writes:
> Since b8d7f053c5c2bf2a7e8734fe3327f6a8bc711755 (Andres in Cc), if you
> write queries which result in infinite recursion (or just too many
> nested function calls), execution ends with a segfault instead of the
> intended "exhausted max_stack_depth" error:

Yes.  We discussed this before the patch went in [1].  I wanted to put
a stack depth check in ExecProcNode(), and still do.  Andres demurred,
claiming that that was too much overhead, but didn't really provide a
credible alternative.  The thread drifted off without any resolution,
but clearly we need to do something before 10.0 final.

> Please find attached a trivial patch to fix this.  I'm not sure
> ExecMakeTableFunctionResult() is the best or only place that needs to
> check the stack depth.

I don't think that that's adequate at all.

I experimented with a variant that doesn't go through
ExecMakeTableFunctionResult:

CREATE OR REPLACE FUNCTION so()
 RETURNS int
 LANGUAGE plpgsql
AS $$
DECLARE
 rec RECORD;
BEGIN
-- each execution of "SELECT so()" recurses without bound
FOR rec IN SELECT so() as x
LOOP
RETURN rec.x;
END LOOP;
RETURN NULL;
END;
$$;

SELECT so();

This manages not to crash, but I think the reason it does not is purely
accidental: "SELECT so()" has a nontrivial targetlist so we end up running
ExecBuildProjectionInfo on that, meaning that a fresh expression
compilation happens at every nesting depth, and there are
check_stack_depth calls in expression compilation.  Surely that's
something we'd try to improve someday.  Or it could accidentally get
broken by unrelated changes in the way plpgsql sets up queries to be
executed.

I still think that we really need to add a check in ExecProcNode().
Even if there's an argument to be made that every recursion would
somewhere go through ExecMakeTableFunctionResult, very large/complex
queries could result in substantial stack getting chewed up before
we get to that --- and we don't have an infinite amount of stack slop.
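
The principle behind such a check is cheap enough to demonstrate in
isolation.  Here is a standalone C sketch of the address-arithmetic trick
that check_stack_depth() relies on -- illustrative only, not backend code:

#include <stdio.h>
#include <stdlib.h>

static char *stack_base;
static long max_stack_bytes = 2L * 1024 * 1024;	/* mimics the 2MB default */

static void
check_stack(void)
{
	char		here;
	long		used = (long) (stack_base - &here);

	if (used < 0)
		used = -used;			/* the stack may grow up or down */
	if (used > max_stack_bytes)
	{
		fprintf(stderr, "ERROR: stack depth limit exceeded\n");
		exit(1);
	}
}

/* stand-in for the infinitely recursive executor path */
static int
so(int n)
{
	char		pad[1024];		/* fatten each frame so the demo trips quickly */

	check_stack();
	pad[0] = (char) n;
	return so(n + 1) + pad[0];	/* "+ pad[0]" defeats tail-call optimization */
}

int
main(void)
{
	char		base;

	stack_base = &base;
	printf("%d\n", so(0));		/* never reached: check_stack() exits first */
	return 0;
}

Checking in ExecProcNode() would then amount to one such comparison at the
top of the function, which is the overhead Andres was worried about.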

regards, tom lane

[1] https://www.postgresql.org/message-id/22833.1490390...@sss.pgh.pa.us




Re: [HACKERS] Adding -E switch to pg_dumpall

2017-07-15 Thread Michael Paquier
On Fri, Jul 14, 2017 at 12:39 PM, Michael Paquier
 wrote:
> While looking at a user problem, I was surprised to find that pg_dumpall
> does not have a -E switch. This has been discussed a bit in the past, for
> example here:
> https://www.postgresql.org/message-id/75e4c42d37e6a74e9fb57c2e9f1300d601070...@tiger.nexperience.com
>
> It is currently possible to enforce the encoding used by setting
> PGCLIENTENCODING, but I think that a switch would be useful as well,
> particularly on Windows, where "set" needs to be used. Are there any
> objections to a patch adding support for that? Such a patch should:
> - Call PQsetClientEncoding after getting a connection, if an encoding is
> specified.
> - Pass -E down to the underlying pg_dump calls.
>
> Thoughts?

Attached is a patch to add support for this switch. I am parking that
in the next CF.
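
For reference, the guts of the first bullet come down to very little code.
A minimal sketch (the variable and option plumbing here are assumptions,
not the patch itself):

	/* After connecting: honor a new -E/--encoding switch, if given. */
	if (dumpencoding != NULL &&
		PQsetClientEncoding(conn, dumpencoding) < 0)
	{
		fprintf(stderr, "%s: invalid client encoding \"%s\" specified\n",
				progname, dumpencoding);
		exit(1);
	}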
-- 
Michael


Attachment: pgdumpall-encoding-v1.patch (binary data)



Re: [HACKERS] hash index on unlogged tables doesn't behave as expected

2017-07-15 Thread Michael Paquier
On Sat, Jul 15, 2017 at 6:27 AM, Ashutosh Sharma  wrote:
> I do agree with Amit. I think the hash index is slightly trickier (in
> terms of how it is organised) than other indexes, and that could be the
> reason for maintaining common code for hashbuild and hashbuildempty.

Well, you both and Robert worked more on this code for PG10 than I
did, so I am fine relying on your judgement for the final result.
Still, I find this special handling quite surprising. All other AMs
just always log FPWs (full-page writes) for the init fork pages, so
I'd rather not break this treaty, but that's one against the world as
things stand currently on this thread ;)
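
For comparison, the pattern the other AMs follow (paraphrased from memory
from nbtree's btbuildempty; details may not match the tree exactly) is to
write the init-fork page and unconditionally emit a full-page WAL record
for it:

	metapage = (Page) palloc(BLCKSZ);
	_bt_initmetapage(metapage, P_NONE, 0);

	PageSetChecksumInplace(metapage, BTREE_METAPAGE);
	smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
			  (char *) metapage, true);
	log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
				BTREE_METAPAGE, metapage, true);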
-- 
Michael




Re: [HACKERS] SCRAM auth and Pgpool-II

2017-07-15 Thread Michael Paquier
On Sat, Jul 15, 2017 at 12:15 AM, Stephen Frost  wrote:
> While this might be possible by having some kind of special trusted
> connection between the pooler and PG, I actually don't particularly like
> the notion of inventing a bunch of complicated logic and pain so that a
> connection pooler can avoid implementing a proper authentication system.

I strongly agree with that.
-- 
Michael

