Re: [HACKERS] Something for the TODO list: deprecating abstime and friends

2017-07-18 Thread Mark Dilger

> On Jul 18, 2017, at 9:13 PM, Mark Dilger  wrote:
> 
>> 
>> On Jul 17, 2017, at 2:34 PM, Mark Dilger  wrote:
>> 
>>> 
>>> On Jul 17, 2017, at 2:14 PM, Tom Lane  wrote:
>>> 
>>> Mark Dilger  writes:
> On Jul 17, 2017, at 12:54 PM, Mark Dilger  wrote:
 On Jul 15, 2017, at 3:00 PM, Tom Lane  wrote:
>> The types abstime, reltime, and tinterval need to go away, or be
>> reimplemented, sometime well before 2038 when they will overflow.
>>> 
> These types provide a 4-byte datatype for storing real-world second
> precision timestamps, as occur in many log files.  Forcing people to
> switch to timestamp or timestamptz will incur a 4 byte per row
> penalty.  In my own builds, I have changed the epoch on these so
> they won't wrap until sometime after 2100 C.E.  I see little point in
> switching to an 8-byte millisecond precision datatype when a perfectly
> good 4-byte second precision datatype already serves the purpose.
>>> 
>>> Well, if you or somebody is willing to do the legwork, I'd be on board
>>> with a plan that says that every 68 years we redefine the origin of
>>> abstime.  I imagine it could be done so that currently-stored abstime
>>> values retain their present meaning as long as they're not too old.
>>> For example the initial change would toss abstimes before 1970 overboard,
>>> repurposing that range of values as being 2038-2106, but values between
>>> 1970 and 2038 still mean the same as they do today.  If anybody still
>>> cares in circa 2085, we toss 1970-2038 overboard and move the origin
>>> again, lather rinse repeat.
>>> 
>>> But we're already past the point where it would be time to make the
>>> first such switch, if we're gonna do it.  So I'd like to see somebody
>>> step up to the plate sooner not later.
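For illustration, a minimal sketch of the arithmetic behind the origin-shifting plan quoted above, assuming abstime stays a 32-bit count of seconds on disk; the helper name is invented and this is not code from any posted patch.

/*
 * Hypothetical sketch only.  Today an abstime datum is a signed 32-bit count
 * of seconds since 1970 (covering roughly 1901..2038).  The first proposed
 * origin shift re-purposes the pre-1970 (negative) range as 2038..2106,
 * which is the same as reading the stored bits as unsigned.
 */
#include <stdint.h>

static int64_t
abstime_to_unix_seconds(int32_t stored)
{
	return (int64_t) (uint32_t) stored;	/* 0 .. 2^32-1 seconds since 1970 */
}

/*
 * A later shift (circa 2085, per the plan above) would add a fixed offset on
 * top of this, sliding the ~136-year decode window forward again.
 */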
>> 
>> Assuming other members of the community would not object to such
>> a plan, I'd be willing to step up to that plate.  I'll wait a respectable time,
>> maybe until tomorrow, to allow others to speak up.
> 
> There was not much conversation about this, so I went ahead with what
> I think makes a logical first step.  The attached patch removes the tinterval
> datatype from the sources.
> 
> I intend to remove reltime next, and then make the changes to abstime in
> a third patch.

As predicted, this second patch (which should be applied *after* the prior
tinterval_abatement patch) removes the reltime datatype from the sources.

mark



reltime_abatement.patch.1
Description: Binary data



Re: [HACKERS] merge psql ef/ev sf/sv handling functions

2017-07-18 Thread Fabien COELHO



While reviewing Corey's \if patch, I complained that there was some amount
of copy-paste in "psql/command.c".

Here is an attempt at merging some functions which removes 160 lines of
code.


Thank you for the patch. Is this an item for PG11?


Yes, as it is submitted to CF 2017-09.

--
Fabien.




Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Yugo Nagata
On Tue, 18 Jul 2017 10:10:49 -0400
Tom Lane  wrote:

Thank you for your comments. I understand the problem with my proposed
patch.

> Andres Freund  writes:
> > On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
> >> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
> 
> > Not sure if that really does that much to solve the concern.
> 
> Well, it reduces the amount of data churn that a statement shorter than
> PGSTAT_STAT_INTERVAL could cause.
> 
> > Another,
> > pretty half-baked, approach would be to add a procsignal triggering idle
> > backends to send stats, and send that to all idle backends when querying
> > stats. We could even publish the number of outstanding stats updates in
> > PGXACT or such, without any locking, and send it only to those that have
> > outstanding ones.
> 
> If somebody wanted to do the work, that'd be a viable answer IMO.  You'd
> really want to not wake backends that have nothing more to send, but
> I agree that it'd be possible to advertise that in shared memory.
> 
>   regards, tom lane


-- 
Yugo Nagata 




Re: [HACKERS] Something for the TODO list: deprecating abstime and friends

2017-07-18 Thread Mark Dilger

> On Jul 17, 2017, at 2:34 PM, Mark Dilger  wrote:
> 
>> 
>> On Jul 17, 2017, at 2:14 PM, Tom Lane  wrote:
>> 
>> Mark Dilger  writes:
 On Jul 17, 2017, at 12:54 PM, Mark Dilger  wrote:
>>> On Jul 15, 2017, at 3:00 PM, Tom Lane  wrote:
> The types abstime, reltime, and tinterval need to go away, or be
> reimplemented, sometime well before 2038 when they will overflow.
>> 
 These types provide a 4-byte datatype for storing real-world second
 precision timestamps, as occur in many log files.  Forcing people to
 switch to timestamp or timestamptz will incur a 4 byte per row
 penalty.  In my own builds, I have changed the epoch on these so
 they won't wrap until sometime after 2100 C.E.  I see little point in
 switching to an 8-byte millisecond precision datatype when a perfectly
 good 4-byte second precision datatype already serves the purpose.
>> 
>> Well, if you or somebody is willing to do the legwork, I'd be on board
>> with a plan that says that every 68 years we redefine the origin of
>> abstime.  I imagine it could be done so that currently-stored abstime
>> values retain their present meaning as long as they're not too old.
>> For example the initial change would toss abstimes before 1970 overboard,
>> repurposing that range of values as being 2038-2106, but values between
>> 1970 and 2038 still mean the same as they do today.  If anybody still
>> cares in circa 2085, we toss 1970-2038 overboard and move the origin
>> again, lather rinse repeat.
>> 
>> But we're already past the point where it would be time to make the
>> first such switch, if we're gonna do it.  So I'd like to see somebody
>> step up to the plate sooner not later.
> 
> Assuming other members of the community would not object to such
> a plan, I'd be willing to step up to that plate.  I'll wait a respectable time,
> maybe until tomorrow, to allow others to speak up.

There was not much conversation about this, so I went ahead with what
I think makes a logical first step.  The attached patch removes the tinterval
datatype from the sources.

I intend to remove reltime next, and then make the changes to abstime in
a third patch.

mark



tinterval_abatement.patch.1
Description: Binary data





Re: [HACKERS] [GENERAL] huge RAM use in multi-command ALTER of table hierarchy

2017-07-18 Thread Craig Ringer
On 19 July 2017 at 07:26, Tom Lane  wrote:

> Justin Pryzby  writes:
> > I've seen this before while doing SET STATISTICS on a larger number of
> > columns using xargs, but just came up while doing ADD of a large number of
> > columns.
> > Seems to be roughly linear in number of children but superlinear WRT
> > columns.
> > I think having to do with catalog update / cache invalidation with many
> > ALTERs * children * columns?
>
> I poked into this a bit.  The operation is necessarily roughly O(N^2) in
> the number of columns, because we rebuild the affected table's relcache
> entry after each elementary ADD COLUMN operation, and one of the principal
> components of that cost is reading all the pg_attribute entries.  However,
> that should only be a time cost not a space cost.  Eventually I traced the
> O(N^2) space consumption to RememberToFreeTupleDescAtEOX, which seems to
> have been introduced in Simon's commit e5550d5fe, and which strikes me as
> a kluge of the first magnitude.  Unless I am missing something, that
> function's design concept can fairly be characterized as "let's leak
> memory like there's no tomorrow, on the off chance that somebody somewhere
> is ignoring basic coding rules".
>
> I tried ripping that out, and the peak space consumption of your example
> (with 20 child tables and 1600 columns) decreased from ~3GB to ~200MB.
> Moreover, the system still passes make check-world, so it's not clear
> to me what excuse this code has to live.
>
> It's probably a bit late in the v10 cycle to be taking any risks in
> this area, but I'd vote for ripping out RememberToFreeTupleDescAtEOX
> as soon as the v11 cycle opens, unless someone can show an example
> of non-broken coding that requires it.  (And if so, there ought to
> be a regression test incorporating that.)


Just FYI, I believe Simon's currently on holiday, so may not notice this
discussion as promptly as usual.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] Re: Oddity in error handling of constraint violation in ExecConstraints for partitioned tables

2017-07-18 Thread Amit Langote
On 2017/07/18 16:20, Etsuro Fujita wrote:
> On 2017/07/18 11:03, Robert Haas wrote:
>> On Mon, Jul 10, 2017 at 5:44 PM, Robert Haas  wrote:
>>> The posted patches look OK to me.  Barring developments, I will commit
>>> them on 2017-07-17, or send another update by then.
>>
>> Committed them.
> 
> Thank you!

Thank you both.

Regards,
Amit





Re: [HACKERS] merge psql ef/ev sf/sv handling functions

2017-07-18 Thread Masahiko Sawada
On Sat, Apr 1, 2017 at 3:04 AM, Fabien COELHO  wrote:
>
> Hello,
>
> While reviewing Corey's \if patch, I complained that there was some amount
> of copy-paste in "psql/command.c".
>
> Here is an attempt at merging some functions which removes 160 lines of
> code.
>

Thank you for the patch. Is this an item for PG11?

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center




Re: [HACKERS] [GENERAL] huge RAM use in multi-command ALTER of table hierarchy

2017-07-18 Thread Tom Lane
Justin Pryzby  writes:
> I've seen this before while doing SET STATISTICS on a larger number of columns
> using xargs, but just came up while doing ADD of a large number of columns.
> Seems to be roughly linear in number of children but superlinear WRT columns.
> I think having to do with catalog update / cache invalidation with many
> ALTERs * children * columns?

I poked into this a bit.  The operation is necessarily roughly O(N^2) in
the number of columns, because we rebuild the affected table's relcache
entry after each elementary ADD COLUMN operation, and one of the principal
components of that cost is reading all the pg_attribute entries.  However,
that should only be a time cost not a space cost.  Eventually I traced the
O(N^2) space consumption to RememberToFreeTupleDescAtEOX, which seems to
have been introduced in Simon's commit e5550d5fe, and which strikes me as
a kluge of the first magnitude.  Unless I am missing something, that
function's design concept can fairly be characterized as "let's leak
memory like there's no tomorrow, on the off chance that somebody somewhere
is ignoring basic coding rules".

I tried ripping that out, and the peak space consumption of your example
(with 20 child tables and 1600 columns) decreased from ~3GB to ~200MB.
Moreover, the system still passes make check-world, so it's not clear
to me what excuse this code has to live.

It's probably a bit late in the v10 cycle to be taking any risks in
this area, but I'd vote for ripping out RememberToFreeTupleDescAtEOX
as soon as the v11 cycle opens, unless someone can show an example
of non-broken coding that requires it.  (And if so, there ought to
be a regression test incorporating that.)

regards, tom lane




Re: [HACKERS] More flexible LDAP auth search filters?

2017-07-18 Thread Mark Cave-Ayland
On 17/07/17 18:08, Magnus Hagander wrote:

> On Mon, Jul 17, 2017 at 6:47 PM, Mark Cave-Ayland wrote:
> Great to hear from you! It has definitely been a while...
> 
> Indeed. You should spend more time on these lists :P

Well I do get the emails, unfortunately it's trying to find the time to
read them all ;)

> > How would that actually work, though? Given the same user in ldap could
> > now potentially match multiple different users in PostgreSQL. Would you
> > then create two accounts for the user, one with the uid as name and one
> > with email as name? Wouldn't that actually cause more issues than it
> > solves?
> 
> Normally what happens for search+bind is that you execute the custom
> filter with substitutions in order to find the entry for your bind DN,
> but you also request the value of ldapsearchattribute (or equivalent) at
> the same time. Say for example you had an entry like this:
> 
> dn: uid=mha,dc=users,dc=hagander,dc=net
> uid: mha
> mail: mag...@hagander.net 
> 
> Using the filter "(|(mail=%u)(uid=%u))" would locate the same bind DN
> "uid=mha,dc=users,dc=hagander,dc=net" with either one of your uid or
> email address.
> 
> If the bind is successful then the current user identity should be set
> to the value of the ldapsearchattribute retrieved from the bind DN
> entry, which with the default of "uid" would be "mha". Hence you end up
> with the same user role "mha" regardless of whether a uid or email
> address was entered.
> 
> 
> Right. This is the part that doesn't work for PostgreSQL. Because we
> have already specified the username -- it goes in the startup packet in
> order to pick the correct row from pg_hba.conf.

I don't think that's necessarily going to be an issue here because if
you're specifying a custom LDAP filter then there's a very good chance
that you're delegating access control to information held in the
directory anyway, e.g.

  (&(memberOf=cn=pgusers,dc=groups,dc=hagander,dc=net)(uid=%u))

  (&(uid=%u)(|(uid=mha)(uid=mark)(uid=thomas)))

In the mail example above, when you're using more than one attribute, I
think it's fair enough to simply say in the documentation that you must set
user to "all" in pg_hba.conf, since it is impossible to know which
attribute is being used to identify the user role until after
authentication.

I should add that personally I don't recommend such setups where the
user can log in using more than one identifier, but there are clients
out there who absolutely will insist on it (think internal vs. external
users) so if the LDAP support is being updated then it's worth exploring
to see if these cases can be supported.

> I guess in theory we could treat it like Kerberos or another one of the
> systems where we get the username from an external entity. But then
> you'd still have to specify the mapping again in pg_ident.conf, and it
> would be a pretty strong break of backwards compatibility.

(goes and glances at the code)

Is there no reason why you couldn't just overwrite port->user_name based
upon ldapsearchattribute at the end of CheckLDAPAuth?

And if this were only enabled when a custom filter were specified then
it shouldn't cause any backwards compatibility issues being a new feature?
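To make the idea concrete, here is a hypothetical sketch, not actual auth.c code, of what overwriting port->user_name could look like; the function name, the search_attr_value parameter, and the ldapsearchfilter HbaLine field are all assumptions for illustration.

/*
 * Hypothetical sketch.  After a successful search+bind with a custom filter,
 * replace the user name taken from the startup packet with the value of
 * ldapsearchattribute retrieved from the matched entry.  "ldapsearchfilter"
 * is assumed to be a new HbaLine field enabling custom filters.
 */
static void
substitute_ldap_username(Port *port, const char *search_attr_value)
{
	/* only when a custom filter is in use, to preserve existing behaviour */
	if (port->hba->ldapsearchfilter == NULL)
		return;

	pfree(port->user_name);
	port->user_name = pstrdup(search_attr_value);
}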

> In terms of matching multiple users, all LDAP authentication routines
> I've seen will fail if more than one DN matches the search filter, so
> this nicely handles the cases where either the custom filter is
> incorrect or more than one entry is accidentally matched in the
> directory.
> 
> So do we, in the current implementation. But it's a lot less likely to
> happen in the current implementation, since we do a single equals lookup.

Great, that's absolutely fine :)


ATB,

Mark.




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alvaro Herrera
Alexander Korotkov wrote:

> The problem is that you need to have not only opclass entries for the
> operators, but also operators themselves.  I.e. separate operators for
> int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
> tried to add multiple pg_amop rows for single operator and consequently get
> unique index violation.
> 
> Alvaro, do you think we need to define all these operators?  I'm not sure.
> Even if we need it, I think we shouldn't do this during this GSoC.  What
> particular shortcomings do you see in explicit cast in RI triggers queries?

I'm probably confused.  Why did we add an operator and not a support
procedure?  I think we should have added rows in pg_amproc, not
pg_amop.  I'm very tired right now so I may be speaking nonsense.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alvaro Herrera
Mark Rofail wrote:
> On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
> wrote:
> 
> >  separate operators for int4[] @>> int8, int4[] @>> int4, int4[] @>> int2,
> > int4[] @>> numeric.
> >
> 
> My only comment on the separate operators is their high maintenance: for any
> new datatype introduced, a corresponding operator would have to be created.

Yes.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Tom Lane
Andres Freund  writes:
> ...  If we were to go this route we'd have to probably move
> the callback assignment into the ExecInit* routines, and possibly
> replace the switch in ExecInitNode() with another callback, assigned in
> make_*, and implement callbacks for ExecShutdown, MultiExecProcNode etc.

BTW, I don't see why you really need to mess with anything except
ExecProcNode?  Surely the other cases such as MultiExecProcNode are
not called often enough to justify changing them away from the
switch technology.  Yeah, maybe it would be a bit cleaner if they
all looked alike ... but if you're trying to make a patch that's
as little invasive as possible for v10, I'd suggest converting just
ExecProcNode to this style.

regards, tom lane
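For concreteness, a rough sketch of the callback-dispatch idea under discussion, assuming a function-pointer field is added to PlanState; the field and typedef names are illustrative, not taken from a committed patch.

/*
 * Hypothetical sketch: dispatch per node through a callback assigned once in
 * ExecInitNode(), instead of the big switch in ExecProcNode().
 */
typedef struct PlanState PlanState;
typedef TupleTableSlot *(*ExecProcNodeCB) (PlanState *pstate);

struct PlanState
{
	NodeTag		type;
	ExecProcNodeCB exec_cb;		/* assigned once, in ExecInitNode() */
	/* ... remaining executor state fields ... */
};

static inline TupleTableSlot *
ExecProcNode(PlanState *node)
{
	/* a stack depth check could live here, once per node invocation */
	check_stack_depth();
	return node->exec_cb(node);
}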




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-07-18 15:45:53 -0400, Tom Lane wrote:
>> While I'm uncomfortable with making such a change post-beta2, I'm one
>> heck of a lot *more* uncomfortable with shipping v10 with no stack depth
>> checks in the executor mainline.  Fleshing this out and putting it in
>> v10 might be an acceptable compromise.

> Ok, I'll flesh out the patch till Thursday.  But I do think we're going
> to have to do something about the back branches, too.

I'm not worried about the back branches.  The stack depth checks have
been in those same places for ten years at least, and there are no field
reports of trouble.

regards, tom lane




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Andres Freund
On 2017-07-18 15:45:53 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > Attached is a trivial patch implementing 1) and a proof-of-concept hack
> > for 2).
> 
> The comment you made previously, as well as the proposed commit message
> for 0001, suggest that you've forgotten that pre-v10 execQual.c had
> several check_stack_depth() calls.  Per its header comment,

>  *During expression evaluation, we check_stack_depth only in
>  *ExecMakeFunctionResult (and substitute routines) rather than at every
>  *single node.  This is a compromise that trades off precision of the
>  *stack limit setting to gain speed.

No, I do remember that fact. But
a) unfortunately that's not really that useful because by far not all
   function calls go through ExecMakeFunctionResult()
   (e.g. ExecEvalDistinct() and a bunch of other FunctionCallInvoke()
   containing functions).
b) deeply nested executor nodes - and that's what the commit's about to
   a good degree - aren't necessarily guaranteed to actually evaluate
   expressions. There's no guarantee there's any expressions (you could
   just nest joins without conditions), and a bunch of nodes like
   hashjoins invoke functions themselves.


> and there was also a check in the recursive ExecInitExpr routine.

Which still is there.


> While it doesn't seem to be documented anywhere, I believe that we'd
> argued that ExecProcNode and friends didn't need their own stack depth
> checks because any nontrivial node would surely do expression evaluation
> which would contain a check.

Yea, I don't buy that at all.


> I agree that adding a check to ExecInitNode() is really necessary,
> but I'm not convinced that it's a sufficient substitute for checking
> in ExecProcNode().  The two flaws I see in that argument are
> 
> (a) you've provided no hard evidence backing up the argument that node
> initialization will take strictly more stack space than node execution;
> as far as I can see that's just wishful thinking.

I think it's pretty likely to be roughly (within slop) the case in
realistic scenarios, but I do feel fairly uncomfortable about the
assumption. That's why I really want to do something like what I'm
proposing in the second patch. I just don't think we can realistically
add the check to the back branches, given that it's quite measurable
performancewise.


> (b) there's also an implicit assumption that ExecutorRun is called from
> a point not significantly more deeply nested than the corresponding
> call to ExecutorStart.  This seems also to depend on nothing much better
> than wishful thinking.  Certainly, many ExecutorRun calls are right next
> to ExecutorStart, but several of them aren't; the portal code and
> SQL-function code are both scary in this respect.

:/


> > The latter obviously isn't ready, but might make clearer what I'm
> > thinking about. If we were to go this route we'd have to probably move
> > the callback assignment into the ExecInit* routines, and possibly
> > replace the switch in ExecInitNode() with another callback, assigned in
> > make_*, and implement callbacks for ExecShutdown, MultiExecProcNode etc.
> 
> While I'm uncomfortable with making such a change post-beta2, I'm one
> heck of a lot *more* uncomfortable with shipping v10 with no stack depth
> checks in the executor mainline.  Fleshing this out and putting it in
> v10 might be an acceptable compromise.

Ok, I'll flesh out the patch till Thursday.  But I do think we're going
to have to do something about the back branches, too.

Greetings,

Andres Freund




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Tom Lane
Andres Freund  writes:
> Attached is a trivial patch implementing 1) and a proof-of-concept hack
> for 2).

The comment you made previously, as well as the proposed commit message
for 0001, suggest that you've forgotten that pre-v10 execQual.c had
several check_stack_depth() calls.  Per its header comment,

 *During expression evaluation, we check_stack_depth only in
 *ExecMakeFunctionResult (and substitute routines) rather than at every
 *single node.  This is a compromise that trades off precision of the
 *stack limit setting to gain speed.

and there was also a check in the recursive ExecInitExpr routine.

While it doesn't seem to be documented anywhere, I believe that we'd
argued that ExecProcNode and friends didn't need their own stack depth
checks because any nontrivial node would surely do expression evaluation
which would contain a check.

So the basic issue here is that (1) expression eval, per se, no longer
has any check and (2) it's not clear that we can rely on expression
compilation to substitute for that, since at least in principle
recompilation could be skipped during recursive use of a plan node.

I agree that adding a check to ExecInitNode() is really necessary,
but I'm not convinced that it's a sufficient substitute for checking
in ExecProcNode().  The two flaws I see in that argument are

(a) you've provided no hard evidence backing up the argument that node
initialization will take strictly more stack space than node execution;
as far as I can see that's just wishful thinking.

(b) there's also an implicit assumption that ExecutorRun is called from
a point not significantly more deeply nested than the corresponding
call to ExecutorStart.  This seems also to depend on nothing much better
than wishful thinking.  Certainly, many ExecutorRun calls are right next
to ExecutorStart, but several of them aren't; the portal code and
SQL-function code are both scary in this respect.

> The latter obviously isn't ready, but might make clearer what I'm
> thinking about. If we were to go this route we'd have to probably move
> the callback assignment into the ExecInit* routines, and possibly
> replace the switch in ExecInitNode() with another callback, assigned in
> make_*, and implement callbacks for ExecShutdown, MultiExecProcNode etc.

While I'm uncomfortable with making such a change post-beta2, I'm one
heck of a lot *more* uncomfortable with shipping v10 with no stack depth
checks in the executor mainline.  Fleshing this out and putting it in
v10 might be an acceptable compromise.

regards, tom lane




Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Julien Rouhaud
On 18/07/2017 14:04, Andres Freund wrote:
> On 2017-07-17 23:04:43 +0200, Julien Rouhaud wrote:
>> Is it v11 material or is there any chance to make it in v10?
> 
> I think it'd make sense to apply the first to v10 (and earlier), and the
> second to v11.  I think this isn't a terribly risky patch, but it's
> still a relatively large change for this point in the development
> cycle.  I'm willing to reconsider, but that's my default.

Agreed.

-- 
Julien Rouhaud





Re: [HACKERS] More race conditions in logical replication

2017-07-18 Thread Alvaro Herrera
Alvaro Herrera wrote:
> After going over this a bunch more times, I found other problems.  For
> example, I noticed that if I create a temporary slot in one session,
> then another session would rightly go to sleep if it tries to drop that
> slot.  But the slot goes away when the first session disconnects, so I'd
> expect the sleeping session to get a signal and wake up, but that
> doesn't happen.
> 
> I'm going to continue to look at this and report back Tuesday 18th.

I got tangled up in a really tough problem this week, so I won't have
time to work on this for a couple of days.  I can next update tomorrow
19:00 CLT (probably not with a final fix yet, though).

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail
On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

>  separate operators for int4[] @>> int8, int4[] @>> int4, int4[] @>> int2,
> int4[] @>> numeric.
>

My only comment on the separate operators is their high maintenance: for any
new datatype introduced, a corresponding operator would have to be created.


Re: [HACKERS] JSONB - JSONB operator feature request

2017-07-18 Thread David Fetter
On Tue, Jul 18, 2017 at 01:36:32PM +0200, david.tu...@linuxbox.cz wrote:
> Hi,
> 
> some users and I have used hstore - hstore, for example for storing only
> changed rows in a trigger like:
> 
> hstore(NEW) - hstore(OLD)
> 
> There isn't the same operator/function for JSON/JSONB. We can only remove
> keys from JSONB, but not equal key-value pairs. Is there any chance to have
> the same feature for JSON/JSONB in postgres core?

Here's one slightly modified from 
http://coussej.github.io/2016/05/24/A-Minus-Operator-For-PostgreSQLs-JSONB/

CREATE OR REPLACE FUNCTION jsonb_minus ( arg1 jsonb, arg2 jsonb )
RETURNS jsonb
LANGUAGE sql
AS $$
SELECT
    COALESCE(json_object_agg(
        key,
        CASE
            -- if the value is an object and the value of the second argument is
            -- not null, we do a recursion
            WHEN jsonb_typeof(value) = 'object' AND arg2 -> key IS NOT NULL
                THEN jsonb_minus(value, arg2 -> key)
            -- for all the other types, we just return the value
            ELSE value
        END
    ), '{}')::jsonb
FROM
    jsonb_each(arg1)
WHERE
    arg1 -> key IS DISTINCT FROM arg2 -> key
$$;

CREATE OPERATOR - (
PROCEDURE = jsonb_minus,
LEFTARG   = jsonb,
RIGHTARG  = jsonb
);

I suspect that there's a faster way to do the jsonb_minus function
internally.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Mark Rofail
On Tue, 18 Jul 2017 at 7:43 pm, Alexander Korotkov 
wrote:

> On Tue, Jul 18, 2017 at 2:24 AM, Mark Rofail wrote:
>
>> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera <
>> alvhe...@2ndquadrant.com> wrote:
>>>
>>> We have one opclass for each type combination -- int4 to int2, int4 to
>>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>>> the opclasses.
>>
>>
>>  I tried this approach by manually declaring the operator multiple of
>> times in pg_amop.h (src/include/catalog/pg_amop.h)
>>
>> so instead of the polymorphic declaration
>> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
>> anyelem */
>>
>> multiple declarations were used, for example for int4[] :
>> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
>> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
>> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
>> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric
>> */
>>
>> However, make check produced:
>> could not create unique index "pg_amop_opr_fam_index"
>> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>>
>> Am I implementing this the wrong way or do we need to look for another
>> approach?
>>
>
> The problem is that you need to have not only opclass entries for the
> operators, but also operators themselves.  I.e. separate operators for
> int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
> tried to add multiple pg_amop rows for single operator and consequently get
> unique index violation.
>
> Alvaro, do you think we need to define all these operators?  I'm not
> sure.  Even if we need it, I think
> --
> Alexander Korotkov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>


Re: [HACKERS] GSoC 2017: Foreign Key Arrays

2017-07-18 Thread Alexander Korotkov
On Tue, Jul 18, 2017 at 2:24 AM, Mark Rofail  wrote:

> On Wed, Jul 12, 2017 at 12:53 AM, Alvaro Herrera  > wrote:
>>
>> We have one opclass for each type combination -- int4 to int2, int4 to
>> int4, int4 to int8, etc.  You just need to add the new strategy to all
>> the opclasses.
>
>
>  I tried this approach by manually declaring the operator multiple of
> times in pg_amop.h (src/include/catalog/pg_amop.h)
>
> so instead of the polymorphic declaration
> DATA(insert ( 2745   2277 2283 5 s 6108 2742 0 )); /* anyarray @>>
> anyelem */
>
> multiple declarations were used, for example for int4[] :
> DATA(insert ( 2745   1007 20 5 s 6108 2742 0 )); /* int4[] @>> int8 */
> DATA(insert ( 2745   1007 23 5 s 6108 2742 0 )); /* int4[] @>> int4 */
> DATA(insert ( 2745   1007 21 5 s 6108 2742 0 )); /* int4[] @>> int2 */
> DATA(insert ( 2745   1007 1700 5 s 6108 2742 0 ));/* int4[] @>> numeric */
>
> However, make check produced:
> could not create unique index "pg_amop_opr_fam_index"
> Key (amopopr, amoppurpose, amopfamily)=(6108, s, 2745) is duplicated.
>
> Am I implementing this the wrong way or do we need to look for another
> approach?
>

The problem is that you need to have not only opclass entries for the
operators, but also operators themselves.  I.e. separate operators for
int4[] @>> int8, int4[] @>> int4, int4[] @>> int2, int4[] @>> numeric.  You
tried to add multiple pg_amop rows for single operator and consequently get
unique index violation.

Alvaro, do you think we need to define all these operators?  I'm not sure.
Even if we need it, I think we shouldn't do this during this GSoC.  What
particular shortcomings do you see in explicit cast in RI triggers queries?

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] Fix performance degradation of contended LWLock on NUMA

2017-07-18 Thread Sokolov Yura

On 2017-06-05 16:22, Sokolov Yura wrote:

Good day, everyone.

This patch improves performance of contended LWLock.
Patch makes lock acquiring in a single CAS loop:
1. LWLock->state is read, and ability for lock acquiring is detected.
   If there is a possibility to take the lock, CAS is tried.
   If the CAS is successful, the lock is acquired (same as the original version).
2. But if the lock is currently held by another backend, we check the ability
   to take the WaitList lock. If the wait list lock is not held by anyone, a CAS
   is performed to take the WaitList lock and set LW_FLAG_HAS_WAITERS at once.
   If that CAS is successful, then the LWLock was still held at the moment the
   wait list lock was taken - this proves the correctness of the new algorithm.
   The Proc is then queued to the wait list.
3. Otherwise spin_delay is performed, and the loop returns to step 1.
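Below is an illustrative sketch of the single-CAS-loop acquire just described (exclusive mode only, with simplifications); the function name is invented and this is not the patch's actual code.

/*
 * Hypothetical sketch of the algorithm above.  Returns true if the lock was
 * taken (step 1); false means the wait-list lock was taken with
 * LW_FLAG_HAS_WAITERS set, and the caller must queue itself (step 2).
 */
static bool
lwlock_attempt_or_queue_exclusive(LWLock *lock)
{
	uint32		old_state = pg_atomic_read_u32(&lock->state);
	SpinDelayStatus delay;
	bool		got_lock;

	init_local_spin_delay(&delay);
	for (;;)
	{
		uint32		desired;

		if ((old_state & LW_LOCK_MASK) == 0)
		{
			/* step 1: lock looks free, try to take it */
			desired = old_state | LW_VAL_EXCLUSIVE;
			got_lock = true;
		}
		else if ((old_state & LW_FLAG_LOCKED) == 0)
		{
			/* step 2: lock is busy, grab wait-list lock and mark waiters */
			desired = old_state | LW_FLAG_LOCKED | LW_FLAG_HAS_WAITERS;
			got_lock = false;
		}
		else
		{
			/* step 3: someone holds the wait-list lock, just spin */
			perform_spin_delay(&delay);
			old_state = pg_atomic_read_u32(&lock->state);
			continue;
		}

		/* on failure, old_state is reloaded with the current value */
		if (pg_atomic_compare_exchange_u32(&lock->state, &old_state, desired))
			break;
	}
	finish_spin_delay(&delay);

	return got_lock;
}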



I'm sending a rebased version with a couple of one-line tweaks
(less skip_wait_list on shared lock, and don't update spin-stat on
acquiring).


With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

From dc8d9489074973ba589adf9b0239e1467cbba44b Mon Sep 17 00:00:00 2001
From: Sokolov Yura 
Date: Mon, 29 May 2017 09:25:41 +
Subject: [PATCH 1/6] More correct use of spinlock inside LWLockWaitListLock.

SpinDelayStatus should be function-global to count iterations correctly
and produce more correct delays.

Also, if the spin didn't spin, do not count it in the spins_per_delay
correction. It wasn't necessary before because delayStatus was used only in
contended cases.
---
 src/backend/storage/lmgr/lwlock.c | 36 
 src/backend/storage/lmgr/s_lock.c |  3 +++
 2 files changed, 19 insertions(+), 20 deletions(-)

diff --git a/src/backend/storage/lmgr/lwlock.c b/src/backend/storage/lmgr/lwlock.c
index 82a1cf5150..86966b804e 100644
--- a/src/backend/storage/lmgr/lwlock.c
+++ b/src/backend/storage/lmgr/lwlock.c
@@ -323,6 +323,15 @@ get_lwlock_stats_entry(LWLock *lock)
 	}
 	return lwstats;
 }
+
+static void
+add_lwlock_stats_spin_stat(LWLock *lock, SpinDelayStatus *delayStatus)
+{
+	lwlock_stats *lwstats;
+
+	lwstats = get_lwlock_stats_entry(lock);
+	lwstats->spin_delay_count += delayStatus->delays;
+}
 #endif			/* LWLOCK_STATS */
 
 
@@ -800,13 +809,9 @@ static void
 LWLockWaitListLock(LWLock *lock)
 {
 	uint32		old_state;
-#ifdef LWLOCK_STATS
-	lwlock_stats *lwstats;
-	uint32		delays = 0;
-
-	lwstats = get_lwlock_stats_entry(lock);
-#endif
+	SpinDelayStatus delayStatus;
 
+	init_local_spin_delay(&delayStatus);
 	while (true)
 	{
 		/* always try once to acquire lock directly */
@@ -815,20 +820,10 @@ LWLockWaitListLock(LWLock *lock)
 			break;/* got lock */
 
 		/* and then spin without atomic operations until lock is released */
+		while (old_state & LW_FLAG_LOCKED)
 		{
-			SpinDelayStatus delayStatus;
-
-			init_local_spin_delay(&delayStatus);
-
-			while (old_state & LW_FLAG_LOCKED)
-			{
-				perform_spin_delay(&delayStatus);
-				old_state = pg_atomic_read_u32(&lock->state);
-			}
-#ifdef LWLOCK_STATS
-			delays += delayStatus.delays;
-#endif
-			finish_spin_delay(&delayStatus);
+			perform_spin_delay(&delayStatus);
+			old_state = pg_atomic_read_u32(&lock->state);
 		}
 
 		/*
@@ -836,9 +831,10 @@ LWLockWaitListLock(LWLock *lock)
 		 * we're attempting to get it again.
 		 */
 	}
+	finish_spin_delay(&delayStatus);
 
 #ifdef LWLOCK_STATS
-	lwstats->spin_delay_count += delays;
+	add_lwlock_stats_spin_stat(lock, &delayStatus);
 #endif
 }
 
diff --git a/src/backend/storage/lmgr/s_lock.c b/src/backend/storage/lmgr/s_lock.c
index 40d8156331..f3e0fbc602 100644
--- a/src/backend/storage/lmgr/s_lock.c
+++ b/src/backend/storage/lmgr/s_lock.c
@@ -177,6 +177,9 @@ finish_spin_delay(SpinDelayStatus *status)
 	if (status->cur_delay == 0)
 	{
 		/* we never had to delay */
+		if (status->spins == 0)
+			/* but we didn't spin either, so ignore */
+			return;
 		if (spins_per_delay < MAX_SPINS_PER_DELAY)
 			spins_per_delay = Min(spins_per_delay + 100, MAX_SPINS_PER_DELAY);
 	}
-- 
2.11.0


From 067282cce98c77b9f1731cf7d302c37fcc70d59e Mon Sep 17 00:00:00 2001
From: Sokolov Yura 
Date: Tue, 30 May 2017 14:05:26 +0300
Subject: [PATCH 2/6] Perform atomic LWLock acquirement or putting into wait
 list.

Before this change, an lwlock was acquired in several steps:
- first try to acquire it lock-free (one CAS spin loop)
- if that was not successful, queue self into the wait list
  (two CAS spin loops (in fact, 3 loops, because of atomic_fetch_and_or))
- try again to acquire the lock (another CAS loop)
- if the second attempt was successful, dequeue ourselves from the wait queue
  (two CAS spin loops).

With this change, the lock, or the wait list lock, is acquired using a single
CAS loop:
- if it is likely that the desired lock mode can be acquired, we do a CAS for that
- otherwise we try a CAS to lock the wait list and set the necessary
  flag (LW_FLAG_HAS_WAITERS) at once.
  (LWLockWaitListUnlock still performs a second CAS loop for releasing
   the wait list lock).

Correctness provided by the fact we are trying to 

Re: [HACKERS] JSONB - JSONB operator feature request

2017-07-18 Thread David Fetter
On Tue, Jul 18, 2017 at 01:36:32PM +0200, david.tu...@linuxbox.cz wrote:
> Hi,
> 
> some users and I have used hstore - hstore, for example for storing only
> changed rows in a trigger like:
> 
> hstore(NEW) - hstore(OLD)
> 
> There isn't the same operator/function for JSON/JSONB. We can only remove
> keys from JSONB, but not equal key-value pairs. Is there any chance to have
> the same feature for JSON/JSONB in postgres core?

What would - mean precisely for JSON[B]?

For example, what would you expect

SELECT '{"a": 1, "b": {"c": 2}}'::JSONB - '{"b": 1, "b": {"c": 3}}'::JSONB

to yield?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




[HACKERS] GSoC 2017: weekly progress reports (week 7)

2017-07-18 Thread Shubham Barai
Project: Explicitly support predicate locks in index AMs besides b-tree

Hi,

During this week, I read the documentation and source code of the BRIN index
to understand its implementation, but later I found that it is unlikely that
page-level or tuple-level predicate locking can be implemented in the BRIN
index.
This week I also fixed a small issue and updated my previous patch for the
GIN index. Currently, I am working on the SP-GiST index.


Regards,
Shubham




Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Tom Lane
Andres Freund  writes:
> On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
>> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.

> Not sure if that really does that much to solve the concern.

Well, it reduces the amount of data churn that a statement shorter than
PGSTAT_STAT_INTERVAL could cause.

> Another,
> pretty half-baked, approach would be to add a procsignal triggering idle
> backends to send stats, and send that to all idle backends when querying
> stats. We could even publish the number of outstanding stats updates in
> PGXACT or such, without any locking, and send it only to those that have
> outstanding ones.

If somebody wanted to do the work, that'd be a viable answer IMO.  You'd
really want to not wake backends that have nothing more to send, but
I agree that it'd be possible to advertise that in shared memory.

regards, tom lane
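As a thought experiment, a hypothetical sketch of the "poke idle backends" idea discussed above; PROCSIG_REPORT_STAT and backend_has_pending_stats() are invented names, while SendProcSignal() and the PGPROC array are existing facilities.

/*
 * Hypothetical sketch only.  Walk the PGPROC array and signal backends that
 * advertise outstanding stats updates.  PROCSIG_REPORT_STAT would be a new
 * ProcSignalReason, and backend_has_pending_stats() a check of a counter
 * published in shared memory; neither exists today.
 */
static void
request_stats_from_idle_backends(void)
{
	int			i;

	for (i = 0; i < (int) ProcGlobal->allProcCount; i++)
	{
		PGPROC	   *proc = &ProcGlobal->allProcs[i];

		if (proc->pid == 0)
			continue;			/* unused slot */

		if (backend_has_pending_stats(proc))
			SendProcSignal(proc->pid, PROCSIG_REPORT_STAT, proc->backendId);
	}
}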




Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump

2017-07-18 Thread David Fetter
On Tue, Jul 18, 2017 at 08:38:25AM +0200, Michael Paquier wrote:
> On Tue, Jul 18, 2017 at 3:45 AM, David Fetter  wrote:
> > The one I run into frequently is in a proprietary fork, RDS Postgres.
> > It'll happily dump out COMMENT ON EXTENSION plpgsql IS ...
> > which is great as far as it goes, but errors out when you try to
> > reload it.
> >
> > While bending over backwards to support proprietary forks strikes me
> > as a terrible idea, I'd like to enable pg_dump to produce and consume
> > ToCs just as pg_restore does with its -l/-L options.  This would
> > provide the finest possible grain.
> 
> Let's add as well a similar switch to pg_dumpall that gets pushed down
> to all the created pg_dump commands. If this patch gets integrated
> as-is this is going to be asked. And tests would be welcome.

Excellent point about pg_dumpall.  I'll see what I can draft up in the
next day or two and report back.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter  XMPP: david(dot)fetter(at)gmail(dot)com

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Andres Freund
Hi,


On 2017-07-18 09:42:31 -0400, Tom Lane wrote:
> I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
> I don't think that value has been reconsidered since the code was written,
> circa turn of the century.  Maybe even make it configurable, though that
> could be overkill.

Not sure if that really does that much to solve the concern.  Another,
pretty half-baked, approach would be to add a procsignal triggering idle
backends to send stats, and send that to all idle backends when querying
stats. We could even publish the number of outstanding stats updates in
PGXACT or such, without any locking, and send it only to those that have
outstanding ones.

Greetings,

Andres Freund




Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Tom Lane
Andres Freund  writes:
> That seems like it'd add a good number of new wakeups, or at least
> scheduling of wakeups.

Yes, as it stands this will result in a huge increase in alarm-scheduling
kernel call traffic.  I understand the issue but I do not think this is
an acceptable path to a fix.

> Or we could do nothing - I actually think that's a viable option.

I tend to agree.  I'm not really sure that the presented problem is a
big deal: for it to be an issue, you have to assume that a DML operation
that takes less than PGSTAT_STAT_INTERVAL is capable of causing enough
table churn that it's a problem if autovacuum doesn't hear about that
churn promptly.

I wonder if a better answer wouldn't be to reduce PGSTAT_STAT_INTERVAL.
I don't think that value has been reconsidered since the code was written,
circa turn of the century.  Maybe even make it configurable, though that
could be overkill.

regards, tom lane




Re: [HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Andres Freund
Hi,

On 2017-07-18 21:49:27 +0900, Yugo Nagata wrote:
> After a DML is performed, the statistics are sent to pgstat by
> pgstat_report_stat().
> However, the minimum time between stats file updates is PGSTAT_STAT_INTERVAL,
> so if a few DMLs are performed within a short interval, some statistics may
> not be sent until the backend is shut down.
> 
> This is not a problem in usual cases, but if a session remains idle for a
> long time, for example when using connection pooling, the statistics for a
> huge change to a table are not sent for a long time, and as a result,
> starting autovacuum might be delayed.

Hm.


> An idea to resolve this is to call pgstat_report_stat() again with a delay
> after the last DML to make sure the statistics are sent. The attached patch
> implements this.

That seems like it'd add a good number of new wakeups, or at least
scheduling of wakeups.  Now a timer would be armed whenever a query is
run outside of a transaction - something happening quite regularly. And
it'd do so even if a) there are no outstanding stat reports b) there's
already a timer running.  Additionally it'd, unless I mis-skimmed,
trigger pgstat_report_stat() from within
CHECK_FOR_INTERRUPTS/ProcessInterrupts even when inside a transaction,
if that transaction is started after the enable_timeout_after().


I'm not entirely sure what a good solution would be.  One option would
be to have a regular wakeup setting a flag (considerably longer than
500ms, that's way too many additional wakeups), rearm the timer in the
handler, but only process the flag when outside of a transaction.

Or we could do nothing - I actually think that's a viable option.

- Andres




Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

2017-07-18 Thread K S, Sandhya (Nokia - IN/Bangalore)
Hi Craig,

While testing another scenario of continuous postgres server restarts, we
got many sh-QUIT cores, and along with them we got rm-QUIT cores. They
point to rm of the archive file, but we were not able to get the backtrace as
the stack is corrupted.

We got below info from gdb:
Core was generated by `rm ./Archive_00020118'.

And also we were able to get this info:
4518 12490  0.0  0.0  11484  1356 ?  Ss  10:59  0:00 postgres: archiver process   archiving 00020118.0028.backup
4518 12704  2.0  0.0   7672  2932 ?  S   11:00  0:00   \_ sh -c rm ./Archive_*; touch ./Archive_"00020118.0028.backup"; exit 0
4518 12707  0.0  0.0    344     4 ?  S   11:00  0:00       \_ rm ./Archive_00020118

In the Postgres configuration file, we have this information.
archive_command = 'rm ./Archive_*; touch ./Archive_"%f"; exit 0'

So the core was generated while executing this archive command.
You pointed out earlier that the issue might be happening during the archive
command, and all evidence for this crash points to the same command.
Are there any suggestions to recover from this situation, or on ways to debug
the issue?

Regards,
Sandhya

From: K S, Sandhya (Nokia - IN/Bangalore)
Sent: Wednesday, July 12, 2017 4:51 PM
To: 'Craig Ringer' 
Cc: pgsql-bugs; PostgreSQL Hackers; T, Rasna (Nokia - IN/Bangalore); Itnal, Prakash (Nokia - IN/Bangalore)
Subject: RE: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

Hi Craig,

Here is bt after installing all the missing debuginfo packages.

(gdb) bt
#0  0x00fff7682f18 in do_lookup_x (undef_name=undef_name@entry=0xfff75cece5 
"_Jv_RegisterClasses", new_hash=new_hash@entry=2681263574,
old_hash=old_hash@entry=0xa159b8, ref=0xfff75ceac8, 
result=result@entry=0xa159a0, scope=, i=1, 
version=version@entry=0x0,
flags=flags@entry=1, skip=skip@entry=0x0, type_class=type_class@entry=0, 
undef_map=undef_map@entry=0xfff76a9478) at dl-lookup.c:444
#1  0x00fff76839a0 in _dl_lookup_symbol_x (undef_name=0xfff75cece5 
"_Jv_RegisterClasses", undef_map=0xfff76a9478, ref=0xa15a90,
symbol_scope=0xfff76a9980, version=0x0, type_class=, 
flags=, skip_map=0x0) at dl-lookup.c:833
#2  0x00fff7685730 in elf_machine_got_rel (lazy=1, map=0xfff76a9478) at 
../sysdeps/mips/dl-machine.h:870
#3  elf_machine_runtime_setup (profile=, lazy=1, l=0xfff76a9478) 
at ../sysdeps/mips/dl-machine.h:916
#4  _dl_relocate_object (scope=0xfff76a9980, reloc_mode=, 
consider_profiling=0) at dl-reloc.c:259
#5  0x00fff767ba10 in dl_main (phdr=, 
phdr@entry=0x12040, phnum=, phnum@entry=8,
user_entry=user_entry@entry=0xa15cf0, auxv=) at 
rtld.c:2070
#6  0x00fff7692e3c in _dl_sysdep_start (start_argptr=, 
dl_main=0xfff7679a98 ) at ../elf/dl-sysdep.c:249
#7  0x00fff767d0d8 in _dl_start_final (arg=arg@entry=0xa16410, 
info=info@entry=0xa15d80) at rtld.c:307
#8  0x00fff767d3d8 in _dl_start (arg=0xa16410) at rtld.c:415
#9  0x00fff7679380 in __start () from /lib64/ld.so.1

Please see if this could help in analysing the issue.

Regards,
Sandhya

From: Craig Ringer [mailto:cr...@2ndquadrant.com]
Sent: Friday, July 07, 2017 1:53 PM
To: K S, Sandhya (Nokia - IN/Bangalore)
Cc: pgsql-bugs; PostgreSQL Hackers; T, Rasna (Nokia - IN/Bangalore); Itnal, Prakash (Nokia - IN/Bangalore)
Subject: Re: [HACKERS] Postgres process invoking exit resulting in sh-QUIT core

On 7 July 2017 at 15:41, K S, Sandhya (Nokia - IN/Bangalore) wrote:
Hi Craig,

The scenario is locking and unlocking the system 30 times. During this
scenario, 5 sh-QUIT cores are generated. GDB on the 5 cores points to
different locations.
I have attached the output for 2 such instances.


You seem to be missing debug symbols. Install appropriate debuginfo packages.


--
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


[HACKERS] JSONB - JSONB operator feature request

2017-07-18 Thread david . turon
Hi,

some users and I have used hstore - hstore, for example for storing only
changed rows in a trigger like:

hstore(NEW) - hstore(OLD)

There isn't the same operator/function for JSON/JSONB. We can only remove
keys from JSONB, but not equal key-value pairs. Is there any chance to have
the same feature for JSON/JSONB in postgres core?

Thanks!

David

-- 
-
Ing. David TUROŇ
LinuxBox.cz, s.r.o.
28. rijna 168, 709 01 Ostrava

tel.:+420 591 166 224
fax:+420 596 621 273
mobil:  +420 732 589 152
www.linuxbox.cz

mobil servis: +420 737 238 656
email servis: ser...@linuxbox.cz
-

[HACKERS] [PATCH] Make sure all statistics is sent after a few DML are performed

2017-07-18 Thread Yugo Nagata
Hi,

After a DML is performed, the statistics are sent to pgstat by
pgstat_report_stat().
However, the minimum time between stats file updates is PGSTAT_STAT_INTERVAL,
so if a few DMLs are performed within a short interval, some statistics may
not be sent until the backend is shut down.

This is not a problem in usual cases, but if a session remains idle for a
long time, for example when using connection pooling, the statistics for a
huge change to a table are not sent for a long time, and as a result,
starting autovacuum might be delayed.

An idea to resolve this is to call pgstat_report_stat() again with a delay
after the last DML to make sure the statistics are sent. The attached patch
implements this.

Any comments would be appreciated.

Regards,

-- 
Yugo Nagata 
diff --git a/src/backend/tcop/postgres.c b/src/backend/tcop/postgres.c
index b8d860e..928d479 100644
--- a/src/backend/tcop/postgres.c
+++ b/src/backend/tcop/postgres.c
@@ -3012,6 +3012,12 @@ ProcessInterrupts(void)
 
 	}
 
+	if (PgStatReportStatTimeoutPending)
+	{
+		pgstat_report_stat(false);
+		PgStatReportStatTimeoutPending = false;
+	}
+
 	if (ParallelMessagePending)
 		HandleParallelMessages();
 }
@@ -4010,6 +4016,14 @@ PostgresMain(int argc, char *argv[],
 ProcessCompletedNotifies();
 pgstat_report_stat(false);
 
+/* Call pgstat_report_stat() after PGSTAT_REPORT_STAT_DELAY
+ * again because if DMLs are performed with interval shorter
+ * than PGSTAT_STAT_INTERVAL then some statistics could not be
+ * sent until the backend is shutdown.
+ */
+enable_timeout_after(PGSTAT_REPORT_STAT_TIMEOUT,
+	 PGSTAT_REPORT_STAT_DELAY);
+
 set_ps_display("idle", false);
 pgstat_report_activity(STATE_IDLE, NULL);
 			}
diff --git a/src/backend/utils/init/globals.c b/src/backend/utils/init/globals.c
index 7c09498..8c29ebd 100644
--- a/src/backend/utils/init/globals.c
+++ b/src/backend/utils/init/globals.c
@@ -31,6 +31,7 @@ volatile bool QueryCancelPending = false;
 volatile bool ProcDiePending = false;
 volatile bool ClientConnectionLost = false;
 volatile bool IdleInTransactionSessionTimeoutPending = false;
+volatile bool PgStatReportStatTimeoutPending = false;
 volatile sig_atomic_t ConfigReloadPending = false;
 volatile uint32 InterruptHoldoffCount = 0;
 volatile uint32 QueryCancelHoldoffCount = 0;
diff --git a/src/backend/utils/init/postinit.c b/src/backend/utils/init/postinit.c
index eb6960d..1df30bc 100644
--- a/src/backend/utils/init/postinit.c
+++ b/src/backend/utils/init/postinit.c
@@ -71,6 +71,7 @@ static void ShutdownPostgres(int code, Datum arg);
 static void StatementTimeoutHandler(void);
 static void LockTimeoutHandler(void);
 static void IdleInTransactionSessionTimeoutHandler(void);
+static void PgStatReportStatTimeoutHandler(void);
 static bool ThereIsAtLeastOneRole(void);
 static void process_startup_options(Port *port, bool am_superuser);
 static void process_settings(Oid databaseid, Oid roleid);
@@ -599,6 +600,8 @@ InitPostgres(const char *in_dbname, Oid dboid, const char *username,
 		RegisterTimeout(LOCK_TIMEOUT, LockTimeoutHandler);
 		RegisterTimeout(IDLE_IN_TRANSACTION_SESSION_TIMEOUT,
 		IdleInTransactionSessionTimeoutHandler);
+		RegisterTimeout(PGSTAT_REPORT_STAT_TIMEOUT,
+		PgStatReportStatTimeoutHandler);
 	}
 
 	/*
@@ -1196,6 +1199,14 @@ IdleInTransactionSessionTimeoutHandler(void)
 	SetLatch(MyLatch);
 }
 
+static void
+PgStatReportStatTimeoutHandler(void)
+{
+	PgStatReportStatTimeoutPending = true;
+	InterruptPending = true;
+	SetLatch(MyLatch);
+}
+
 /*
  * Returns true if at least one role is defined in this database cluster.
  */
diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
index dad98de..4bc6f0f 100644
--- a/src/include/miscadmin.h
+++ b/src/include/miscadmin.h
@@ -81,6 +81,7 @@ extern PGDLLIMPORT volatile bool InterruptPending;
 extern PGDLLIMPORT volatile bool QueryCancelPending;
 extern PGDLLIMPORT volatile bool ProcDiePending;
 extern PGDLLIMPORT volatile bool IdleInTransactionSessionTimeoutPending;
+extern PGDLLIMPORT volatile bool PgStatReportStatTimeoutPending;
 extern PGDLLIMPORT volatile sig_atomic_t ConfigReloadPending;
 
 extern volatile bool ClientConnectionLost;
diff --git a/src/include/pgstat.h b/src/include/pgstat.h
index 6bffe63..a06703f 100644
--- a/src/include/pgstat.h
+++ b/src/include/pgstat.h
@@ -33,6 +33,11 @@
 /* Default directory to store temporary statistics data in */
 #define PG_STAT_TMP_DIR		"pg_stat_tmp"
 
+/* How long to wait before calling pgstat_report_stat() after
+ * the last DML is performed; in milliseconds.
+ */
+#define PGSTAT_REPORT_STAT_DELAY   500
+
 /* Values for track_functions GUC variable --- order is significant! */
 typedef enum TrackFunctionsLevel
 {
diff --git a/src/include/utils/timeout.h b/src/include/utils/timeout.h
index 5a2efc0..7eccd2d 100644
--- a/src/include/utils/timeout.h
+++ b/src/include/utils/timeout.h
@@ -31,6 +31,7 @@ typedef enum TimeoutId
 	STANDBY_TIMEOUT,

Re: [HACKERS] merge psql ef/ev sf/sv handling functions

2017-07-18 Thread Fabien COELHO


Hello Victor,


While reviewing Corey's \if patch, I complained that there was some
amount of copy-paste in "psql/command.c".

Here is an attempt at merging some functions which removes 160 lines of 
code.


I was looking through your patch. It seems good; the code of the functions was
very similar.


Indeed. I guess that it was initially a copy paste.

I have a question for you. What was the reason to replace "printfPQExpBuffer" 
by "resetPQExpBuffer" and "appendPQExpBufferStr"?


Because the "printf" version implies interpreting the format layer which 
does not add significant value compared to just appending the string.
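
For illustration, this boils down to roughly the following (a minimal sketch
with made-up variable names, not a quote from command.c):

    /* before: the string is pushed through the format-string machinery */
    printfPQExpBuffer(query_buf, "%s", functext);

    /* after: same net effect, without interpreting a format string */
    resetPQExpBuffer(query_buf);
    appendPQExpBufferStr(query_buf, functext);

Both forms leave query_buf holding a copy of functext; the second simply
avoids the printf-style processing.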


--
Fabien.


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] [PATCH] Add citext_pattern_ops to citext contrib module

2017-07-18 Thread Alexey Chernyshov

Hi all,

The attached patch introduces citext_pattern_ops for the citext extension
type, like text_pattern_ops for the text type. It adds the operators ~<~,
~<=~, ~>~, and ~>=~, combined into the citext_pattern_ops operator class.
These operators simply compare the underlying citext values as C strings
with the memcmp() function. This operator class isn’t supported by B-Tree
indexes yet, but it is a first step toward that.
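
To make that concrete, the comparison underneath those operators amounts to
something like this (a rough sketch with illustrative names; the actual
functions in the attached patch may differ):

    /*
     * Byte-wise comparison of two citext (varlena text) values, the same
     * idea text_pattern_ops uses for text.
     */
    static int
    citext_pattern_cmp_internal(text *left, text *right)
    {
        int         llen = VARSIZE_ANY_EXHDR(left);
        int         rlen = VARSIZE_ANY_EXHDR(right);
        int         result;

        result = memcmp(VARDATA_ANY(left), VARDATA_ANY(right),
                        Min(llen, rlen));
        if (result != 0)
            return result;

        /* equal prefixes: the shorter value sorts first */
        return llen - rlen;
    }

The ~<~, ~<=~, ~>~ and ~>=~ operators would then just test the sign of that
result.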


The patch includes regression tests and applies on top of the latest commit
(c85ec643ff2586e2d144374f51f93bfa215088a2).


The problem of citext support for the LIKE operator with a B-Tree index was
mentioned in [1]. Briefly, the planner doesn’t use a B-Tree index for
queries such as text_col LIKE ‘abc%’. I’d like to investigate how to improve
that and make another patch. I think the starting point is the
match_special_index_operator() function, which doesn’t support custom
types. I would appreciate hearing your opinion on this.


1. https://www.postgresql.org/message-id/3924.1480351187%40sss.pgh.pa.us

--
Alexey Chernyshov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

>From 6ff180548209fb3abb66d70686df728b307635e3 Mon Sep 17 00:00:00 2001
From: Alexey Chernyshov 
Date: Fri, 23 Jun 2017 14:40:04 +0300
Subject: [PATCH 1/3] add citext opclass citext_pattern_ops

---
 contrib/citext/Makefile  |   2 +-
 contrib/citext/citext--1.4--1.5.sql  |   1 +
 contrib/citext/citext--1.4.sql   | 501 --
 contrib/citext/citext--1.5.sql   | 575 +++
 contrib/citext/citext.c  | 120 
 contrib/citext/citext.control|   2 +-
 contrib/citext/expected/citext.out   | 334 
 contrib/citext/expected/citext_1.out | 334 
 contrib/citext/sql/citext.sql|  72 +
 9 files changed, 1438 insertions(+), 503 deletions(-)
 create mode 100644 contrib/citext/citext--1.4--1.5.sql
 delete mode 100644 contrib/citext/citext--1.4.sql
 create mode 100644 contrib/citext/citext--1.5.sql

diff --git a/contrib/citext/Makefile b/contrib/citext/Makefile
index 563cd22..8474e86 100644
--- a/contrib/citext/Makefile
+++ b/contrib/citext/Makefile
@@ -3,7 +3,7 @@
 MODULES = citext
 
 EXTENSION = citext
-DATA = citext--1.4.sql citext--1.3--1.4.sql \
+DATA = citext--1.5.sql citext--1.4--1.5.sql citext--1.3--1.4.sql \
 	citext--1.2--1.3.sql citext--1.1--1.2.sql \
 	citext--1.0--1.1.sql citext--unpackaged--1.0.sql
 PGFILEDESC = "citext - case-insensitive character string data type"
diff --git a/contrib/citext/citext--1.4--1.5.sql b/contrib/citext/citext--1.4--1.5.sql
new file mode 100644
index 000..4c8abc0
--- /dev/null
+++ b/contrib/citext/citext--1.4--1.5.sql
@@ -0,0 +1 @@
+/* contrib/citext/citext--1.4--1.5.sql */
diff --git a/contrib/citext/citext--1.4.sql b/contrib/citext/citext--1.4.sql
deleted file mode 100644
index 7b06198..000
--- a/contrib/citext/citext--1.4.sql
+++ /dev/null
@@ -1,501 +0,0 @@
-/* contrib/citext/citext--1.4.sql */
-
--- complain if script is sourced in psql, rather than via CREATE EXTENSION
-\echo Use "CREATE EXTENSION citext" to load this file. \quit
-
---
---  PostgreSQL code for CITEXT.
---
--- Most I/O functions, and a few others, piggyback on the "text" type
--- functions via the implicit cast to text.
---
-
---
--- Shell type to keep things a bit quieter.
---
-
-CREATE TYPE citext;
-
---
---  Input and output functions.
---
-CREATE FUNCTION citextin(cstring)
-RETURNS citext
-AS 'textin'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextout(citext)
-RETURNS cstring
-AS 'textout'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextrecv(internal)
-RETURNS citext
-AS 'textrecv'
-LANGUAGE internal STABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citextsend(citext)
-RETURNS bytea
-AS 'textsend'
-LANGUAGE internal STABLE STRICT PARALLEL SAFE;
-
---
---  The type itself.
---
-
-CREATE TYPE citext (
-INPUT  = citextin,
-OUTPUT = citextout,
-RECEIVE= citextrecv,
-SEND   = citextsend,
-INTERNALLENGTH = VARIABLE,
-STORAGE= extended,
--- make it a non-preferred member of string type category
-CATEGORY   = 'S',
-PREFERRED  = false,
-COLLATABLE = true
-);
-
---
--- Type casting functions for those situations where the I/O casts don't
--- automatically kick in.
---
-
-CREATE FUNCTION citext(bpchar)
-RETURNS citext
-AS 'rtrim1'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citext(boolean)
-RETURNS citext
-AS 'booltext'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
-CREATE FUNCTION citext(inet)
-RETURNS citext
-AS 'network_show'
-LANGUAGE internal IMMUTABLE STRICT PARALLEL SAFE;
-
---
---  Implicit and assignment type casts.
---
-
-CREATE CAST (citext AS text)WITHOUT FUNCTION AS IMPLICIT;
-CREATE CAST (citext AS varchar) WITHOUT FUNCTION AS IMPLICIT;
-CREATE CAST (citext AS bpchar)  WITHOUT 

Re: [HACKERS] merge psql ef/ev sf/sv handling functions

2017-07-18 Thread Victor Drobny

On 2017-03-31 21:04, Fabien COELHO wrote:

Hello,

While reviewing Corey's \if patch, I complained that there was some
amount of copy-paste in "psql/command.c".

Here is an attempt at merging some functions which removes 160 lines of 
code.


Hello,

I was looking through your patch. It seems good; the code of the functions
was very similar.
I have a question for you. What was the reason to replace 
"printfPQExpBuffer" by "resetPQExpBuffer" and "appendPQExpBufferStr"?


Thank you for attention!

--
--
Victor Drobny
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] segfault in HEAD when too many nested functions call

2017-07-18 Thread Andres Freund
On 2017-07-17 23:04:43 +0200, Julien Rouhaud wrote:
> On 17/07/2017 16:57, Andres Freund wrote:
> > The latter obviously isn't ready, but might make clearer what I'm
> > thinking about.
> 
> It did for me, and FWIW I like this approach.

Cool.


> > If we were to go this route we'd have to probably move
> > the callback assignment into the ExecInit* routines, and possibly
> > replace the switch in ExecInitNode() with another callback, assigned in
> > make_*, and implement callbacks for ExecShutdown, MultiExecProcNode etc.
> > 
> > This results in a good speedup in tpc-h btw:
> > master total min: 46434.897 cb min: 44778.228 [diff -3.70]
> 
> Is it v11 material or is there any chance to make it in v10?

I think it'd make sense to apply the first to v10 (and earlier), and the
second to v11.  I think this isn't a terribly risky patch, but it's
still a relatively large change for this point in the development
cycle.  I'm willing to reconsider, but that's my default.
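
(To sketch the shape of the callback approach described above -- names here
are illustrative only, not the actual patch: each PlanState would carry a
function pointer assigned once at ExecInit time, so the per-tuple dispatch
becomes an indirect call instead of a switch on the node tag.)

    typedef TupleTableSlot *(*ExecProcNodeCB) (PlanState *node);

    /* hypothetical field, set once in each ExecInit* routine, e.g.
     *     planstate->ExecProcNodeCB = <that node type's exec function>;
     */

    static inline TupleTableSlot *
    ExecProcNodeCall(PlanState *node)
    {
        return node->ExecProcNodeCB(node);
    }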

- Andres


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] bug in locking an update tuple chain

2017-07-18 Thread Amit Kapila
On Sat, Jul 15, 2017 at 2:30 AM, Alvaro Herrera
 wrote:
> A customer of ours reported a problem in 9.3.14 while inserting tuples
> in a table with a foreign key, with many concurrent transactions doing
> the same: after a few insertions worked successfully, a later one would
> return failure indicating that the primary key value was not present in
> the referenced table.  It worked fine for them on 9.3.4.
>
> After some research, we determined that the problem disappeared if
> this commit was reverted:
>
> Author: Alvaro Herrera 
> Branch: master Release: REL9_6_BR [533e9c6b0] 2016-07-15 14:17:20 -0400
> Branch: REL9_5_STABLE Release: REL9_5_4 [649dd1b58] 2016-07-15 14:17:20 -0400
> Branch: REL9_4_STABLE Release: REL9_4_9 [166873dd0] 2016-07-15 14:17:20 -0400
> Branch: REL9_3_STABLE Release: REL9_3_14 [6c243f90a] 2016-07-15 14:17:20 -0400
>
> Avoid serializability errors when locking a tuple with a committed update
>
> I spent some time writing an isolationtester spec to reproduce the
> problem.  It turned out that this required six concurrent sessions in
> order for the problem to show up at all, but once I had that, figuring
> out what was going on was simple: a transaction wants to lock the
> updated version of some tuple, and it does so; and some other
> transaction is also locking the same tuple concurrently in a compatible
> way.  So both are okay to proceed concurrently.  The problem is that if
> one of them detects that anything changed in the process of doing this
> (such as the other session updating the multixact to include itself,
> both having compatible lock modes), it loops back to ensure xmax/
> infomask are still sane; but heap_lock_updated_tuple_rec is not prepared
> to deal with the situation of "the current transaction has the lock
> already", so it returns a failure and the tuple is returned as "not
> visible" causing the described problem.
>

Your fix seems logical to me, though I have not tested it yet.
However, I wonder why heap_lock_tuple needs to restart from the
beginning of the update chain in this case?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Proposal for CSN based snapshots

2017-07-18 Thread Michael Paquier
On Wed, Jun 21, 2017 at 1:48 PM, Alexander Kuzmenkov
 wrote:
> Glad to see you working on this! I've been studying this topic too. Attached
> you can find a recently rebased version of Heikki's v4 patch.
> I also fixed a bug that appeared on report-receipts isolation test:
> XidIsConcurrent should report that a transaction is concurrent to ours when
> its csn equals our snapshotcsn.

I am finally looking at rebasing things properly now, and getting more
familiar with the code in order to come up with a patch for the upcoming
commitfest (I can see some diff simplifications as well, to impact existing
applications less), and this v5 is very suspicious. You are adding code
that should actually be removed. One such block can be found at the
beginning of the patch:

+txid_snapshot_xip
+   
+
+   
That's not actually a problem as I am reusing an older v4 from Heikki
now for the future, but I wanted to let you know about that first.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


[HACKERS] Increase Vacuum ring buffer.

2017-07-18 Thread Sokolov Yura

Good day, everyone.

I investigated autovacuum performance, and found that it suffers a lot
from its small ring buffer. It suffers in the same way the bulk writer
suffered before Tom Lane's commit 6382448cf96:


Tom Lane   2009-06-23 00:04:28
For bulk write operations (eg COPY IN), use a ring buffer of 16MB
instead of the 256KB limit originally enforced by a patch committed
2008-11-06. Per recent test results, the smaller size resulted in an
undesirable decrease in bulk data loading speed, due to COPY
processing frequently getting blocked for WAL flushing. This area
might need more tweaking later, but this setting seems to be good
enough for 8.4.


It is especially noticeable when the database doesn't fit in shared_buffers
but does fit into the OS file cache, and the data is intensively updated
(i.e. OLTP load). In this scenario, autovacuum with the current 256kB
(32-page) ring buffer lasts 3-10 times longer than with the ring buffer
increased to 16MB.

I've tested with a synthetic load, with 256MB or 1GB shared buffers and
2-6GB tables (including indices), with different load factors and
with/without secondary indices on the updated columns. Tables were randomly
updated with HOT and non-HOT updates. Times before/after the buffer increase
(depending on load) were 7500sec/1200sec and 75000sec/11500sec. So the
benefit is consistently reproducible.

I didn't test cases where the database doesn't fit in the OS file cache.
Probably in that case the benefit will be smaller, because more time will be
spent on disk reads.
I didn't test OLAP load intensively. I've seen once that the increased
buffer slightly slows the scan of an almost-immutable huge table, perhaps
because of decreased CPU cache locality. But given that such a scan is
already fast, and autovacuum of an "almost immutable table" runs rarely,
I don't think it is very important.

Initially I wanted to make the BAS_BULKWRITE and BAS_VACUUM ring sizes
configurable, but after testing I don't see much gain from increasing the
ring buffer above 16MB. So I propose just a one-line change.

With regards,
--
Sokolov Yura aka funny_falcon
Postgres Professional: https://postgrespro.ru
The Russian Postgres Company

From 4ee9e83e915e42de57061b29b1b7adfeec89f531 Mon Sep 17 00:00:00 2001
From: Sokolov Yura 
Date: Tue, 18 Jul 2017 12:33:33 +0300
Subject: [PATCH] Set vacuum ring buffer 16MB

Vacuum suffers a lot from its small ring buffer, in the same way the bulk
writer suffered before Tom Lane's fix at 6382448cf96:
> the smaller size resulted in an undesirable decrease in bulk data
> loading speed, due to COPY processing frequently getting blocked
> for WAL flushing.
---
 src/backend/storage/buffer/freelist.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/backend/storage/buffer/freelist.c b/src/backend/storage/buffer/freelist.c
index 9d8ae6ae8e..4f12ff9f77 100644
--- a/src/backend/storage/buffer/freelist.c
+++ b/src/backend/storage/buffer/freelist.c
@@ -546,7 +546,7 @@ GetAccessStrategy(BufferAccessStrategyType btype)
 			ring_size = 16 * 1024 * 1024 / BLCKSZ;
 			break;
 		case BAS_VACUUM:
-			ring_size = 256 * 1024 / BLCKSZ;
+			ring_size = 16 * 1024 * 1024 / BLCKSZ;
 			break;
 
 		default:
-- 
2.11.0


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] PgFDW connection invalidation by ALTER SERVER/ALTER USER MAPPING

2017-07-18 Thread Kyotaro HORIGUCHI
Thank you for the comments.

At Mon, 17 Jul 2017 16:09:04 -0400, Tom Lane  wrote in 
<9897.1500322...@sss.pgh.pa.us>
> Kyotaro HORIGUCHI  writes:
> > This is the revased and revised version of the previous patch.
> 
> A few more comments:
> 
> * Per the spec for CacheRegisterSyscacheCallback, a zero hash value means
> to flush all associated state.  This isn't handling that case correctly.

Right, fixed.

> Even when you are given a specific hash value, I think exiting after
> finding one match is incorrect: there could be multiple connections
> to the same server with different user mappings, or vice versa.

Sure. I was confused in assuming that a key hash value pins down a single
entry in "the connection cache". Thank you for pointing that out.

> * I'm not really sure that it's worth the trouble to pay attention
> to the hash value at all.  Very few other syscache callbacks do,
> and the pg_server/pg_user_mapping catalogs don't seem likely to
> have higher than average traffic.

Agreed on those points. But there is another consideration: reconnection
is far more expensive than re-looking-up a system catalog, or maybe
even than replanning. For now I chose to avoid the possibility of
causing a massive number of simultaneous reconnections.
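
(To illustrate the idea -- a rough sketch only, using the entry fields the
attached patch adds to ConnCacheEntry; the patch itself may differ in
detail: the callback just marks matching entries as invalidated, and the
actual disconnect/reconnect happens lazily at the next GetConnection().)

    static void
    pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue)
    {
        HASH_SEQ_STATUS scan;
        ConnCacheEntry *entry;

        hash_seq_init(&scan, ConnectionHash);
        while ((entry = (ConnCacheEntry *) hash_seq_search(&scan)) != NULL)
        {
            if (entry->conn == NULL)
                continue;

            /* hashvalue == 0 means a whole-cache reset: invalidate all */
            if (hashvalue == 0 ||
                (cacheid == FOREIGNSERVEROID &&
                 entry->server_hashvalue == hashvalue) ||
                (cacheid == USERMAPPINGOID &&
                 entry->mapping_hashvalue == hashvalue))
                entry->invalidated = true;
        }
    }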

> * Personally I'd be inclined to register the callbacks at the same
> time the hashtables are created, which is a pattern we use elsewhere
> ... in fact, postgres_fdw itself does it that way for the transaction
> related callbacks, so it seems a bit weird that you are going in a
> different direction for these callbacks.  That way avoids the need to
> depend on a _PG_init function and means that the callbacks don't have to
> worry about the hashtables not being valid.

Sure, that seems more reasonable than it is now. I changed the way of
registering the callbacks in the attached patch.

>  Also, it seems a bit
> pointless to have separate layers of postgresMappingSysCallback and
> InvalidateConnectionForMapping functions.

It used to be one function, but it seemed a bit weird that the
function was called from two places (two catalogs) and then branched
according to the caller. I don't have a firm opinion on this, so I
changed it.

As a result, the changes in postgres_fdw.c have disappeared.

> * How about some regression test cases?  You couldn't really exercise
> cross-session invalidation easily, but I don't think you need to.

Ha Ha. You got me. I will add some test cases for this in the
next version. Thanks.


Ashutosh, I'll register this in the next CF after providing a
regression test, thanks.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center
*** a/contrib/postgres_fdw/connection.c
--- b/contrib/postgres_fdw/connection.c
***
*** 22,27 
--- 22,28 
  #include "pgstat.h"
  #include "storage/latch.h"
  #include "utils/hsearch.h"
+ #include "utils/inval.h"
  #include "utils/memutils.h"
  #include "utils/syscache.h"
  
***
*** 53,58  typedef struct ConnCacheEntry
--- 54,62 
  	bool		have_prep_stmt; /* have we prepared any stmts in this xact? */
  	bool		have_error;		/* have any subxacts aborted in this xact? */
  	bool		changing_xact_state;	/* xact state change in process */
+ 	bool		invalidated;	/* true if reconnect is required */
+ 	uint32		server_hashvalue;	/* hash value of foreign server oid */
+ 	uint32		mapping_hashvalue;  /* hash value of user mapping oid */
  } ConnCacheEntry;
  
  /*
***
*** 69,74  static bool xact_got_connection = false;
--- 73,79 
  
  /* prototypes of private functions */
  static PGconn *connect_pg_server(ForeignServer *server, UserMapping *user);
+ static void disconnect_pg_server(ConnCacheEntry *entry);
  static void check_conn_params(const char **keywords, const char **values);
  static void configure_remote_session(PGconn *conn);
  static void do_sql_command(PGconn *conn, const char *sql);
***
*** 78,83  static void pgfdw_subxact_callback(SubXactEvent event,
--- 83,89 
  	   SubTransactionId mySubid,
  	   SubTransactionId parentSubid,
  	   void *arg);
+ static void pgfdw_inval_callback(Datum arg, int cacheid, uint32 hashvalue);
  static void pgfdw_reject_incomplete_xact_state_change(ConnCacheEntry *entry);
  static bool pgfdw_cancel_query(PGconn *conn);
  static bool pgfdw_exec_cleanup_query(PGconn *conn, const char *query,
***
*** 130,135  GetConnection(UserMapping *user, bool will_prep_stmt)
--- 136,145 
  		 */
  		RegisterXactCallback(pgfdw_xact_callback, NULL);
  		RegisterSubXactCallback(pgfdw_subxact_callback, NULL);
+ 		CacheRegisterSyscacheCallback(FOREIGNSERVEROID,
+ 	  pgfdw_inval_callback, (Datum)0);
+ 		CacheRegisterSyscacheCallback(USERMAPPINGOID,
+ 	  pgfdw_inval_callback, (Datum)0);
  	}
  
  	/* Set flag that we did GetConnection during the current transaction */
***
*** 144,160  GetConnection(UserMapping *user, bool will_prep_stmt)
  	entry = hash_search(ConnectionHash, , 

[HACKERS] Re: Oddity in error handling of constraint violation in ExecConstraints for partitioned tables

2017-07-18 Thread Etsuro Fujita

On 2017/07/18 11:03, Robert Haas wrote:

On Mon, Jul 10, 2017 at 5:44 PM, Robert Haas  wrote:

The posted patches look OK to me.  Barring developments, I will commit
them on 2017-07-17, or send another update by then.


Committed them.


Thank you!

Best regards,
Etsuro Fujita



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] asynchronous execution

2017-07-18 Thread Kyotaro HORIGUCHI
Hello,

At Tue, 11 Jul 2017 10:28:51 +0200, Antonin Houska  wrote in 
<6448.1499761731@localhost>
> Kyotaro HORIGUCHI  wrote:
> > Effectively it is a waiting-queue followed by a
> > completed-list. The point of the compaction is keeping the order
> > of waiting or not-yet-completed requests, which is crucial to
> > avoid kind-a precedence inversion. We cannot keep the order by
> > using bitmapset in such way.
> 
> > The current code waits all waiters at once and processes all
> > fired events at once. The order in the waiting-queue is
> > inessential in the case. On the other hand I suppoese waiting on
> > several-tens to near-hundred remote hosts is in a realistic
> > target range. Keeping the order could be crucial if we process a
> > part of the queue at once in the case.
> > 
> > Putting significance on the deviation of response times of
> > remotes, process-all-at-once is effective. In turn we should
> > consider the effectiveness of the lifecycle of the larger wait
> > event set.
> 
> ok, I missed the fact that the order of es_pending_async entries is
> important. I think this is worth adding a comment.

I'll put an upper limit on the number of waiters processed at
once, and then add a comment like that.

> Actually the reason I thought of simplification was that I noticed small
> inefficiency in the way you do the compaction. In particular, I think it's not
> always necessary to swap the tail and head entries. Would something like this
> make sense?

I'm not sure, but I suppose it is rare for all of the first several
elements in the array to be not-COMPLETE. In most cases the
first element gets a response first.

> 
>   /* If any node completed, compact the array. */
>   if (any_node_done)
>   {
...
>   for (tidx = 0; tidx < estate->es_num_pending_async; 
> ++tidx)
>   {
...
>   if (tail->state == ASYNCREQ_COMPLETE)
>   continue;
> 
>   /*
>* If the array starts with one or more 
> incomplete requests,
>* both head and tail point at the same item, 
> so there's no
>* point in swapping.
>*/
>   if (tidx > hidx)
>   {

This works to skip the first several elements when all of them
are ASYNCREQ_COMPLETE. I think it makes sense as long as it
doesn't harm the loop. The optimization is more effective when
pulled out of the loop, like this:

|  for (tidx = 0; tidx < estate->es_num_pending_async &&
|       estate->es_pending_async[tidx]->state == ASYNCREQ_COMPLETE; ++tidx);
|  for (; tidx < estate->es_num_pending_async; ++tidx)
...


> And besides that, I think it'd be more intuitive if the meaning of "head" and
> "tail" was reversed: if the array is iterated from lower to higher positions,
> then I'd consider head to be at higher position, not tail.

Yeah, but maybe the "head" is still confusing even if reversed
because it is still not a head of something.  It might be less
confusing by rewriting it in more verbose-but-straightforwad way.


|  int npending = 0;
|
|  /* Skip over not-completed items at the beginning */
|  while (npending < estate->es_num_pending_async &&
|         estate->es_pending_async[npending]->state != ASYNCREQ_COMPLETE)
|    npending++;
|
|  /* Scan over the rest for not-completed items */
|  for (i = npending + 1; i < estate->es_num_pending_async; ++i)
|  {
|    PendingAsyncRequest *tmp;
|    PendingAsyncRequest *curr = estate->es_pending_async[i];
|
|    if (curr->state == ASYNCREQ_COMPLETE)
|      continue;
|
|    /* Swap the not-completed item into the tail of the first chunk */
|    tmp = estate->es_pending_async[npending];
|    estate->es_pending_async[npending] = curr;
|    estate->es_pending_async[i] = tmp;
|    ++npending;
|  }


regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center



-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] pg_stop_backup(wait_for_archive := true) on standby server

2017-07-18 Thread Masahiko Sawada
On Tue, Jul 18, 2017 at 1:28 PM, Stephen Frost  wrote:
> Masahiko, Michael,
>
> * Masahiko Sawada (sawada.m...@gmail.com) wrote:
>> > This is beginning to shape.
>>
>> Sorry, I missed lots of typo in the last patch. All comments from you
>> are incorporated into the attached latest patch and I've checked it
>> whether there is other typos. Please review it.
>
> I've taken an initial look through the patch and it looks pretty
> reasonable.  I need to go over it in more detail and work through
> testing it myself next.
>
> I expect to commit this (with perhaps some minor tweaks primarily around
> punctuation/wording), barring any issues found, on Wednesday or Thursday
> of this week.

I understood. Thank you for looking at this!

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] Patch: Add --no-comments to skip COMMENTs with pg_dump

2017-07-18 Thread Michael Paquier
On Tue, Jul 18, 2017 at 3:45 AM, David Fetter  wrote:
> The one I run into frequently is in a proprietary fork, RDS Postgres.
> It'll happily dump out COMMENT ON EXTENSION plpgsql IS ...
> which is great as far as it goes, but errors out when you try to
> reload it.
>
> While bending over backwards to support proprietary forks strikes me
> as a terrible idea, I'd like to enable pg_dump to produce and consume
> ToCs just as pg_restore does with its -l/-L options.  This would
> provide the finest possible grain.

Let's also add a similar switch to pg_dumpall that gets pushed down
to all the pg_dump commands it creates. If this patch gets integrated
as-is, this is going to be asked for. And tests would be welcome.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] hash index on unlogged tables doesn't behave as expected

2017-07-18 Thread Michael Paquier
On Tue, Jul 18, 2017 at 4:18 AM, Amit Kapila  wrote:
> Thanks.  Do you have any suggestions for the back-branches?  As of now, it
> fails badly with the kind of error shown below:
>
> test=> SELECT * FROM t_u_hash;
> ERROR:  could not open file "base/16384/16392": No such file or directory
>
> It is explained in another thread [3], where it has been found that the
> reason for such an error is that hash indexes are not WAL-logged prior
> to 10.  Now, we can claim that we don't recommend hash indexes to be
> used prior to 10 in production, so such an error is okay even if no
> crash has happened in the system.

There are a couple of approaches:
1) Mark such indexes as invalid at recovery and log information
about the switch.
2) Error out at creation of hash indexes on unlogged tables.
3) Leave it as-is, because there is already a WARNING at creation.
I don't mind seeing 3), given the amount of work done lately to support
WAL on hash indexes.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers


Re: [HACKERS] New partitioning - some feedback

2017-07-18 Thread Vik Fearing
On 07/07/2017 02:02 AM, Mark Kirkwood wrote:
> I'd prefer *not* to see a table and its partitions all intermixed in the
> same display (especially with nothing indicating which are partitions) -
> as this will make for unwieldy long lists when tables have many
> partitions. Also it would be good if the 'main' partitioned table and
> its 'partitions' showed up as a different type in some way.

I've just read through this thread, and I'm wondering why we can't just
have something like  \set SHOW_PARTITIONS true , which would default
to false.
-- 
Vik Fearing  +33 6 46 75 15 36
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers