Re: PATCH: Configurable file mode mask
On 03/13/2018 01:50 PM, Stephen Frost wrote:
> Yes, that's true, but PG has never paid attention to POSIX ACLs or Linux
> capabilities. Changing it to do so is quite a bit beyond the scope...

I think we're largely in agreement here, as my aim was not to advocate that PG should work harder to understand the subtleties of every system it could be installed on, but rather that it should work less hard at pretending to understand them when it doesn't, and thus avoid obstructing the admin, who presumably does.

> I'll point out that PG does run on quite a few other OS's beyond Linux.

I'll second that, as I think it is making my argument. When I can supply three or four examples of semantic subtleties that undermine PG's assumptions under Linux alone, the picture when broadened to those quite-a-few other platforms certainly doesn't become simpler!

>> but then sooner or later it will still end up making assumptions
>> that aren't true under, say, SELinux, so there's another #ifdef,
>> and where does it end?
>
> I agree with this general concern. :)

That's probably where it became clear that I'm not advocating an add-#ifdefs-for-everything approach.

> PG will stat PGDATA and conclude that the system is saying that group
> access has been granted on PGDATA and will do the same for subsequent
> files created later on. ... The only issue that remains is that PG
> doesn't understand or work with POSIX ACLs or Linux capabilities,
> but that's not anything new.

What's new is that it is now pretending even more extravagantly to understand what it doesn't understand. Where it would previously draw an incorrect conclusion and refuse to start, which is annoying but not very difficult to work around if need be, it would now draw the same incorrect conclusion and then actively go about making the real world embody the incorrect conclusion, contrary to the admin's efforts.

>> umask inherited by the postmaster. A --permission-transparency option
>> would simply use open and mkdir in the traditional ways, allowing
>> the chosen umask, ACL defaults, SELinux contexts, etc., to do their
>> thing, and would avoid then stepping on the results with explicit
>> chmods (and of course skip the I-refuse-to-start-because-I-
>> misunderstand-your-setup check).
>> ...
>
> I'm a fan of this idea in general, but it's unclear how that
> --permission-transparency option would work in practice. Are you
> suggesting that it be a compile-time flag, which would certainly work
> but also then have to be debated among packagers as to the right setting
> and if there's any need to be concerned about users misunderstanding it,
> or a flag for each program,

I was thinking of a command-line option ...

> which hardly seems ideal as some invocations
> of programs might forget to specify that, leading to inconsistent
> permissions, or something else..?

... but I see how that gets complicated with the various other command-line utilities included.

> .. we'd have to build in complete
> support for POSIX ACLs and Linux capabilities if we go down a route

I'm wary of an argument that we can't do better except by building in complete support for POSIX ACLs, and capabilities (and NFSv4 ACLs, and SELinux, and AppArmor, and #ifdef and #ifdef and #ifdef).

It seems to me that, in most cases, the designers of these sorts of extensions to old traditional Unix behavior take great pains to design them such that they can still usefully function in the presence of programs that "don't pay attention to or understand or use" them, as long as those programs are in some sense well-behaved, and not going out of their way with active steps that insist on or impose permissions that aren't appropriate under the non-understood circumstances.

So my suggestion boils down to PG having at least an option, somehow, to be well-behaved in that sense.

-Chap
Re: TupleTableSlot abstraction
On 2018-03-13 15:18:34 -0300, Alvaro Herrera wrote:
> Andres Freund wrote:
> > Hi,
> >
> > I've recently been discussing with Robert how to abstract
> > TupleTableSlots in the light of upcoming developments around abstracting
> > storage away. Besides that aspect I think we also need to make them
> > more abstract to later deal with vectorized execution, but I'm more fuzzy
> > on the details there.
>
> Hi, is this patch proposed for pg11?

I wish, but I don't see it happening unless I get a time compression device from somewhere :(

Greetings,

Andres Freund
Re: JIT compiling with LLVM v11
On 2018-03-13 10:25:49 -0400, Peter Eisentraut wrote:
> On 3/12/18 17:04, Andres Freund wrote:
> > │ JIT:
> > │   Functions: 4
> > │   Inlining: false
> > │   Inlining Time: 0.000
> > │   Optimization: false
> > │   Optimization Time: 5.023
> > │   Emission Time: 34.987
>
> The time quantities need some units.
>
> > │ Execution time: 46.277 ms
>
> like this :)

Yea, I know. I was planning to start a thread about that. explain.c is littered with code like

    if (es->format == EXPLAIN_FORMAT_TEXT)
        appendStringInfo(es->str, "Planning time: %.3f ms\n",
                         1000.0 * plantime);
    else
        ExplainPropertyFloat("Planning Time", 1000.0 * plantime, 3, es);

which, to me, is bonkers. I think we should add a 'const char *unit' parameter to at least ExplainProperty{Float,Integer,Long}? Or a *Unit version of them doing so, allowing a bit more gradual change?

Greetings,

Andres Freund
Re: JIT compiling with LLVM v11
Hi,

On 2018-03-13 14:36:44 -0400, Robert Haas wrote:
> On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund wrote:
> > Currently a handful of explain outputs in the regression tests change
> > output when compiled with JITing. Therefore I'm thinking of adding
> > JITINFO or such option, which can be set to false for those tests?
>
> Can we spell that JIT or at least JIT_INFO?

The latter works, I don't have a strong opinion on that. For now I've just tied it to COSTS off.

> I realize that EXPLAIN (JIT OFF) may sound like it's intended to
> disable JIT itself

Yea, that's what I'm concerned about.

> , but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not
> disable the use of actual buffers, only the display of buffer-related
> information.

Hm.

Greetings,

Andres Freund
Re: INOUT parameters in procedures
2018-03-13 14:14 GMT+01:00 Peter Eisentraut <peter.eisentr...@2ndquadrant.com>:
> On 3/8/18 02:25, Pavel Stehule wrote:
> > It looks like some error in this concept. The rules for enabling
> > overloaded procedures should be modified, so this collision should not
> > happen.
> >
> > When I use a procedure from PL/pgSQL, it is clear, so I place variables
> > in the *OUT positions. But when I call a procedure from the top level,
> > then I'll pass fake parameters to get some result.
>
> What we'll probably want to do here is to make the OUT parameters part
> of the identity signature of procedures, unlike in functions. This
> should be a straightforward change, but it will require some legwork in
> many parts of the code.

yes

> > if (argmodes && (argmodes[i] == PROARGMODE_INOUT ||
> >     argmodes[i] == PROARGMODE_OUT))
> > + {
> > +     Param *param;
> >
> > Because PROARGMODE_OUT is disallowed, this check is a little bit
> > messy. Please, add some comment.
>
> Fixed.
>
> I discovered another issue, in LANGUAGE SQL procedures. Currently, if
> you make a CALL with an INOUT parameter in an SQL procedure, the output
> is thrown away (unless it's the last command). I would like to keep
> open the option of assigning the results by name, like we do in
> PL/pgSQL. So in this patch I have made a change to prohibit calling
> procedures with INOUT parameters in LANGUAGE SQL routines (see
> check_sql_fn_statements()). What do you think?

Disabling it is probably the best that is possible now. The variables in SQL are more named parameters than variables. It is not necessary to complicate it.

Regards

Pavel

> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] why not parallel seq scan for slow functions
On Wed, Feb 14, 2018 at 5:37 AM, Amit Kapila wrote:
> Your concern is valid, but doesn't the same problem exist in the other
> approach as well, because in that one too we can call
> adjust_paths_for_srfs after generating the gather path, which means that we
> might lose some opportunity to reduce the relative cost of parallel
> paths due to tlists having SRFs. Also, a similar problem can happen
> in create_order_paths for the cases as described in the example
> above.

You're right. I think cleaning all of this up for v11 is too much to consider, but please tell me your opinion of the attached patch set. Here, instead of ripping the problematic logic out of apply_projection_to_path, what I've done is just removed several of the callers of apply_projection_to_path. I think that the work of removing other callers of that function could be postponed to a future release, but we'd still get some benefit now, and this shows the direction I have in mind.

I'm going to explain what the patches do one by one, but out of order, because I backed into the need for the earlier patches as a result of troubleshooting the later ones in the series. Note that the patches need to be applied in order even though I'm explaining them out of order.

0003 introduces a new upper relation to represent the result of applying the scan/join target to the topmost scan/join relation. I'll explain further down why this seems to be needed. Since each path must belong to only one relation, this involves using create_projection_path() for the non-partial pathlist as we already do for the partial pathlist, rather than apply_projection_to_path(). This is probably marginally more expensive, but I'm hoping it doesn't matter. (However, I have not tested.) Each non-partial path in the topmost scan/join rel produces a corresponding path in the new upper rel. The same is done for partial paths if the scan/join target is parallel-safe; otherwise we can't.

0004 causes the main part of the planner to skip calling generate_gather_paths() from the topmost scan/join rel. This logic is mostly stolen from your patch. If the scan/join target is NOT parallel-safe, we perform generate_gather_paths() on the topmost scan/join rel. If the scan/join target IS parallel-safe, we do it on the post-projection rel introduced by 0003 instead. This is the patch that actually fixes Jeff's original complaint.

0005 goes a bit further and removes a bunch of logic from create_ordered_paths(). The removed logic tries to satisfy the query's ordering requirement via cheapest_partial_path + Sort + Gather Merge. Instead, it adds an additional path to the new upper rel added in 0003. This again avoids a call to apply_projection_to_path() which could cause projection to be pushed down after costing has already been fixed. Therefore, it gains the same advantages as 0004 for queries that would use this sort of plan. Currently, this loses the ability to set limit_tuples for the Sort path; that doesn't look too hard to fix, but I haven't done it. If we decide to proceed with this overall approach, I'll see about getting it sorted out.

Unfortunately, when I initially tried this approach, a number of things broke due to the fact that create_projection_path() was not exactly equivalent to apply_projection_to_path(). This initially surprised me, because create_projection_plan() contains logic to eliminate the Result node that is very similar to the logic in apply_projection_to_path(). If apply_projection_to_path() sees that the subordinate node is projection-capable, it applies the revised target list to the child; if not, it adds a Result node. create_projection_plan() does the same thing. However, create_projection_plan() can lose the physical-tlist optimization for the subordinate node; it forces an exact projection even if the parent node doesn't require this.

0001 fixes this, so that we get the same plans with create_projection_path() that we would with apply_projection_to_path(). I think this patch is a good idea regardless of what we decide to do about the rest of this, because it can potentially avoid losing the physical-tlist optimization in any place where create_projection_path() is used.

It turns out that 0001 doesn't manage to preserve the physical-tlist optimization when the projection path is attached to an upper relation. 0002 fixes this.

If we decide to go forward with this approach, it may make sense to merge some of these when actually committing, but I found it useful to separate them for development and testing purposes, and for clarity about what was getting changed at each stage.

Please let me know your thoughts.

Thanks,

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:
0005-Remove-explicit-path-construction-logic-in-create_or.patch
0004-Postpone-generate_gather_paths-for-topmost-scan-join.patch
Re: JIT compiling with LLVM v11
On Mon, Mar 12, 2018 at 5:04 PM, Andres Freund wrote:
> Currently a handful of explain outputs in the regression tests change
> output when compiled with JITing. Therefore I'm thinking of adding
> JITINFO or such option, which can be set to false for those tests?

Can we spell that JIT or at least JIT_INFO? I realize that EXPLAIN (JIT OFF) may sound like it's intended to disable JIT itself, but I think it's pretty clear that EXPLAIN (BUFFERS OFF) does not disable the use of actual buffers, only the display of buffer-related information.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: proposal: schema variables
2018-03-13 10:54 GMT+01:00 Pavel Luzanov:
> Pavel Stehule wrote:
> > Now, there is one important question - storage - Postgres stores all
> > objects to files - only memory storage is not designed yet. This is the
> > part where I need help.
>
> Oh, I do not feel confident in such questions.
> Maybe you can get some ideas from an extension with similar functionality:
> https://github.com/postgrespro/pg_variables

Unfortunately not - it doesn't implement this functionality.

Regards

Pavel

> -
> Pavel Luzanov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
> --
> Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
neqjoinsel versus "refresh materialized view concurrently"
The following commit has caused a devastating performance regression in concurrent refresh of MVs:

commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
Author: Tom Lane
Date: Wed Nov 29 22:00:29 2017 -0500

    Fix neqjoinsel's behavior for semi/anti join cases.

The below reproduction goes from taking about 1 second to refresh, to taking an amount of time I don't have the patience to measure.

drop table foobar2 cascade;
create table foobar2 as select * from generate_series(1,200000);
create materialized view foobar3 as select * from foobar2;
create unique index on foobar3 (generate_series);
analyze foobar3;
refresh materialized view CONCURRENTLY foobar3;

When I interrupt the refresh, I get a message including this line:

CONTEXT: SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420 newdata WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16420 newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"

So it makes sense that the commit in question could have caused a change in the execution plan. Because these are temp tables, I can't easily get my hands on them to investigate further.

Cheers,

Jeff
Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath
On Mon, Feb 19, 2018 at 4:02 AM, David Rowley wrote:
> On 19 February 2018 at 18:01, David Rowley wrote:
>> On 19 February 2018 at 15:11, Tomas Vondra wrote:
>>> and perhaps we should do s/isproxy/is_proxy/ which seems like the usual
>>> naming for boolean variables.
>>
>> You're right. I'll change that and post an updated patch. I'll also
>> reprocess your email again and try to improve some comments to make
>> the intent of the code more clear.
>
> I've made a few changes to the patch. "isproxy" is now "is_proxy".
> I've made the comments a bit more verbose at the top of the definition
> of struct AppendPath. Also shuffled some comments around in allpaths.c
>
> I've attached both a delta patch with just these changes, and also a
> completely new patch based on a recent master.

While I was working on http://postgr.es/m/ca+tgmoakt5gmahbpwgqrr2nadfomaonoxyowhrdvfgws34t...@mail.gmail.com I noticed that a projection path is very close to being usable as a proxy path, because we've already got code to strip out unnecessary proxy paths at plan generation time. I noticed that there were a few problems with that, which I wrote patches to fix (see 0001, 0002 attached to that email), but they seem to be only minor issues. What do you think about the idea of using a projection path as a proxy path instead of inventing a new method? It seems simple enough to do:

    new_path = (Path *) create_projection_path(root, new_rel, old_path,
                                               old_path->pathtarget);

...when we need a proxy path.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: PATCH: Configurable file mode mask
On 03/13/2018 02:47 PM, Tom Lane wrote:
> Well, to be blunt, what we target is POSIX-compatible systems. If
> you're telling us to worry about non-POSIX filesystem semantics,
> and your name is not Microsoft, it's going to be a hard sell.
> We have enough to do just keeping up with that scope of target
> systems.

So, how many POSIX-compatible systems are available today (if any) where you can actually safely assume that there are no additional security/access-control-related considerations in effect beyond three user bits/three group bits/three other bits, and not be wrong?

I'm not advocating the Sisyphean task of having PG incorporate knowledge of all those possibilities. I'm advocating the conservative approach: have PG be that well-behaved application that those extended semantics are generally all designed to play well with, and just not do stuff that obstructs or tramples over what the admin takes care to set up.

On 03/13/2018 03:44 PM, Stephen Frost wrote:
> * Chapman Flack (c...@anastigmatix.net) wrote:
>> So my suggestion boils down to PG having at least an option, somehow,
>> to be well-behaved in that sense.
>
> I'm afraid that we haven't got any great answer to that "somehow". I
> was hoping you might have some other ideas beyond command-line
> switches which could leave the system in an inconsistent state a bit
> too easily.

I wonder how complicated it would really have to be. On any system with a POSIX base, I guess it's possible for PGDATA to have an S_ISVTX "sticky" bit in the mode, which does have an official significance (but one that only affects whether non-postgres users can rename or unlink things in the directory, which might be of little practical significance). Perhaps its meaning could be overloaded with "the admin is handling the permissions, thank you", and the postmaster and various command-line utilities could see it and refrain from any gratuitous chmods or refusals to function.

Or, if overloading S_ISVTX seems in poor taste, what would be wrong with simply checking for an empty file PERMISSIONS-ARE-MANAGED in PGDATA and responding the same way?

Or, assuming some form of ACL is available, just let the admin change the owner and group of PGDATA to other than postgres, grant no access to other, and give rwx to postgres in the ACL? PG could then reason as follows:

* I do not own this directory.
* I am not in the group of this directory.
* It grants no access to other.
* Yet, I find myself listing and accessing files in it without difficulty.
* The admin has set this up for me in a way I do not understand.
* I will refrain from messing with it.

Three ideas off the top of my head. Probably more where they came from. :)

-Chap
Re: parallel append vs. simple UNION ALL
On Tue, Mar 13, 2018 at 12:35 AM, Ashutosh Bapat wrote:
> It looks like it was not changed in all the places. make failed. I
> have fixed all the instances of these two functions in the attached
> patchset (only 0003 changes). Please check.

Oops. Thanks. I'm going to go ahead and commit 0001 here. Any more thoughts on the rest?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Re: Problem while setting the fpw with SIGHUP
On Fri, Mar 09, 2018 at 01:42:04PM +0530, Dilip Kumar wrote:
> While setting full_page_writes with SIGHUP I hit an assert in the checkpoint
> process. And that is because inside a CRITICAL section we are calling
> RecoveryInProgress, which in turn allocates memory. So I have moved the
> RecoveryInProgress call out of the CRITICAL section and the problem got
> solved.

Indeed, I can see how this is possible. If you apply the following you can also have way more fun, but that's overdoing it:

--- a/src/backend/access/transam/xlog.c
+++ b/src/backend/access/transam/xlog.c
@@ -7918,6 +7918,8 @@ CheckRecoveryConsistency(void)
 bool
 RecoveryInProgress(void)
 {
+	Assert(CritSectionCount == 0);

Anyway, it seems to me that you are not taking care of all possible race conditions here. RecoveryInProgress() could as well be called in XLogFlush(), and that's a code path taken during redo. Instead of doing what you are suggesting, why not move InitXLogInsert() out of InitXLOGAccess() and change InitPostgres() so that the allocation for WAL inserts is done unconditionally? This has the cost of also making this allocation even for backends which are started during recovery; still, we are talking about allocating a couple of bytes in exchange for addressing completely all race conditions in this area. InitXLogInsert() does not depend on any post-recovery data like ThisTimeLineId, so a split is possible.

Thoughts?
-- 
Michael
Re: PATCH: Unlogged tables re-initialization tests
On Mon, Mar 12, 2018 at 02:33:18PM -0400, David Steele wrote:
> On 3/12/18 11:59 AM, Dagfinn Ilmari Mannsåker wrote:
>> However, that is still wrong, because die() and BAIL_OUT() mean
>> different things: die() aborts the current test script, while BAIL_OUT()
>> aborts the entire 'prove' run, i.e. all subsequent scripts in the same
>> test directory.
>
> If that's the case, do we really want to abort all subsequent test
> modules if a single module fails? I'm good either way, just throwing it
> out there for consideration.

What I am getting is that in those code paths we want the test series to die in a painful way, which should be reserved for tests where things are critically failing. In that case, as documented by Test::More, we should use BAIL_OUT. It is true that using die() has the advantage that one can look at multiple failures in a single run when you want to look for similar failure patterns; however, it makes debugging harder if you have a lot of test scripts, because it becomes harder to track which one was the first to fail in a parallel test. Also, if you look at all the code paths calling die() in the test suites, those refer to critical failures which should cause an immediate stop of the test. So my take would be to switch all die() calls to BAIL_OUT() in order to make all this code consistent.
-- 
Michael
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
On Mon, 12 Mar 2018 12:21:34 -0400 Tom Lane wrote:
> I wrote:
> > Maybe this type of situation is an argument for trusting an ANALYZE-based
> > estimate more than the VACUUM-based estimate. I remain uncomfortable with
> > that in cases where VACUUM looked at much more of the table than ANALYZE
> > did, though. Maybe we need some heuristic based on the number of pages
> > actually visited by each pass?
>
> I looked into doing something like that. It's possible, but it's fairly
> invasive; there's no clean way to compare those page counts without
> altering the API of acquire_sample_rows() to pass back the number of pages
> it visited. That would mean a change in FDW-visible APIs. We could do
> that, but I don't want to insist on it when there's nothing backing it up
> except a fear that *maybe* ANALYZE's estimate will be wrong often enough
> to worry about.
>
> So at this point I'm prepared to go forward with your patch, though not
> to risk back-patching it. Field experience will tell us if we need to
> do more. I propose the attached cosmetic refactoring, though.

I like the refactor. Making vac_estimate_reltuples() specific to the special case of vacuum, and keeping the normal analyze case just in analyze, seems like an improvement overall.

If it helps, I have been experimenting with your thought experiment (update the first 20% of rows, then delete 50% of those) to try to trick analyze. I built test scripts and generated data and found that the current system after vacuum is usually about 8% to 10% off on reltuples. Analyze moves it very slowly closer to the true count. With the patch, analyze nails it immediately.

To see how this was affected by the relationship of table size and sample, I created another test setup just to iterate analyze runs and compare to the true count. After a couple thousand analyzes with various table mutations, table sizes up to 100 M rows (1.8 M pages), and default_statistics_target ranging from 1 to 1000, I am pretty confident that we can get 2% accuracy for any size table even with the old statistics target. The table size does not really matter much; the error drops and clusters more tightly as sample size increases, but past 10000 or so it's well into diminishing returns.

Summary of 100 analyze runs for each line below. Errors and sample fraction are in percent for easy reading.

 testcase   | Mrows | stats |  pages  | sample | fraction | maxerr | avgerr | stddev
------------+-------+-------+---------+--------+----------+--------+--------+--------
 first-last |    10 |     1 |  163935 |    300 | 0.001830 | 6.6663 | 2.3491 | 2.9310
 first-last |    10 |     3 |  163935 |    900 | 0.005490 | 3.8886 | 1.2451 | 1.5960
 first-last |    10 |    10 |  163935 |   3000 | 0.018300 | 2.8337 | 0.7539 | 0.9657
 first-last |    10 |    33 |  163935 |   9900 | 0.060390 | 1.4903 | 0.3723 | 0.4653
 first-last |    10 |   100 |  163935 |  30000 | 0.182999 | 0.6580 | 0.2221 | 0.2707
 first-last |    10 |   333 |  163935 |  99900 | 0.609388 | 0.1960 | 0.0758 | 0.0897
 first-last |   100 |     1 | 1639345 |    300 | 0.000183 | 8.7500 | 2.2292 | 2.8685
 first-last |   100 |     3 | 1639345 |    900 | 0.000549 | 5.4166 | 1.1695 | 1.5431
 first-last |   100 |    10 | 1639345 |   3000 | 0.001830 | 1.7916 | 0.6258 | 0.7593
 first-last |   100 |    33 | 1639345 |   9900 | 0.006039 | 1.8182 | 0.4141 | 0.5433
 first-last |   100 |   100 | 1639345 |  30000 | 0.018300 | 0.9417 | 0.2464 | 0.3098
 first-last |   100 |   333 | 1639345 |  99900 | 0.060939 | 0.4642 | 0.1206 | 0.1542
 first-last |   100 |  1000 | 1639345 | 300000 | 0.183000 | 0.2192 | 0.0626 | 0.0776
 un-updated |    10 |     1 |  180328 |    300 | 0.001664 | 7.9259 | 2.2845 | 2.7806
 un-updated |    10 |     3 |  180328 |    900 | 0.004991 | 4.2964 | 1.2923 | 1.5990
 un-updated |    10 |    10 |  180328 |   3000 | 0.016636 | 2.2593 | 0.6734 | 0.8271
 un-updated |    10 |    33 |  180328 |   9900 | 0.054900 | 0.9260 | 0.3305 | 0.3997
 un-updated |    10 |   100 |  180328 |  30000 | 0.166364 | 1.0162 | 0.2024 | 0.2691
 un-updated |    10 |   333 |  180328 |  99900 | 0.553991 | 0.2058 | 0.0683 | 0.0868
 un-updated |   100 |     1 | 1803279 |    300 | 0.000166 | 7.0000 | 1.8793 | 2.3820
 un-updated |   100 |     3 | 1803279 |    900 | 0.000499 | 3.8889 | 1.0586 | 1.3265
 un-updated |   100 |    10 | 1803279 |   3000 | 0.001664 | 2.1407 | 0.6710 | 0.8376
 un-updated |   100 |    33 | 1803279 |   9900 | 0.005490 | 1.1728 | 0.3779 | 0.4596
 un-updated |   100 |   100 | 1803279 |  30000 | 0.016636 | 0.6256 | 0.1983 | 0.2551
 un-updated |   100 |   333 | 1803279 |  99900 | 0.055399 | 0.3454 | 0.1181 | 0.1407
 un-updated |   100 |  1000 | 1803279 | 300000 | 0.166364 | 0.1738 | 0.0593 | 0.0724

I also thought about the theory and am confident that there really is no way to trick it. Basically if there are enough
Re: Ambigous Plan - Larger Table on Hash Side
On Mon, Mar 12, 2018 at 10:02 PM, Narendra Pradeep U U wrote:
> Hi,
>
> Recently I came across a case where the planner chose the larger table on
> the hash side. I am not sure whether it is intended behavior or we are
> missing something.
>
> I have two tables (a and b), each with a single column. One table 'a' is
> large, with around 30 million distinct rows, and the other table 'b' has
> merely 70,000 rows with one-seventh (10,000) distinct rows. I have analyzed
> both tables. But while joining the two tables I get the larger table on
> the hash side.
>
> tpch=# explain select b from b left join a on a = b;
>                                  QUERY PLAN
> ---------------------------------------------------------------------------
>  Hash Left Join  (cost=824863.75..950104.42 rows=78264 width=4)
>    Hash Cond: (b.b = a.a)
>    ->  Foreign Scan on b  (cost=0.00..821.64 rows=78264 width=4)
>          CStore File: /home/likewise-open/pg96/data/cstore_fdw/1818708/1849879
>          CStore File Size: 314587
>    ->  Hash  (cost=321721.22..321721.22 rows=30667722 width=4)
>          ->  Foreign Scan on a  (cost=0.00..321721.22 rows=30667722 width=4)
>                CStore File: /home/likewise-open/pg96/data/cstore_fdw/1818708/1849876
>                CStore File Size: 123236206
> (9 rows)
>
> I would like to know the reason for choosing this plan, and is there an
> easy fix to prevent such plans (especially like this one, where it chooses
> a larger hash table)?

A plan with the larger table being hashed isn't necessarily a bad-performing one. During partition-wise join analysis I have seen plans with the larger table being hashed perform better than plans with the smaller table being hashed. But I have seen the other way around as well. Although I don't know an easy way to force which side of the join gets hashed, I tried that under the debugger. In your case, if you run EXPLAIN ANALYZE on this query and produce outputs of two plans — one with the larger table being hashed and a second with the smaller one being hashed — you will see which of them performs better.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: [WIP PATCH] Index scan offset optimisation using visibility map
Tom, thanks for inspecting the patch. There are so many problems that slipped my attention... But one thing that bothers me most is the problem with predicate locking.

> On 13 March 2018 at 0:55, Tom Lane wrote:
>
> The PredicateLockPage call also troubles me quite a bit, not only from
> a modularity standpoint but because that implies a somewhat-user-visible
> behavioral change when this optimization activates. People who are using
> serializable mode are not going to find it to be an improvement if their
> queries fail and need retries a lot more often than they did before.
> I don't know if that problem is bad enough that we should disable skipping
> when serializable mode is active, but it's something to think about.

Users already have to expect this if the scan turns into an IoS somewhat accidentally. There will be page predicate locks with possible false-positive conflicts. I'm not sure that keeping the existing tuple-level lock granularity is necessary. I think we can do it if we introduce a PredicateLockTuple variant that does not require the tuple's xmin, i.e. takes only a tid and does not look into the heap. But this tweak seems outside of the patch's scope, and I believe it's better to avoid messing with SSI internals without a strong reason.

Best regards, Andrey Borodin.
Re: PATCH: Configurable file mode mask
On Mon, Mar 12, 2018 at 03:14:13PM -0400, Stephen Frost wrote:
> We already had a discussion about having a GUC for this and concluded,
> rightly in my view, that it's not sensible to have since we don't want
> all of the various tools having to read and parse out postgresql.conf.

If the problem is parsing, it could as well be more portable to put that in the control file, no? For example, I have finished implementing my own flavor of pg_controldata to save parsing effort, so that it prints control file fields on a per-call basis using individual fields; this also avoided games with locales, as translations of pg_controldata can disturb the parsing logic.

> I don't see anything in the discussion which has changed that and I
> don't agree that there's an issue with using the privileges on the data
> directory for this- it's a simple solution which all of the tools can
> use and work with easily. I certainly don't agree that it's a serious
> issue to relax the explicit check- it's just a check, which a user could
> implement themselves if they wished to and had a concern for. On the
> other hand, with the explicit check, we are actively preventing an
> entirely reasonable goal of wanting to use a read-only role to perform a
> backup of the system.

Well, one thing is that the current checks in the postmaster make sure that a data folder is never using anything other than 0700. From a security point of view, making it possible to allow a postmaster to start with 0750 is a step backwards if users don't authorize it explicitly. There are a lot of systems which use a bunch of users with only a single group with systemd, so this would remove an existing safeguard. I am not against the idea of this thread, just that I think that secure defaults should be user-enforceable if they want Postgres to behave so.
-- 
Michael
RE: Temporary tables prevent autovacuum, leading to XID wraparound
From: Tom Lane [mailto:t...@sss.pgh.pa.us] > So the correct fix is to improve autovacuum's check to discover whether > an old temp table is orphaned, so that it isn't fooled by putative owner > processes that are connected to some other DB. Step 2 of the proposed patch > tries to do that, but it's only half right: we need a change near line 2264 > not only line 2090. (Evidently, we need either a comment that the logic > is repeated, or else refactor so that there's only one copy.) > > Now, you can argue that autovacuum's check can be fooled by an "owner" > backend that is connected to the current DB but hasn't actually taken > possession of its assigned temp schema (and hence the table in question > really is orphaned after all). This edge case could be taken care of by > having backends clear their temp schema immediately, as in step 1 of the > patch. But I still think that that is an expensive way to catch what would > be a really infrequent case. Perhaps a better idea is to have backends > advertise in PGPROC whether they have taken possession of their assigned > temp schema, and then autovacuum could ignore putative owners that aren't > showing that flag. Or we could just do nothing about that, on the grounds > that nobody has shown the case to be worth worrying about. > Temp schemas that are sufficiently low-numbered to be likely to have an > apparent owner are also likely to get cleaned out by actual temp table > creation reasonably often, so I'm not at all convinced that we need a > mechanism that covers this case. We do need something for the cross-DB > case though, because even a very low-numbered temp schema can be problematic > if it's in a seldom-used DB, which seems to be the case that was complained > of originally. > > On the whole, my vote is to fix and apply step 2, and leave it at that. Done. It seems to work well. Regards Takayuki Tsunakawa reset_temp_schema_v2.patch Description: reset_temp_schema_v2.patch
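The cross-DB condition Tom describes can be modeled in a few lines. This is only a toy sketch of the decision rule, not the autovacuum code: a temp table counts as orphaned when its apparent owner backend is gone, or is connected to a database other than the one holding the table.

```python
def temp_table_is_orphaned(table_db, owner_backend_id, backends):
    """Toy version of the improved autovacuum check: the putative owner
    must exist AND be connected to the same database as the temp table."""
    owner = backends.get(owner_backend_id)
    if owner is None:
        return True                       # owner backend no longer exists
    return owner["database"] != table_db  # owner is busy in some other DB

backends = {1: {"database": "appdb"}, 2: {"database": "reports"}}

assert not temp_table_is_orphaned("appdb", 1, backends)  # genuine owner
assert temp_table_is_orphaned("appdb", 2, backends)      # owner in another DB
assert temp_table_is_orphaned("appdb", 3, backends)      # owner gone entirely
```

The remaining edge case from the thread, an owner connected to the right database but not yet in possession of its temp schema, would need an extra flag per backend entry, as Tom suggests.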
Re: proposal: schema variables
Pavel Stehule wrote: > Now, there is one important question - storage - Postgres stores all > objects to files - only memory storage is not designed yet. This is part, > where I need a help. Oh, I do not feel confident about such questions. Maybe you can get some ideas from an extension with similar functionality: https://github.com/postgrespro/pg_variables - Pavel Luzanov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company -- Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html
Re: Ambigous Plan - Larger Table on Hash Side
Hi, Thanks everyone for your suggestions. I would like to add EXPLAIN ANALYZE output for both plans so that we can have a broader picture. I have a work_mem of 1000 MB. The plan we get regularly, with the tables analyzed: tpch=# explain analyze select b from tab2 left join tab1 on a = b; QUERY PLAN Hash Left Join (cost=945515.68..1071064.34 rows=78264 width=4) (actual time=9439.410..20445.620 rows=78264 loops=1) Hash Cond: (tab2.b = tab1.a) -> Seq Scan on tab2 (cost=0.00..1129.64 rows=78264 width=4) (actual time=0.006..5.116 rows=78264 loops=1) -> Hash (cost=442374.30..442374.30 rows=30667630 width=4) (actual time=9133.593..9133.593 rows=30667722 loops=1) Buckets: 33554432 Batches: 2 Memory Usage: 801126kB -> Seq Scan on tab1 (cost=0.00..442374.30 rows=30667630 width=4) (actual time=0.030..3584.652 rows=30667722 loops=1) Planning time: 0.055 ms Execution time: 20472.603 ms (8 rows) I reproduced the other plan by not analyzing the smaller table. tpch=# explain analyze select b from tab2 left join tab1 on a = b; QUERY PLAN -- Hash Right Join (cost=2102.88..905274.97 rows=78039 width=4) (actual time=15.331..7590.406 rows=78264 loops=1) Hash Cond: (tab1.a = tab2.b) -> Seq Scan on tab1 (cost=0.00..442375.48 rows=30667748 width=4) (actual time=0.046..2697.480 rows=30667722 loops=1) -> Hash (cost=1127.39..1127.39 rows=78039 width=4) (actual time=15.133..15.133 rows=78264 loops=1) Buckets: 131072 Batches: 1 Memory Usage: 3776kB -> Seq Scan on tab2 (cost=0.00..1127.39 rows=78039 width=4) (actual time=0.009..5.516 rows=78264 loops=1) Planning time: 0.053 ms Execution time: 7592.688 ms (8 rows) The plan chosen after analyzing is actually slower. The smaller table (tab2) has each row duplicated exactly 8 times, and all the rows in the larger table (tab1) are distinct. What may be the exact reason, and can we fix this?
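For intuition about why the second plan wins, here is a toy hash join (scaled-down row counts; nothing like the PostgreSQL executor): both build orders return the same result, but building on the duplicate-heavy small side yields a far smaller hash table, which is what lets the second plan above run in one 3776kB batch instead of two ~800MB ones.

```python
def hash_join(build_side, probe_side):
    """Toy in-memory hash join; returns (match count, hash-table entries)."""
    ht = {}
    for key in build_side:                  # build phase
        ht[key] = ht.get(key, 0) + 1
    matches = sum(ht.get(key, 0) for key in probe_side)  # probe phase
    return matches, len(ht)

big = list(range(300_000))                  # all-distinct keys, like tab1
small = [i % 1_000 for i in range(7_800)]   # heavily duplicated, like tab2

m1, entries_small_build = hash_join(small, big)
m2, entries_big_build = hash_join(big, small)

assert m1 == m2                                  # same join result either way
assert entries_small_build < entries_big_build   # but memory use is not the same
```

Which order is actually faster also depends on probe cost and batching, which is why, as noted downthread, hashing the larger side is not automatically a bad plan.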
P.S. I have also attached a SQL file to reproduce this. On Tue, 13 Mar 2018 12:42:12 +0530 Ashutosh Bapat ashutosh.ba...@enterprisedb.com wrote: On Mon, Mar 12, 2018 at 10:02 PM, Narendra Pradeep U U narendra.prad...@zohocorp.com wrote: Hi, Recently I came across a case where the planner chose the larger table on the hash side. I am not sure whether it is intended behavior or we are missing something. I have two tables (a and b), each with a single column. One table, 'a', is large with around 30 million distinct rows, and the other table, 'b', has merely 70,000 rows with one-seventh (10,000) distinct values. I have analyzed both tables. But while joining the two tables I get the larger table on the hash side. tpch=# explain select b from b left join a on a = b; QUERY PLAN - Hash Left Join (cost=824863.75..950104.42 rows=78264 width=4) Hash Cond: (b.b = a.a) -> Foreign Scan on b (cost=0.00..821.64 rows=78264 width=4) CStore File: /home/likewise-open/pg96/data/cstore_fdw/1818708/1849879 CStore File Size: 314587 -> Hash (cost=321721.22..321721.22 rows=30667722 width=4) -> Foreign Scan on a (cost=0.00..321721.22 rows=30667722 width=4) CStore File: /home/likewise-open/pg96/data/cstore_fdw/1818708/1849876 CStore File Size: 123236206 (9 rows) I would like to know the reason for choosing this plan, and is there an easy fix to prevent such plans (especially like this one, where it chooses the larger table to hash)? A plan with the larger table being hashed isn't necessarily a bad-performing one. During partition-wise join analysis I have seen plans with the larger table being hashed perform better than plans with the smaller table being hashed. But I have seen the other way around as well. I don't know an easy way to force which side of the join gets hashed, though; I tried that under the debugger. In your case, if you run EXPLAIN ANALYZE on this query and produce the outputs of the two plans, one with the larger table being hashed and a second with the smaller one being hashed, you will see which of them performs better.
-- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company join_plan.sql Description: Binary data
Re: SQL/JSON: functions
On 7 March 2018 at 14:34, Nikita Glukhov wrote: > Attached 12th version of SQL/JSON patches rebased onto the latest master. Please write some docs or notes to go with this. If you drop a big pile of code with no explanation it will just be ignored. I think many people want SQL/JSON, but the purpose of a patch submission is to allow a committer to review and commit without needing to edit anything. It shouldn't be like assembling flat pack furniture while wearing a blindfold. -- Simon Riggs http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: Google Summer of Code: Potential Applicant
Hello everyone, > > I am mostly interested in anything that requires C/C++ implementation and > > AlgoDS. > > > > For that reason I would love to work in any of the following (in that > > order of preference): > > > >1. Sorting algorithms benchmark and implementation > >2. Enhancing amcheck for all AMs > >3. TOAST'ing in slices > >4. Thrift datatype support > > > Having recently worked with Thrift, I recommend ... don't use Thrift. The > library is awkward to work with, it isn't very source-compatible across > versions. > > Consider protobuf instead. Craig, I believe you were probably doing something wrong if you had to work with the library directly. Normally you generate classes from a text description and just use them. I worked with Thrift some time ago, in 2015 [1]. I wouldn't call it awkward. Protobuf is fine too, but unfortunately we don't have any Protobuf-related projects this time. Also it's probably worth noting that the GSoC project doesn't imply using any existing libraries, only the binary format, which is quite stable. Christos, I appreciate your interest in the Thrift-related project. You should know however that we already have a student interested in it [2]. Feel free to apply for it as well, but in this case be prepared for a little competition. [1]: https://github.com/afiskon/scala-thrift-example/blob/master/src/test/scala/me/eax/examples/thrift/tests/BinaryProtocol.scala#L15 [2]: https://postgr.es/m/CA%2BSXE9sP1iHNp9_DFJzdbE0cszAA-QF8d-8GAUyoCA4q9KCsGw%40mail.gmail.com -- Best regards, Aleksander Alekseev signature.asc Description: PGP signature
Re: Using base backup exclusion filters to reduce data transferred with pg_rewind
05.02.2018 10:10, Michael Paquier: So the patch set attached is made of the following: - 0001, which refactors all hardcoded system paths into pg_paths.h. This modifies only initdb.c and basebackup.c to ease reviews. - 0002 spreads the path changes and the use of pg_paths.h across the core code. - 0003 moves the last set of definitions with backup_label, tablespace_map and pg_internal.init. - 0004 creates basebackup_paths.h, this can be consumed by pg_rewind. - 0005 makes the changes for pg_rewind. Thank you for this set of patches. This refactoring makes the code way more convenient to read and change. Due to some merge conflicts, patch 0002 was not applying cleanly, so I attach the updated version. I also noticed a couple of lines that were not updated, and wrote a patch 0006, which contains just minor changes and can be applied on top of any patch after 0003. Since these patches contain mostly cosmetic changes and do not break anything, I think it's fine to mark this thread as Ready For Committer without long discussion. -- Anastasia Lubennikova Postgres Professional: http://www.postgrespro.com The Russian Postgres Company >From 6698d430ffb6bd2e33aba0b21dc2a9358cf6328c Mon Sep 17 00:00:00 2001 From: Michael Paquier Date: Mon, 5 Feb 2018 13:03:38 +0900 Subject: [PATCH 1/5] Refactor path definitions into a single header file, pg_paths.h The definitions of all internal paths of PostgreSQL are spread across the code base, making tracing of those definitions difficult to begin with, and resulting in a lot of code paths being hardcoded. While this has no real consequences in practice, this makes the creation of lists manipulating those paths less maintainable, as things stand the already-hardcoded paths need to be copied around more. In order to improve things for the long-term, move all those definitions into a single header file. This commit does a first step by switching basebackup.c and initdb.c to use them.
An upcoming commit will make all the definitions across the code base use this new header more consistently. --- src/backend/postmaster/postmaster.c | 1 + src/backend/postmaster/syslogger.c | 1 + src/backend/replication/basebackup.c | 16 ++-- src/backend/storage/ipc/dsm.c| 1 + src/backend/storage/ipc/dsm_impl.c | 1 + src/backend/storage/lmgr/predicate.c | 3 +- src/backend/utils/adt/misc.c | 1 + src/bin/initdb/initdb.c | 45 ++-- src/include/access/xlog_internal.h | 7 +- src/include/pg_paths.h | 137 +++ src/include/pgstat.h | 3 - src/include/postmaster/syslogger.h | 7 -- src/include/storage/dsm_impl.h | 9 --- src/include/utils/guc.h | 7 -- 14 files changed, 176 insertions(+), 63 deletions(-) create mode 100644 src/include/pg_paths.h diff --git a/src/backend/postmaster/postmaster.c b/src/backend/postmaster/postmaster.c index f3ddf828bb..66d80914a0 100644 --- a/src/backend/postmaster/postmaster.c +++ b/src/backend/postmaster/postmaster.c @@ -105,6 +105,7 @@ #include "libpq/pqsignal.h" #include "miscadmin.h" #include "pg_getopt.h" +#include "pg_paths.h" #include "pgstat.h" #include "port/pg_bswap.h" #include "postmaster/autovacuum.h" diff --git a/src/backend/postmaster/syslogger.c b/src/backend/postmaster/syslogger.c index f70eea37df..c8770f6c61 100644 --- a/src/backend/postmaster/syslogger.c +++ b/src/backend/postmaster/syslogger.c @@ -35,6 +35,7 @@ #include "libpq/pqsignal.h" #include "miscadmin.h" #include "nodes/pg_list.h" +#include "pg_paths.h" #include "pgstat.h" #include "pgtime.h" #include "postmaster/fork_process.h" diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c index dd7ad64862..63ab81f837 100644 --- a/src/backend/replication/basebackup.c +++ b/src/backend/replication/basebackup.c @@ -117,25 +117,25 @@ static const char *excludeDirContents[] = * even if the intention is to restore to another master. See backup.sgml * for a more detailed description. 
*/ - "pg_replslot", + PG_REPLSLOT_DIR, /* Contents removed on startup, see dsm_cleanup_for_mmap(). */ PG_DYNSHMEM_DIR, /* Contents removed on startup, see AsyncShmemInit(). */ - "pg_notify", + PG_NOTIFY_DIR, /* * Old contents are loaded for possible debugging but are not required for * normal operation, see OldSerXidInit(). */ - "pg_serial", + PG_SERIAL_DIR, /* Contents removed on startup, see DeleteAllExportedSnapshotFiles(). */ - "pg_snapshots", + PG_SNAPSHOTS_DIR, /* Contents zeroed on startup, see StartupSUBTRANS(). */ - "pg_subtrans", + PG_SUBTRANS_DIR, /* end of list */ NULL @@ -147,7 +147,7 @@ static const char *excludeDirContents[] = static const char *excludeFiles[] = { /* Skip auto conf temporary file. */ - PG_AUTOCONF_FILENAME ".tmp", + PG_AUTOCONF_FILENAME_TMP, /* Skip current log file temporary file */
Re: Additional Statistics Hooks
On 13 March 2018 at 11:44, Tom Lane wrote: > While it would certainly be nice to have better behavior for that, > "add a hook so users who can write C can fix it by hand" doesn't seem > like a great solution. On top of the sheer difficulty of writing a > hook function, you'd have the problem that no pre-written hook could > know about all available functions. I think somehow we'd need a way > to add per-function knowledge, perhaps roughly like the protransform > feature. I always imagined that extended statistics could be used for this. Right now the estimates are much better when you create an index on the function, but there's no real reason to limit the stats that are gathered to just plain columns + expression indexes. I believe I'm not the only person to have considered this. Originally extended statistics were named multivariate statistics. I think it was Dean and I (maybe others too) that suggested to Tomas to give the feature a more generic name so that it can be used for a more general purpose later. -- David Rowley http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Training & Services
Re: MCV lists for highly skewed distributions
Hi Dean, I've looked over your patch briefly, but not tested it yet. > [construction of a pathological data set] > So the table has around 31 million rows, and the stats target is maxed > out -- we're sampling around 10% of the table, and it's not possible > to sample more. Looking at the value a=5, it appears 102 times in the > table, so we can expect it to be seen around 10 times in any sample, > but the minimum count for the new algorithm in this case is 23, so it > won't be included in the MCV list. This leads to it having the same > estimate as all other non-MCV values, which turns out to be rows=5, a > considerable under-estimate. > So arguably, the new algorithm is still better than the current one > for this data, but the fact that it doesn't include a=5 in the list > bugs me, because the estimate for that single value is consistently > worse. Looking at the distribution of a=5 in the sample, it is a > hypergeometric distribution with N=31000000, n=3000000 and K=102. This > gives a sample mean of 9.8 and a variance of 8.9 (a standard deviation > of 3.0). Mean = n * K / N = 9.8 Var = (K / N) * (1 - K / N) * n * (N - n) / (N - 1) = 8.9 Stdev = sqrt(Var) = 3.0 Got it. > Also, this is common enough that in fact that distribution > can be reasonably approximated by a normal distribution. For the archives, because it's typically seen 10 times in the sample, per the rule of thumb mentioned upthread? > The value is > excluded from the MCV list because the spread is deemed too large, > relative to the mean, so the relative error of the MCV-based estimate > would be too high. However, not including it in the MCV list causes it > to have an estimate of 5, which would correspond to a sample mean of > 0.5, and the observed sample mean is more than 3 standard deviations
So we do actually have enough information to > conclude that the value is almost certainly more common than the > non-MCV selectivity would suggest, and I think that's sufficient > justification for including it in the MCV list. > The crux of this is that the relative-standard-error algorithm is > making use of 2 pieces of information -- the sample frequency, and an > estimate of its statistical spread. However, there is a 3rd piece of > information that we know, that is not being made use of -- the > selectivity that would be assigned to the value if it were not > included in the MCV list. Looking at just the first 2 pieces of > information ensures that we get decent estimates for the values in the > MCV list (where what constitutes "decent" depends on an arbitrarily > chosen RSE cutoff), but it doesn't take into account the fact that the > values not in the list may have much worse estimates. Making use of > the non-MCV-based estimate allows us to do better -- if the MCV-based > estimate is statistically significantly higher than the non-MCV-based > estimate, then it would almost certainly be better to include that > value in the MCV list, even if its spread is quite large. Because we can treat the sample as normal, testing for > 2 standard deviations means that we're "~95% sure" it's more common in the table than the non-MCV selectivity would suggest, right? (I realize I'm not using technically accurate language) In your v1 patch, we didn't need a clamp on 10 values in the sample to treat it as normal, because it was mathematically impossible to pass the RSE test with less than 10 unless we were sampling most of the table, in which case it didn't matter. Is there a similar justification with the new algorithm? > [results summary] > > So HEAD and P1 are both producing too many MCVs. The latest 2 patches > cure that problem, but the new v2 patch is better at picking out all > the common values. Looks good. 
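For anyone checking the numbers, Dean's figures and the proposed inclusion rule can be reproduced directly from the hypergeometric formulas quoted above. The 2-standard-deviation cutoff below is my reading of the "statistically significantly higher" test, so treat it as a sketch rather than the patch's exact rule:

```python
import math

def sample_count_stats(N, n, K):
    """Mean and stddev of a value's count in an n-row sample drawn without
    replacement from an N-row table in which the value occurs K times."""
    mean = n * K / N
    var = (K / N) * (1 - K / N) * n * (N - n) / (N - 1)
    return mean, math.sqrt(var)

# Dean's example: 31M-row table, ~10% sample, value a=5 occurs 102 times.
mean, stdev = sample_count_stats(31_000_000, 3_000_000, 102)
# mean comes out near 9.87 and stdev near 3.0, matching the mail.

# The non-MCV estimate of rows=5 corresponds to a sample mean of ~0.5;
# the observed count is far more than 2 stddevs above that, so include a=5.
non_mcv_sample_mean = 0.5
include_in_mcv = mean - 2 * stdev > non_mcv_sample_mean
```

The point of the exercise: the value fails the pure relative-standard-error cutoff (stdev/mean ≈ 0.3), yet the comparison against the non-MCV estimate still argues for keeping it.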
I'll run some tests of my own this week, trying to find corner cases different from yours. As far as the code goes, I haven't studied it very closely yet, but I did want to call attention to this comment: + /* +* If the value is kept in the MCV list, its population frequency is +* assumed to equal its sample frequency, and the distribution of the +* value's count in the sample is a hypergeometric distribution with +* the following standard deviation. +*/ The part after "and" doesn't seem to follow from the first part. Would you mind clarifying? -John Naylor
Re: Additional Statistics Hooks
On Tue, Mar 13, 2018 at 4:14 AM, Tom Lane wrote: > Mat Arye writes: >> So the use-case is an analytical query like > >> SELECT date_trunc('hour', time) AS MetricMinuteTs, AVG(value) as avg >> FROM hyper >> WHERE time >= '2001-01-04T00:00:00' AND time <= '2001-01-05T01:00:00' >> GROUP BY MetricMinuteTs >> ORDER BY MetricMinuteTs DESC; > >> Right now this query will choose a much-less-efficient GroupAggregate plan >> instead of a HashAggregate. It will choose this because it thinks the >> number of groups >> produced here is 9,000,000 because that's the number of distinct time >> values there are. >> But, because date_trunc "buckets" the values there will be about 24 groups >> (1 for each hour). > > While it would certainly be nice to have better behavior for that, > "add a hook so users who can write C can fix it by hand" doesn't seem > like a great solution. On top of the sheer difficulty of writing a > hook function, you'd have the problem that no pre-written hook could > know about all available functions. I think somehow we'd need a way > to add per-function knowledge, perhaps roughly like the protransform > feature. Just as a cost is associated with a function, we could associate a mapping cardinality with a function. It tells how many distinct input values map to 1 output value. By input value, I mean the input argument tuple. In Mat's case the mapping cardinality will be 12. The number of distinct values that the function may output is then estimated as the estimated number of rows divided by the mapping cardinality of that function. -- Best Wishes, Ashutosh Bapat EnterpriseDB Corporation The Postgres Database Company
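The mapping-cardinality idea is easy to sketch numerically. The numbers below are mine, not Mat's: one reading every 10 seconds means 360 distinct timestamps map to each hour bucket, and since every row carries a distinct timestamp here, rows and distinct inputs coincide, so dividing by the mapping cardinality recovers the true group count that the planner currently misses:

```python
from datetime import datetime, timedelta

# 9,000 distinct timestamps, one every 10 seconds, starting on the hour.
start = datetime(2001, 1, 4)
times = [start + timedelta(seconds=10 * i) for i in range(9_000)]

# What GROUP BY date_trunc('hour', time) actually produces:
actual_groups = len({t.replace(minute=0, second=0, microsecond=0) for t in times})

# The proposed estimate: rows (= distinct inputs here) / mapping cardinality.
mapping_cardinality = 360  # 10-second ticks per hour -- an assumed statistic
estimated_groups = len(set(times)) / mapping_cardinality

assert actual_groups == 25
assert estimated_groups == 25.0
```

With the naive distinct-input estimate the planner would expect 9,000 groups; the corrected estimate of 25 is what would tip it toward a HashAggregate.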
Re: Google Summer of Code: Potential Applicant
Hi all, First of all, I appreciate your insights, Craig, but I think I will stick with AlgoDS ;) Aleksander, I am mostly interested in the sorting algorithms benchmark and implementation one. I will start writing a proposal soon enough. Do you have any project-related insights as to what I should put in there? Thanks a lot in advance! On Tue, Mar 13, 2018 at 11:55 AM, Aleksander Alekseev < a.aleks...@postgrespro.ru> wrote: > Hello everyone, > > > > I am mostly interested in anything that requires C/C++ implementation > and > > > AlgoDS. > > > > > > For that reason I would love to work in any of the following (in that > > > order of preference): > > > > > >1. Sorting algorithms benchmark and implementation > > >2. Enhancing amcheck for all AMs > > >3. TOAST'ing in slices > > >4. Thrift datatype support > > > > > Having recently worked with Thrift, I recommend ... don't use Thrift. The > > library is awkward to work with, it isn't very source-compatible across > > versions. > > > > Consider protobuf instead. > > Craig, I believe you probably did something wrong if you had to work > with some library directly. Actually you generate classes from text > description and just use them. I worked with Thrift some time ago, in > 2015 [1]. I wouldn't call it awkward. Protobuf is fine too, but > unfortunately we don't have any Protobuf-related projects this time. > Also it's probably worth noticing that the GSoC project doesn't imply > using any existing libraries, only the binary format which is quite > stable. > > Christos, I appreciate your interest in the Thrift-related project. You > should know however that we already have a student interested in it [2]. > Feel free to apply for it as well but in this case be prepared for a > little competition.
> > [1]: https://github.com/afiskon/scala-thrift-example/blob/ > master/src/test/scala/me/eax/examples/thrift/tests/ > BinaryProtocol.scala#L15 > [2]: https://postgr.es/m/CA%2BSXE9sP1iHNp9_DFJzdbE0cszAA- > QF8d-8GAUyoCA4q9KCsGw%40mail.gmail.com > > -- > Best regards, > Aleksander Alekseev >
Re: Re: [HACKERS] make async slave to wait for lsn to be replayed
Hi Ivan, On 3/6/18 9:25 PM, Michael Paquier wrote: > On Tue, Mar 06, 2018 at 02:24:24PM +0300, Ivan Kartyshov wrote: >> Hello, I now is preparing the patch over syntax that Simon offered. And in >> few day I will update the patch. >> Thank you for your interest in thread. > > It has been more than one month since a patch update has been requested, > and time is growing short. This refactored patch introduces a whole new > concept as well, so my recommendation would be to mark this patch as > returned with feedback, and then review it freshly for v12 if this > concept is still alive and around. This patch wasn't updated at the beginning of the CF and still hasn't been updated after almost two weeks. I have marked the patch Returned with Feedback. Please resubmit to a new CF when you have an updated patch. Regards, -- -David da...@pgmasters.net
Re: [HACKERS] GSoC 2017: weekly progress reports (week 4) and patch for hash index
On Mon, Mar 12, 2018 at 7:18 PM, Amit Kapila wrote: > On Mon, Mar 12, 2018 at 12:18 AM, Alexander Korotkov > wrote: >> On Sat, Mar 3, 2018 at 2:53 PM, Amit Kapila wrote: >>> >>> On Fri, Mar 2, 2018 at 9:27 AM, Thomas Munro wrote: >>> > If that is indeed a race, could it be fixed by >>> > calling PredicateLockPageSplit() at the start of _hash_splitbucket() >>> > instead? >>> > >>> >>> Yes, but I think it would be better if we call this once we are sure >>> that at least one tuple from the old bucket has been transferred >>> (consider if all tuples in the old bucket are dead). >> >> >> Is it really fair? For example, a predicate lock can be held by a session >> which queried some key, but didn't find any corresponding tuple. >> Imagine this key should be in the new bucket while all existing >> tuples would be left in the old bucket. As I understand it, in this case no locks >> would be transferred since no tuples were moved to the new bucket. >> So, further insertion into the new bucket wouldn't conflict with the session >> which looked for the non-existing key, while it should. Does it make sense? >> > > Valid point, I think on split we should always transfer locks from the old > bucket to the new bucket. > Attached patch changes it as per the above suggestion. -- With Regards, Amit Kapila. EnterpriseDB: http://www.enterprisedb.com Predicate-Locking-in-hash-index_v8.patch Description: Binary data
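Alexander's scenario — a reader that predicate-locked the old bucket while searching for a key that did not exist yet — can be modeled abstractly. This toy model (sets standing in for buckets and SIREAD locks, nothing like the real SSI code) shows why transferring locks only when tuples actually move loses the conflict:

```python
def split_bucket(tuples, locks, old, new, moves_to_new, transfer_always):
    """Toy bucket split: relocate tuples, optionally transfer predicate locks."""
    moved = {t for t in tuples[old] if moves_to_new(t)}
    tuples[old] -= moved
    tuples[new] |= moved
    if transfer_always or moved:
        locks[new] |= locks[old]

# The old bucket holds no tuples for key 'k', but a reader searched for 'k'
# there and left a predicate lock; after the split, 'k' routes to bucket 1.
tuples = {0: set(), 1: set()}
locks = {0: {"reader"}, 1: set()}
split_bucket(tuples, locks, 0, 1, lambda t: True, transfer_always=False)
assert "reader" not in locks[1]   # a later insert of 'k' misses the conflict

locks = {0: {"reader"}, 1: set()}
split_bucket(tuples, locks, 0, 1, lambda t: True, transfer_always=True)
assert "reader" in locks[1]       # unconditional transfer preserves it
```

Hence the conclusion in the thread: transfer locks from the old bucket to the new one unconditionally, not only when at least one tuple moved.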
Re: PATCH: Unlogged tables re-initialization tests
Thanks for reviewing, Peter. On 3/9/18 5:23 PM, Peter Eisentraut wrote: > This seems like a useful test. > > On 3/5/18 12:35, David Steele wrote: >> +mkdir($tablespaceDir) >> +or die "unable to mkdir \"$tablespaceDir\""; > > Use BAIL_OUT instead of die in tests. Done. >> +$ts1UnloggedPath = $node->safe_psql('postgres', >> +q{select pg_relation_filepath('ts1_unlogged')}); > > strange indentation Sure is. Fixed. >> +# Write forks to test that they are removed during recovery >> +$node->command_ok(['touch', "$pgdata/${baseUnloggedPath}_vm"], >> +'touch vm fork in base'); >> +$node->command_ok(['touch', "$pgdata/${baseUnloggedPath}_fsm"], >> +'touch fsm fork in base'); > > These are not tests, just some prep work. So they should not use > command_ok. Removed command_ok(). > It would probably also be better to avoid the Unix-specific touch > command and instead use Perl code to open and write to the files. Updated to use append_to_file(). >> +# Check unlogged table in base >> +ok(-f "$pgdata/${baseUnloggedPath}_init", 'init fork in base'); >> +ok(-f "$pgdata/$baseUnloggedPath", 'main fork in base'); >> +ok(!-f "$pgdata/${baseUnloggedPath}_vm", 'vm fork not in base'); >> +ok(!-f "$pgdata/${baseUnloggedPath}_fsm", 'fsm fork not in base'); > > These test names could be a bit more verbose and distinct, for example, > 'main fork was recreated after restart'. Done. A new patch is attached. Thanks, -- -David da...@pgmasters.net >From 6e082e9db8f401d986a03d02023173e37063996b Mon Sep 17 00:00:00 2001 From: David Steele Date: Tue, 13 Mar 2018 10:06:09 -0400 Subject: [PATCH 1/1] Add regression tests for reinit.c. These were originally written for a patch that refactored reinit.c. That refactor ended up not being needed, but it seems like a good idea to add the tests.
--- src/test/recovery/t/014_unlogged_reinit.pl | 97 ++ 1 file changed, 97 insertions(+) create mode 100644 src/test/recovery/t/014_unlogged_reinit.pl diff --git a/src/test/recovery/t/014_unlogged_reinit.pl b/src/test/recovery/t/014_unlogged_reinit.pl new file mode 100644 index 00..e87c805e46 --- /dev/null +++ b/src/test/recovery/t/014_unlogged_reinit.pl @@ -0,0 +1,97 @@ +# Tests that unlogged tables are properly reinitialized after a crash. +# +# The behavior should be the same when restoring from a backup but that is not +# tested here. +use strict; +use warnings; +use PostgresNode; +use TestLib; +use Test::More tests => 12; + +# Initialize node without replication settings +my $node = get_new_node('main'); + +$node->init; +$node->start; +my $pgdata = $node->data_dir; + +# Create an unlogged table to test that forks other than init are not copied +$node->safe_psql('postgres', 'CREATE UNLOGGED TABLE base_unlogged (id int)'); + +my $baseUnloggedPath = $node->safe_psql('postgres', + q{select pg_relation_filepath('base_unlogged')}); + +# Make sure main and init forks exist +ok(-f "$pgdata/${baseUnloggedPath}_init", 'init fork in base'); +ok(-f "$pgdata/$baseUnloggedPath", 'main fork in base'); + +# Create unlogged tables in a tablespace +my $tablespaceDir = undef; +my $ts1UnloggedPath = undef; + +$tablespaceDir = TestLib::tempdir . 
"/ts1"; + +mkdir($tablespaceDir) + or BAIL_OUT("unable to mkdir '$tablespaceDir'"); + +$node->safe_psql('postgres', + "CREATE TABLESPACE ts1 LOCATION '$tablespaceDir'"); +$node->safe_psql('postgres', + 'CREATE UNLOGGED TABLE ts1_unlogged (id int) TABLESPACE ts1'); + +$ts1UnloggedPath = $node->safe_psql('postgres', + q{select pg_relation_filepath('ts1_unlogged')}); + +# Make sure main and init forks exist +ok(-f "$pgdata/${ts1UnloggedPath}_init", 'init fork in tablespace'); +ok(-f "$pgdata/$ts1UnloggedPath", 'main fork in tablespace'); + +# Crash the postmaster +$node->stop('immediate'); + +# Write forks to test that they are removed during recovery +append_to_file("$pgdata/${baseUnloggedPath}_vm", 'TEST_VM'); +append_to_file("$pgdata/${baseUnloggedPath}_fsm", 'TEST_FSM'); + +# Remove main fork to test that it is recopied from init +unlink("$pgdata/${baseUnloggedPath}") + or BAIL_OUT("unable to remove '${baseUnloggedPath}'"); + +# Write forks to test that they are removed by recovery +append_to_file("$pgdata/${ts1UnloggedPath}_vm", 'TEST_VM'); +append_to_file("$pgdata/${ts1UnloggedPath}_fsm", 'TEST_FSM'); + +# Remove main fork to test that it is recopied from init +unlink("$pgdata/${ts1UnloggedPath}") + or BAIL_OUT("unable to remove '${ts1UnloggedPath}'"); + +# Start the postmaster +$node->start; + +# Check unlogged table in base +ok(-f "$pgdata/${baseUnloggedPath}_init", 'init fork still exists in base'); +ok(-f "$pgdata/$baseUnloggedPath", 'main fork in base recreated at startup'); +ok(!-f "$pgdata/${baseUnloggedPath}_vm", 'vm fork in base removed at startup'); +ok(!-f "$pgdata/${baseUnloggedPath}_fsm", + 'fsm fork in base removed at startup'); + +# Drop unlogged table
Re: Fix error in ECPG while connection handling
> Now, ideally the connection would have been null here, but, as the > 'ClosePortalStmt' > rule freed the connection but did not set it to NULL, it still sees > that there > is a connection (which actually contains garbage) and throws an > error. Thanks for spotting and fixing. I will push the patch as soon as I'm online again. Michael -- Michael Meskes Michael at Fam-Meskes dot De, Michael at Meskes dot (De|Com|Net|Org) Meskes at (Debian|Postgresql) dot Org Jabber: michael at xmpp dot meskes dot org VfL Borussia! Força Barça! SF 49ers! Use Debian GNU/Linux, PostgreSQL
Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
Alexander Korotkov wrote: > And what happens if somebody concurrently sets (fastupdate = on)? > Can we miss conflicts because of that? I think it'd be better to have that option require an AccessExclusive lock, so that it can never be changed concurrently with readers. Seems to me that penalizing every single read to cope with this case would be a bad trade-off. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: INOUT parameters in procedures
On 3/8/18 02:25, Pavel Stehule wrote: > It looks like some error in this concept. The rules for overloading > procedures should be modified, so this collision cannot > happen. > > When I use a procedure from PL/pgSQL, then it is clear: I place variables in > the *OUT positions. But when I call a procedure from the top level, I'll > pass fake parameters to get some result. What we'll probably want to do here is to make the OUT parameters part of the identity signature of procedures, unlike in functions. This should be a straightforward change, but it will require some legwork in many parts of the code. > if (argmodes && (argmodes[i] == PROARGMODE_INOUT || > argmodes[i] == PROARGMODE_OUT)) > + { > + Param *param; > > Because PROARGMODE_OUT is disallowed, this check is a little bit > messy. Please, add some comment. Fixed. I discovered another issue, in LANGUAGE SQL procedures. Currently, if you make a CALL with an INOUT parameter in an SQL procedure, the output is thrown away (unless it's the last command). I would like to keep open the option of assigning the results by name, like we do in PL/pgSQL. So in this patch I have made a change to prohibit calling procedures with INOUT parameters in LANGUAGE SQL routines (see check_sql_fn_statements()). What do you think? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services From 5b9f1506e73826f4f6ff567e54b12c4e232a4263 Mon Sep 17 00:00:00 2001 From: Peter Eisentraut Date: Mon, 12 Mar 2018 21:39:26 -0400 Subject: [PATCH v4] Support INOUT parameters in procedures In a top-level CALL, the values of INOUT parameters will be returned as a result row. In PL/pgSQL, the values are assigned back to the input parameters. In other languages, the same convention as for returning a record from a function is used. That does not require any code changes in the PL implementations.
Reviewed-by: Pavel Stehule --- doc/src/sgml/plperl.sgml | 14 +++ doc/src/sgml/plpgsql.sgml | 16 +++ doc/src/sgml/plpython.sgml | 11 ++ doc/src/sgml/pltcl.sgml| 12 ++ doc/src/sgml/ref/create_procedure.sgml | 7 +- src/backend/catalog/pg_proc.c | 4 +- src/backend/commands/functioncmds.c| 51 +++-- src/backend/executor/functions.c | 51 + src/backend/tcop/utility.c | 3 +- src/backend/utils/fmgr/funcapi.c | 11 +- src/include/commands/defrem.h | 3 +- src/include/executor/functions.h | 2 + src/include/funcapi.h | 3 +- src/pl/plperl/expected/plperl_call.out | 25 + src/pl/plperl/sql/plperl_call.sql | 22 src/pl/plpgsql/src/expected/plpgsql_call.out | 89 +++ .../plpgsql/src/expected/plpgsql_transaction.out | 2 +- src/pl/plpgsql/src/pl_comp.c | 10 +- src/pl/plpgsql/src/pl_exec.c | 125 - src/pl/plpgsql/src/pl_funcs.c | 25 + src/pl/plpgsql/src/pl_gram.y | 38 +-- src/pl/plpgsql/src/pl_scanner.c| 1 + src/pl/plpgsql/src/plpgsql.h | 12 ++ src/pl/plpgsql/src/sql/plpgsql_call.sql| 108 ++ src/pl/plpython/expected/plpython_call.out | 23 src/pl/plpython/plpy_exec.c| 24 ++-- src/pl/plpython/sql/plpython_call.sql | 20 src/pl/tcl/expected/pltcl_call.out | 26 + src/pl/tcl/sql/pltcl_call.sql | 23 src/test/regress/expected/create_procedure.out | 21 src/test/regress/sql/create_procedure.sql | 19 31 files changed, 752 insertions(+), 49 deletions(-) diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml index cff7a847de..9295c03db9 100644 --- a/doc/src/sgml/plperl.sgml +++ b/doc/src/sgml/plperl.sgml @@ -278,6 +278,20 @@ PL/Perl Functions and Arguments hash will be returned as null values. + + Similarly, output parameters of procedures can be returned as a hash + reference: + + +CREATE PROCEDURE perl_triple(INOUT a integer, INOUT b integer) AS $$ +my ($a, $b) = @_; +return {a = $a * 3, b = $b * 3}; +$$ LANGUAGE plperl; + +CALL perl_triple(5, 10); + + + PL/Perl functions can also return sets of either scalar or composite types. 
Usually you'll want to return rows one at a diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml index c1e3c6a19d..6c25116538 100644 --- a/doc/src/sgml/plpgsql.sgml +++ b/doc/src/sgml/plpgsql.sgml @@ -1870,6 +1870,22 @@ Returning From a
Re: Google Summer of Code: Potential Applicant
Hello Christos, > Do you have any project related insights as to what I should put in > there? Nope :) I believe Andrey Borodin and Atri Sharma have (added to CC). -- Best regards, Aleksander Alekseev signature.asc Description: PGP signature
Re: Using base backup exclusion filters to reduce data transferred with pg_rewind
On Tue, Mar 13, 2018 at 01:16:27PM +0300, Anastasia Lubennikova wrote: > Since these patches contain mostly cosmetic changes and do not break > anything, I think it's fine to mark this thread as Ready For Committer > without long discussion. Thanks, Anastasia, for the review. The refactoring is quite intuitive I think, though perhaps a bit too intrusive. Extra opinions on that are welcome. Rereading the patch set, please note that I am not yet happy about the handling of temporary files when applying the filters in pg_rewind, nor about the inconsistency between the file filters and the directory filters... -- Michael signature.asc Description: PGP signature
Re: SQL/JSON: functions
On Tue, Mar 13, 2018 at 2:04 PM, Simon Riggs wrote: > On 7 March 2018 at 14:34, Nikita Glukhov wrote: > >> Attached is the 12th version of the SQL/JSON patches, rebased onto the latest master. > > Please write some docs or notes to go with this. > > If you drop a big pile of code with no explanation it will just be ignored. > > I think many people want SQL/JSON, but the purpose of a patch > submission is to allow a committer to review and commit without > needing to edit anything. It shouldn't be like assembling flat pack > furniture while wearing a blindfold. The docs are here: https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md It's not easy to write docs for SQL/JSON in XML, so I decided to write them in a more friendly way. We'll have time to convert them to the Postgres doc format. > > -- > Simon Riggs http://www.2ndQuadrant.com/ > PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: [HACKERS] Partition-wise aggregation/grouping
Hi, I have resolved all the comments/issues reported in this new patch-set. The changes done by Ashutosh Bapat to split out create_grouping_paths() into separate functions, so that the partitionwise aggregate code can use them, were based on my partitionwise aggregate changes. Those were essentially refactoring changes. Thus, I have refactored them separately and before any partitionwise changes (see 0005-Split-create_grouping_paths-and-Add-GroupPathExtraDa.patch), and then re-based all partitionwise changes over it, including all fixes. The patch-set is complete now. But there is still scope for some comment improvements due to all these refactorings; I will work on that. Also, some documentation and indentation updates are needed. I will post those changes in the next patch-set, but meanwhile this patch-set is ready for review. On Tue, Mar 13, 2018 at 9:12 AM, Ashutosh Bapat < ashutosh.ba...@enterprisedb.com> wrote: > On Mon, Mar 12, 2018 at 7:49 PM, Jeevan Chalke > wrote: > > > > > > On Mon, Mar 12, 2018 at 6:07 PM, Ashutosh Bapat > > wrote: > >> > >> On Fri, Mar 9, 2018 at 4:21 PM, Ashutosh Bapat > >> > We don't need UpperRelationKind member in that structure. That will be > provided by the RelOptInfo passed. > > The problem here is that the extra information required for grouping is not > going to be the same as that needed for window aggregates, and > certainly not for ordering. If we try to jam everything into the same > structure, it will become large, with many members useless for a given > operation. A reader will not have an idea about which of them are > useful and which of them are not. So, instead we should try some > polymorphism. I think we can pass a void * to GetForeignUpperPaths() > and the corresponding FDW hook knows what to cast it to based on the > UpperRelationKind passed. > Yep. Done this way. > > BTW, the patch has added an argument to GetForeignUpperPaths() but has > not documented the change in API. 
If we go the route of polymorphism, > we will need to document the mapping between UpperRelationKind and the > type of structure passed in. > Oops. Will do that in the next patch-set. Thanks for pointing it out; I missed this in the first place. -- Jeevan Chalke Technical Architect, Product Development EnterpriseDB Corporation The Enterprise PostgreSQL Company partition-wise-agg-v17.tar.gz Description: application/gzip
Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)
> On 13 Mar 2018, at 17:02, Alexander Korotkov wrote: > > BTW to BTW. I think we should check the pending list size with > GinGetPendingListCleanupSize() here > + > + /* > +* If fast update is enabled, we acquire a predicate lock on the > entire > +* relation as fast update postpones the insertion of tuples into > index > +* structure due to which we can't detect rw conflicts. > +*/ > + if (GinGetUseFastUpdate(ginstate->index)) > + PredicateLockRelation(ginstate->index, snapshot); > > Because we can do ALTER INDEX SET (fastupdate = off), but there will still > be a pending list. > > And what happens if somebody concurrently sets (fastupdate = on)? > Can we miss conflicts because of that? No, AccessExclusiveLock will prevent this kind of problem with enabling fastupdate. Best regards, Andrey Borodin.
Re: JIT compiling with LLVM v11
On 3/12/18 17:04, Andres Freund wrote: > │ JIT: >│ > │ Functions: 4 >│ > │ Inlining: false >│ > │ Inlining Time: 0.000 >│ > │ Optimization: false >│ > │ Optimization Time: 5.023 >│ > │ Emission Time: 34.987 >│ The time quantities need some units. > │ Execution time: 46.277 ms >│ like this :) -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Configurable file mode mask
On 3/13/18 2:46 AM, Michael Paquier wrote: > On Mon, Mar 12, 2018 at 03:14:13PM -0400, Stephen Frost wrote: >> We already had a discussion about having a GUC for this and concluded, >> rightly in my view, that it's not sensible to have since we don't want >> all of the various tools having to read and parse out postgresql.conf. > > If the problem is parsing, it could as well be more portable to put that > in the control file, no? The current approach is based on early discussion of this patch, around [1] and [2] in particular. I proposed an enforcing GUC at that time but there wasn't any interest in the idea. I definitely think it's overkill to put a field in pg_control as that requires more tooling to update values. >> I don't see anything in the discussion which has changed that and I >> don't agree that there's an issue with using the privileges on the data >> directory for this- it's a simple solution which all of the tools can >> use and work with easily. I certainly don't agree that it's a serious >> issue to relax the explicit check- it's just a check, which a user could >> implement themselves if they wished to and had a concern for. On the >> other hand, with the explicit check, we are actively preventing an >> entirely reasonable goal of wanting to use a read-only role to perform a >> backup of the system. > > Well, one thing is that the current checks in the postmaster make sure > that a data folder is never using anything else than 0700. From a > security point of view, making it possible to allow a postmaster to > start with 0750 is a step backwards if users don't authorize it > explicitly. I would argue that changing the mode of PGDATA is explicit, even if it is accidental. To be clear, after a pg_upgrade the behavior of the cluster with respect to setting the mode would be exactly the same as now. The user would need to specify -g at initdb time or explicitly update PGDATA to 750 for group access to be enabled. 
> There are a lot of systems which use a bunch of users with > only single group with systemd. So this would remove an existing > safeguard. I am not against the idea of this thread, just that I think > that secured defaults should be user-enforceable if they want Postgres > to behave so. As Stephen notes, this can be enforced by the user if they want to, and without much effort (and with better tools). Regards, -- -David da...@pgmasters.net [1] https://www.postgresql.org/message-id/20526.1489428968%40sss.pgh.pa.us [2] https://www.postgresql.org/message-id/22248.1489431803%40sss.pgh.pa.us signature.asc Description: OpenPGP digital signature
Re: Transform for pl/perl
Hi. I have reviewed this patch too. Attached is a new version with a v8-v9 delta-patch. Here are my changes: * HV_ToJsonbValue(): - added missing hv_iterinit() - used hv_iternextsv() instead of hv_iternext(), HeSVKEY_force(), HeVAL() * SV_ToJsonbValue(): - added recursive dereferencing for all SV types - removed unnecessary JsonbValue heap-allocations * Jsonb_ToSV(): - added iteration to the end of the iterator, needed for correct freeing of JsonbIterators * passed JsonbParseState ** to XX_ToJsonbValue() functions. * fixed warnings (see below) * fixed comments (see below) Also, I am not sure if we need to use newRV() for returning SVs in Jsonb_ToSV() and JsonbValue_ToSV(). On 12.03.2018 18:06, Pavel Stehule wrote: 2018-03-12 9:08 GMT+01:00 Anthony Bykov: On Mon, 5 Mar 2018 14:03:37 +0100 Pavel Stehule wrote: > Hi > > I am looking at this patch. I found a few issues: > > 1. compile warning > > I../../src/include -D_GNU_SOURCE -I/usr/include/libxml2 > -I/usr/lib64/perl5/CORE -c -o jsonb_plperl.o jsonb_plperl.c > jsonb_plperl.c: In function ‘SV_FromJsonbValue’: > jsonb_plperl.c:69:9: warning: ‘result’ may be used uninitialized in > this function [-Wmaybe-uninitialized] > return result; > ^~ > jsonb_plperl.c: In function ‘SV_FromJsonb’: > jsonb_plperl.c:142:9: warning: ‘result’ may be used uninitialized in > this function [-Wmaybe-uninitialized] > return result; > ^~ Hello, thanks for reviewing my patch! I really appreciate it. Those warnings are there on purpose - I was trying to prevent the case where new types are added into pl/perl but new transform logic is not. Maybe there is a better way to do it, but right now I'll just add the "default: pg_unreachable" logic. pg_unreachable() is replaced with elog(ERROR) for reporting impossible JsonbValue types and JsonbIteratorTokens. > 3. Do we need two identical tests for PLPerl and PLPerlu? Why? > > Regards > > Pavel About point 3 - I thought that plperlu and plperl use different interpreters. 
And if they act identically on the same examples, there is indeed no need for identical tests for them. plperlu and plperl use the same interpreter, so the duplicate tests make no sense. But in the last versions the duplicate tests are back. I have not removed the duplicate test yet, because I am not sure that this test makes no sense at all. The naming convention of the functions is not consistent: almost all are src_to_dest. This one is different and it is a little bit messy: +static SV * +SV_FromJsonb(JsonbContainer *jsonb) Renamed to Jsonb_ToSV() and JsonbValue_ToSV(). This comment is broken: +/* + * plperl_to_jsonb(SV *in) + * + * Transform Jsonb into SV ---< should be SV to Jsonb + */ +PG_FUNCTION_INFO_V1(plperl_to_jsonb); +Datum Fixed. -- Nikita Glukhov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company diff --git a/contrib/Makefile b/contrib/Makefile index 8046ca4..53d44fe 100644 --- a/contrib/Makefile +++ b/contrib/Makefile @@ -75,9 +75,9 @@ ALWAYS_SUBDIRS += sepgsql endif ifeq ($(with_perl),yes) -SUBDIRS += hstore_plperl +SUBDIRS += hstore_plperl jsonb_plperl else -ALWAYS_SUBDIRS += hstore_plperl +ALWAYS_SUBDIRS += hstore_plperl jsonb_plperl endif ifeq ($(with_python),yes) diff --git a/contrib/jsonb_plperl/Makefile b/contrib/jsonb_plperl/Makefile new file mode 100644 index 000..cd86553 --- /dev/null +++ b/contrib/jsonb_plperl/Makefile @@ -0,0 +1,40 @@ +# contrib/jsonb_plperl/Makefile + +MODULE_big = jsonb_plperl +OBJS = jsonb_plperl.o $(WIN32RES) +PGFILEDESC = "jsonb_plperl - jsonb transform for plperl" + +PG_CPPFLAGS = -I$(top_srcdir)/src/pl/plperl + +EXTENSION = jsonb_plperlu jsonb_plperl +DATA = jsonb_plperlu--1.0.sql jsonb_plperl--1.0.sql + +REGRESS = jsonb_plperl jsonb_plperl_relocatability jsonb_plperlu jsonb_plperlu_relocatability + +ifdef USE_PGXS +PG_CONFIG = pg_config +PGXS := $(shell $(PG_CONFIG) --pgxs) +include $(PGXS) +else +subdir = contrib/jsonb_plperl +top_builddir = ../.. 
+include $(top_builddir)/src/Makefile.global +include $(top_srcdir)/contrib/contrib-global.mk +endif + +# We must link libperl explicitly +ifeq ($(PORTNAME), win32) +# these settings are the same as for plperl +override CPPFLAGS += -DPLPERL_HAVE_UID_GID -Wno-comment +# ... see silliness in plperl Makefile ... +SHLIB_LINK += $(sort $(wildcard ../../src/pl/plperl/libperl*.a)) +else +rpathdir = $(perl_archlibexp)/CORE +SHLIB_LINK += $(perl_embed_ldflags) +endif + +# As with plperl we need to make sure that the CORE directory is included +# last, probably because it sometimes contains some header files with names +# that clash with some of ours, or with some that we include, notably on +# Windows.
Re: PATCH: Configurable file mode mask
Stephen Frost writes: > * Michael Paquier (mich...@paquier.xyz) wrote: >> If the problem is parsing, it could as well be more portable to put that >> in the control file, no? > Then we'd need a tool to allow changing it for people who do want to > change it. There's no reason we should have to have an extra tool for > this- an administrator who chooses to change the privileges on the data > folder should be able to do so and I don't think anyone is going to > thank us for requiring them to use some special tool to do so for > PostgreSQL. FWIW, I took a quick look through this patch and don't have any problem with the approach, which appears to be "use the data directory's startup-time permissions as the status indicator". I am kinda wondering about this though: +(These files can confuse pg_ctl.) If group read +access is enabled on the data directory and an unprivileged user in the +PostgreSQL group is performing the backup, then +postmaster.pid will not be readable and must be +excluded. If we're allowing group read on the data directory, why should that not include postmaster.pid? There's nothing terribly security-relevant in there, and being able to find out the postmaster PID seems useful. I can see the point of restricting server key files, as documented elsewhere, but it's possible to keep those somewhere else so they don't cause problems for simple backup software. Also, in general, this patch desperately needs a round of copy-editing, particularly with attention to the comments. For instance, there are a minimum of three grammatical errors in this one comment: + * Group read/execute may optional be enabled on PGDATA so any frontend tools + * That write into PGDATA must know what mask to set and the permissions to + * use for creating files and directories. In a larger sense, this fails to explain the operating principle, namely what I stated above, that we allow the existing permissions on PGDATA to decide what we allow as group permissions. 
If you don't already understand that, you're going to have a hard time extracting it from either file_perm.h or file_perm.c. (The latter fails to even explain what the argument of DataDirectoryMask is.) regards, tom lane
Re: Google Summer of Code: Potential Applicant
Greetings, * Aleksander Alekseev (a.aleks...@postgrespro.ru) wrote: > Craig, I believe you probably did something wrong if you had to work > with some library directly. Actually you generate classes from text > description and just use them. I worked with Thrift some time ago, in > 2015 [1]. I wouldn't call it awkward. Protobuf is fine too, but > unfortunately we don't have any Protobuf-related projects this time. Just to be clear, the list on the wiki is just a set of suggestions- students are welcome to propose their own projects as well. > Also it's probably worth noticing that the GSoC project doesn't imply > using any existing libraries, only the binary format which is quite > stable. A student proposal should really include information about what other libraries, if any, are being considered for the project as that will play into the consideration as to if it's something we would be interested in including in PG or not. Thanks! Stephen signature.asc Description: PGP signature
Re: PATCH: Configurable file mode mask
Michael, * Michael Paquier (mich...@paquier.xyz) wrote: > On Mon, Mar 12, 2018 at 03:14:13PM -0400, Stephen Frost wrote: > > We already had a discussion about having a GUC for this and concluded, > > rightly in my view, that it's not sensible to have since we don't want > > all of the various tools having to read and parse out postgresql.conf. > > If the problem is parsing, it could as well be more portable to put that > in the control file, no? I have, for example, finished > implementing my own flavor of pg_controldata to save parsing effort, > as it prints control file fields on a per-call basis using individual > fields; this also avoids games with locales, as translations of > pg_controldata can disturb the parsing logic. Then we'd need a tool to allow changing it for people who do want to change it. There's no reason we should have to have an extra tool for this- an administrator who chooses to change the privileges on the data folder should be able to do so and I don't think anyone is going to thank us for requiring them to use some special tool to do so for PostgreSQL. > > I don't see anything in the discussion which has changed that and I > > don't agree that there's an issue with using the privileges on the data > > directory for this- it's a simple solution which all of the tools can > > use and work with easily. I certainly don't agree that it's a serious > > issue to relax the explicit check- it's just a check, which a user could > > implement themselves if they wished to and had a concern for. On the > > other hand, with the explicit check, we are actively preventing an > > entirely reasonable goal of wanting to use a read-only role to perform a > > backup of the system. > > Well, one thing is that the current checks in the postmaster make sure > that a data folder is never using anything else than 0700. 
From a > security point of view, making it possible to allow a postmaster to > start with 0750 is a step backwards if users don't authorize it > explicitly. There are a lot of systems which use a bunch of users with > only a single group with systemd. So this would remove an existing > safeguard. I am not against the idea of this thread, just that I think > that secured defaults should be user-enforceable if they want Postgres > to behave so. I'm aware of what the current one-time check in the postmaster does, and that we don't implement it on all platforms, making me seriously doubt that the level of concern being raised here makes sense. Should we consider it a security issue that the Windows builds don't perform this check, and never have? Further, if the permissions are changed without authorization, it's probably done while the database is running and unlikely to be noticed for days, weeks, or longer, if the administrator is depending on PG to let them know of the change. Considering that the only user who can change the privileges is a database owner or root, it seems even less likely to help (why would an attacker change those privileges when they already have full access?). Lastly, the user *is* able to enforce the privileges on the data directory if they wish to, using tools such as tripwire which are built specifically to provide such checks and to do so regularly instead of the extremely ad-hoc check provided by PG. PostgreSQL should, and does, secure the data directory when it's created by initdb, and subsequent files and directories are similarly secured appropriately. Ultimately, the default which makes sense here isn't one-size-fits-all but is system dependent, and the administrator should be able to choose what permissions they want to have. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw
Greetings, * Ashutosh Bapat (ashutosh.ba...@enterprisedb.com) wrote: > On Mon, Mar 12, 2018 at 1:25 PM, Etsuro Fujita >wrote: > > (2018/03/09 20:55), Etsuro Fujita wrote: > >> (2018/03/08 14:24), Ashutosh Bapat wrote: > >>> For local constraints to be enforced, we use remote > >>> constraints. For local WCO we need to use remote WCO. That means we > >>> create many foreign tables pointing to same local table on the foreign > >>> server through many views, but it's not impossible. > >> > >> Maybe I don't understand this correctly, but I guess that it would be > >> the user's responsibility to not create foreign tables in such a way. > > > > I think I misunderstood your words. Sorry for that. I think what you > > proposed would be a solution for this issue, but I'm not sure that's a good > > one because that wouldn't work for the data sources that don't support views > > with WCO options. > > Our solution for the constraints doesn't work with the data sources > (like flat files) which don't support constraints. So, that argument > doesn't help. It would really help to have some examples of exactly what is being proposed here wrt solutions. WCO is defined at a view level, so I'm not following the notion that we're going to depend on something remote to enforce the WCO when the remote object is just a regular table that you can't define a WCO on top of. That's not the case when we're talking about foreign tables vs. local tables, so it's not the same. I certainly don't think we should require a remote view to exist to perform the WCO check. If the remote WCO is a view itself then I would expect it to operate in the same manner as WCO on local views does- you can have them defined as being cascaded or not. In other words, there is no case where we have a "foreign view." Views are always local. 
A "foreign table" could actually be a view, in which case we treat it as a table in the local database anyway, but WCO doesn't come up in that case at all- there's no way to define WCO on a table, foreign or not. If WCO is defined on the view on the remote server, then it should operate properly and not require anything from the local side. Thanks! Stephen signature.asc Description: PGP signature
Re: [HACKERS] Proposal: generic WAL compression
On 02/07/2018 09:02 PM, Markus Nullmeier wrote: One general comment I can already make is that enabling compression should be made optional, which appears to be a small and easy addition to the generic WAL API. The new version of the patch is attached. In order to control generic WAL compression, the new structure PageXLogCompressParams is introduced. It is passed as an additional parameter to GenericXLogRegisterBufferEx, so the access method can control its own WAL compression. GenericXLogRegisterBuffer uses default compression settings, which appear to be a reasonable tradeoff between performance overhead and compression rate on RUM. On my HDD, PostgreSQL even works up to 10% faster for some RUM workloads because of the reduced amount of generic WAL to be written. Oleg Ivanov Postgres Professional The Russian PostgreSQL Company diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c index ce02354..fa3cd4e 100644 --- a/src/backend/access/transam/generic_xlog.c +++ b/src/backend/access/transam/generic_xlog.c @@ -43,19 +43,41 @@ * a full page's worth of data. 
*- */ + +#define MAX_ALIGN_MISMATCHES 255 +/* MAX_ALIGN_MISMATCHES is not supposed to be greater than PG_UINT8_MAX */ +#if MAX_ALIGN_MISMATCHES > PG_UINT8_MAX +#error "MAX_ALIGN_MISMATCHES must be not greater than PG_UINT8_MAX" +#endif + #define FRAGMENT_HEADER_SIZE (2 * sizeof(OffsetNumber)) +#define REGION_HEADER_SIZE (sizeof(char) + sizeof(int)) +#define DIFF_DELTA_HEADER_SIZE (sizeof(char) + 2 * sizeof(OffsetNumber)) +#define MISMATCH_HEADER_SIZE (sizeof(char) + sizeof(uint8) + \ + sizeof(OffsetNumber)) #define MATCH_THRESHOLD FRAGMENT_HEADER_SIZE -#define MAX_DELTA_SIZE (BLCKSZ + 2 * FRAGMENT_HEADER_SIZE) +#define MAX_DELTA_SIZE (BLCKSZ + \ + 2 * REGION_HEADER_SIZE + \ + 2 * FRAGMENT_HEADER_SIZE + \ + 2 * DIFF_DELTA_HEADER_SIZE + \ + MAX_ALIGN_MISMATCHES * MISMATCH_HEADER_SIZE \ + + sizeof(bool)) + +#define writeToPtr(ptr, x) memcpy(ptr, &(x), sizeof(x)), ptr += sizeof(x) +#define readFromPtr(ptr, x) memcpy(&(x), ptr, sizeof(x)), ptr += sizeof(x) /* Struct of generic xlog data for single page */ typedef struct { Buffer buffer; /* registered buffer */ int flags; /* flags for this buffer */ + int deltaLen; /* space consumed in delta field */ char *image; /* copy of page image for modification, do not * do it in-place to have aligned memory chunk */ char delta[MAX_DELTA_SIZE]; /* delta between page images */ + + PageXLogCompressParams compressParams; /* compress parameters */ } PageData; /* State of generic xlog record construction */ @@ -71,15 +93,77 @@ struct GenericXLogState bool isLogged; }; +/* Describes for the region which type of delta is used in it */ +typedef enum +{ + DIFF_DELTA = 0,/* diff delta with insert, remove and replace + * operations */ + BASE_DELTA = 1/* base delta with update operations only */ +} DeltaType; + +/* Diff delta operations for transforming current page to target page */ +typedef enum +{ + DIFF_INSERT = 0, + DIFF_REMOVE = 1, + DIFF_REPLACE = 2 +} DiffDeltaOperations; + +/* Describes the kind of region of the page */ +typedef enum 
+{ + UPPER_REGION = 0, + LOWER_REGION = 1 +} RegionType; + static void writeFragment(PageData *pageData, OffsetNumber offset, OffsetNumber len, const char *data); static void computeRegionDelta(PageData *pageData, const char *curpage, const char *targetpage, int targetStart, int targetEnd, - int validStart, int validEnd); + int validStart, int validEnd, + uint8 maxMismatches); static void computeDelta(PageData *pageData, Page curpage, Page targetpage); static void applyPageRedo(Page page, const char *delta, Size deltaSize); +static int alignRegions(const char *curRegion, const char *targetRegion, + int curRegionLen, int targetRegionLen, uint8 maxMismatches); +static int restoreCompactAlignment(const char *curRegion, + const char *targetRegion, + int curRegionLen, + int targetRegionLen, + int numMismatches); + +static bool computeRegionDiffDelta(PageData *pageData, + const char *curpage, const char *targetpage, + int targetStart, int targetEnd, + int validStart, int validEnd, + uint8 maxMismatches); +static const char *applyPageDiffRedo(Page page, const char *delta, Size deltaSize); + +static void computeRegionBaseDelta(PageData *pageData, + const char *curpage, const char *targetpage, + int targetStart, int targetEnd, + int validStart, int validEnd); +static const char *applyPageBaseRedo(Page page, const char *delta, Size deltaSize); + +static bool pageDataContainsDiffDelta(PageData *pageData); +static void downgradeDeltaToBaseFormat(PageData *pageData); + +/* Arrays for the alignment building and for the resulting alignments */
Re: Google Summer of Code: Potential Applicant
Thanks, Aleksander! > On 13 Mar 2018, at 19:03, Aleksander Alekseev wrote: > >> Do you have any project-related insights as to what I should put in >> there? > Christos, as far as I remember, a good proposal must have a schedule, implementation details, and deliverables. Also, it is good to show that you are capable of executing the project, for example by mentioning your previous projects (commercial, educational, or pet projects), achievements, etc. GSoC typically has 3 milestones, and each milestone should usually produce some viable results. There are exact dates, but here I'll put a sketch. Algorithms: June - implement introsort and timsort; July - design benchmarks, implement some other hashtables; August - if benchmarks are successful, propose a patch to the commitfest and review other patches from the commitfest, else implement more algorithms. amcheck: June - implement checks for GIN (like b-tree in b-tree, resembling the existing amcheck); July - checks for Hash, BRIN and SP-GiST; August - RUM, patch, commitfest, reviews. Best regards, Andrey Borodin.
Re: committing inside cursor loop
On 3/6/18 07:48, Ildus Kurbangaliev wrote: > Although as was discussed before it seems inconsistent without ROLLBACK > support. There was a little discussion about it, but no replies. Maybe > the status of the patch should be changed to 'Waiting on author' until > the end of discussion. I'm wondering what the semantics of it should be. For example, consider this: drop table test1; create table test1 (a int, b int); insert into test1 values (1, 11), (2, 22), (3, 33); do language plpgsql $$ declare x int; begin for x in update test1 set b = b + 1 where b > 20 returning a loop raise info 'x = %', x; if x = 2 then rollback; end if; end loop; end; $$; The ROLLBACK call in the first loop iteration undoes the UPDATE command that drives the loop. Is it then sensible to continue the loop? -- Peter Eisentraut http://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Configurable file mode mask
On 3/13/18 11:00 AM, Tom Lane wrote: > Stephen Frostwrites: >> * Michael Paquier (mich...@paquier.xyz) wrote: >>> If the problem is parsing, it could as well be more portable to put that >>> in the control file, no? > >> Then we'd need a tool to allow changing it for people who do want to >> change it. There's no reason we should have to have an extra tool for >> this- an administrator who chooses to change the privileges on the data >> folder should be able to do so and I don't think anyone is going to >> thank us for requiring them to use some special tool to do so for >> PostgreSQL. > > FWIW, I took a quick look through this patch and don't have any problem > with the approach, which appears to be "use the data directory's > startup-time permissions as the status indicator". I am kinda wondering > about this though: > > +(These files can confuse pg_ctl.) If group > read > +access is enabled on the data directory and an unprivileged user in the > +PostgreSQL group is performing the backup, > then > +postmaster.pid will not be readable and must be > +excluded. > > If we're allowing group read on the data directory, why should that not > include postmaster.pid? There's nothing terribly security-relevant in > there, and being able to find out the postmaster PID seems useful. I can > see the point of restricting server key files, as documented elsewhere, > but it's possible to keep those somewhere else so they don't cause > problems for simple backup software. I'm OK with that. I had a look at the warnings regarding the required mode of postmaster.pid in miscinit.c (889-911) and it seems to me they still hold true if the mode is 640 instead of 600. Do you agree, Tom? Stephen? If so, I'll make those changes. > Also, in general, this patch desperately needs a round of copy-editing, > particularly with attention to the comments. 
> For instance, there are a
> minimum of three grammatical errors in this one comment:
>
> + * Group read/execute may optional be enabled on PGDATA so any frontend tools
> + * That write into PGDATA must know what mask to set and the permissions to
> + * use for creating files and directories.
>
> In a larger sense, this fails to explain the operating principle,
> namely what I stated above, that we allow the existing permissions
> on PGDATA to decide what we allow as group permissions.  If you
> don't already understand that, you're going to have a hard time
> extracting it from either file_perm.h or file_perm.c.  (The latter
> fails to even explain what the argument of DataDirectoryMask is.)

I'll do comment/doc review for the next round of patches.

Thanks,
-- 
-David
da...@pgmasters.net
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
David Gould writes:
> I also thought about the theory and am confident that there really is no way
> to trick it. Basically if there are enough pages that are different to affect
> the overall density, say 10% empty or so, there is no way a random sample
> larger than a few hundred probes can miss them no matter how big the table is.
> If there are few enough pages to "hide" from the sample, then they are so few
> they don't matter anyway.

> After all this my vote is for back patching too. I don't see any case where
> the patched analyze is or could be worse than what we are doing. I'm happy to
> provide my test cases if anyone is interested.

Yeah, you have a point.  I'm still worried about unexpected side-effects,
but it seems like overall this is very unlikely to hurt anyone.  I'll
back-patch (minus the removal of the unneeded vac_estimate_reltuples
argument).

			regards, tom lane
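David's point about sample coverage can be illustrated with a quick back-of-the-envelope calculation (an illustrative sketch, not anything from the patch; the independent-draw approximation ignores that page sampling is without replacement, which only makes the miss probability smaller):

```python
# Probability that a random sample of n page probes contains none of a
# fraction f of "different" (e.g. empty) pages, using the independent-draw
# approximation (1 - f)^n.  Sampling without replacement only lowers this.
def miss_probability(f, n):
    return (1.0 - f) ** n

# With 10% divergent pages and "a few hundred probes", missing all of
# them is essentially impossible:
p = miss_probability(0.10, 300)   # roughly 2e-14
```

Conversely, if the divergent fraction is small enough that a few hundred probes might plausibly miss it, it is too small to move the overall density estimate much, which is the other half of the argument.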
Re: Additional Statistics Hooks
On Tue, Mar 13, 2018 at 6:56 AM, Ashutosh Bapat
<ashutosh.ba...@enterprisedb.com> wrote:
> On Tue, Mar 13, 2018 at 4:14 AM, Tom Lane wrote:
> > Mat Arye writes:
> >> So the use-case is an analytical query like
> >>
> >> SELECT date_trunc('hour', time) AS MetricMinuteTs, AVG(value) as avg
> >> FROM hyper
> >> WHERE time >= '2001-01-04T00:00:00' AND time <= '2001-01-05T01:00:00'
> >> GROUP BY MetricMinuteTs
> >> ORDER BY MetricMinuteTs DESC;
> >>
> >> Right now this query will choose a much-less-efficient GroupAggregate plan
> >> instead of a HashAggregate. It will choose this because it thinks the
> >> number of groups produced here is 9,000,000 because that's the number
> >> of distinct time values there are.
> >> But, because date_trunc "buckets" the values there will be about 24 groups
> >> (1 for each hour).
> >
> > While it would certainly be nice to have better behavior for that,
> > "add a hook so users who can write C can fix it by hand" doesn't seem
> > like a great solution.  On top of the sheer difficulty of writing a
> > hook function, you'd have the problem that no pre-written hook could
> > know about all available functions.  I think somehow we'd need a way
> > to add per-function knowledge, perhaps roughly like the protransform
> > feature.
>
> Like cost associated with a function, we may associate mapping
> cardinality with a function. It tells how many distinct input values
> map to 1 output value. By input value, I mean input argument tuple. In
> Mat's case the mapping cardinality will be 12. The number of distinct
> values that function may output is estimated as number of estimated
> rows / mapping cardinality of that function.

I think this is complicated by the fact that the mapping cardinality is
not a constant per function, but depends on the constant given as the
first argument to the function and on the granularity of the underlying
data (do you have second granularity or microsecond granularity?).
I actually think the logic for the estimate here should be
(max(time) - min(time)) / interval.  I think to be general you need to
allow functions on statistics to determine the estimate.

> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
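Mat's (max(time) - min(time)) / interval estimate, applied to the query upthread, can be sketched like this (illustrative Python, not anything in a patch; the +1 accounts for the bucket containing the range's endpoint):

```python
from datetime import datetime, timedelta

# Estimated number of date_trunc-style groups: buckets spanned by the
# column's range, i.e. (max - min) / bucket width, plus the final bucket.
def estimated_groups(tmin, tmax, bucket):
    return int((tmax - tmin) / bucket) + 1

# The WHERE clause upthread covers 25 hours, bucketed by hour:
n = estimated_groups(datetime(2001, 1, 4, 0, 0),
                     datetime(2001, 1, 5, 1, 0),
                     timedelta(hours=1))
# 26 hourly buckets -- a far cry from the 9,000,000 distinct raw values
```

This is exactly the kind of per-function knowledge the planner cannot derive today: it needs both the bucket width (the function's constant argument) and min/max statistics on the underlying column.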
Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.
David Gould writes:
> I have attached the patch we are currently using. It applies to 9.6.8. I
> have versions for older releases in 9.4, 9.5, 9.6. It fails to apply to 10,
> and presumably head, but I can update it if there is any interest.

> The patch has three main features:
> - Impose an ordering on the autovacuum workers worklist to avoid
>   the need for rechecking statistics to skip already vacuumed tables.
> - Reduce the frequency of statistics refreshes
> - Remove the AutovacuumScheduleLock

As per the earlier thread, the first aspect of that needs more work to
not get stuck when the worklist has long tasks near the end.  I don't
think you're going to get away with ignoring that concern.

Perhaps we could sort the worklist by decreasing table size?  That's not
an infallible guide to the amount of time that a worker will need to
spend, but it's surely safer than sorting by OID.

Alternatively, if we decrease the frequency of stats refreshes, how much
do we even need to worry about reordering the worklist?

In any case, I doubt anyone will have any appetite for back-patching
such a change.  I'd recommend that you clean up your patch and rebase to
HEAD, then submit it into the September commitfest (either on a new
thread or a continuation of the old #13750 thread, not this one).

			regards, tom lane
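Tom's suggestion amounts to ordering the worklist by a size key instead of OID; a minimal sketch to make the idea concrete (table names and sizes are made up, not from the patch):

```python
# Sort the autovacuum worklist by decreasing relation size, so that the
# longest-running jobs tend to start first instead of lurking at the end
# of an OID-ordered list.
worklist = [
    ("pgbench_branches", 8),          # size in pages (hypothetical)
    ("pgbench_accounts", 1_300_000),
    ("pgbench_tellers", 80),
]
worklist.sort(key=lambda rel: rel[1], reverse=True)
# biggest table first, smallest last
```

As Tom notes, size is only a proxy for vacuuming time (a large, mostly-clean table can finish quickly), but unlike OID order it at least correlates with the amount of work.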
Re: Additional Statistics Hooks
On Tue, Mar 13, 2018 at 6:31 AM, David Rowley wrote:
> On 13 March 2018 at 11:44, Tom Lane wrote:
> > While it would certainly be nice to have better behavior for that,
> > "add a hook so users who can write C can fix it by hand" doesn't seem
> > like a great solution.  On top of the sheer difficulty of writing a
> > hook function, you'd have the problem that no pre-written hook could
> > know about all available functions.  I think somehow we'd need a way
> > to add per-function knowledge, perhaps roughly like the protransform
> > feature.

I think this isn't either-or.  I think a general hook can be useful for
extensions that want to optimize particular data distributions/workloads
using domain knowledge about functions common for those workloads.  That
way users working with that data can use extensions to optimize
workloads without writing C themselves.

I also think a protransform-like feature would add a lot of power to the
native planner, but it could take a while to get into core properly and
may not handle all kinds of data distributions/cases.  An example of a
case a protransform-type system would not be able to optimize is
mathematical operator expressions, like bucketing integers by decile ---
(integer / 10) * 10.  This is somewhat analogous to date_trunc in the
integer space and would also change the number of resulting distinct
rows.

> I always imagined that extended statistics could be used for this.
> Right now the estimates are much better when you create an index on
> the function, but there's no real reason to limit the stats that are
> gathered to just plain columns + expression indexes.
>
> I believe I'm not the only person to have considered this. Originally
> extended statistics were named multivariate statistics. I think it was
> Dean and I (maybe others too) that suggested to Tomas to give the
> feature a more generic name so that it can be used for a more general
> purpose later.
I also think that the point about extended statistics is a good one, and
it points to the need for more experimentation/experience, which I think
a C hook is better suited for.  Putting in a hook will allow extension
writers like us to experiment and figure out the kinds of transforms on
statistics that are useful, while having a small footprint on core.

I think designing a protransform-like system would benefit from more
experience with the kinds of transformations that are useful.  For
example, can anything be done if the interval passed to date_trunc is
not constant, or is it not even worth bothering with that case?  Maybe
extended statistics are a better approach, etc.

> --
> David Rowley                  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Training & Services
Re: [HACKERS] [FEATURE PATCH] pg_stat_statements with plans (v02)
On Mon, Mar 12, 2018 at 02:11:42PM +0100, Julian Markwort wrote:
>> In same manner pg_stat_statements_good_plan_reset() and
>> pg_stat_statements_bad_plan_reset() functions can be combined in
>> function:
>> pg_stat_statements_plan_reset(in queryid bigint, in good boolean, in bad
>> boolean)
>
> This is also sensible, however if any more kinds of plans were added in
> the future, there would be a lot of flags in this function. I think having
> varying amounts of flags (between extension versions) as arguments to the
> function also makes any upgrade paths difficult. Thinking more about this,
> we could also use function overloading to avoid this.
> An alternative would be to have the caller pass the type of plan he wants
> to reset. Omitting the type would result in the deletion of all plans,
> maybe?
> pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)

Yes, it is a good idea.  But maybe instead of passing an empty string as
the plan type we should overload the function with only a queryid
argument.  I think that is a common practice in PostgreSQL.  Otherwise,
allowing an empty string as the plan type may confuse users.  So the
function declarations might be the following:

pg_stat_statements_plan_reset(in queryid bigint, in plan_type text)
pg_stat_statements_plan_reset(in queryid bigint)

+ interquartile_dist = 2.0*(0.6745 * sqrt(e->counters.sum_var_time / e->counters.calls));

I think it would be better to have defines for the 2.0 and 0.6745 values,
for the sake of readability.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
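For context on the magic numbers Arthur flags: 0.6745 is approximately the z-score of the 75th percentile of a standard normal distribution, so 2.0 * 0.6745 * stddev approximates the interquartile range under a normality assumption. A Python sketch mirroring the quoted C line (not the extension's actual code):

```python
from statistics import NormalDist

# z-score of the 75th percentile of the standard normal distribution;
# for N(mu, sigma) the interquartile range is about 2 * 0.6745 * sigma.
Z_THIRD_QUARTILE = NormalDist().inv_cdf(0.75)   # ~0.6745

def interquartile_dist(sum_var_time, calls):
    # sum_var_time / calls plays the role of the variance estimate,
    # matching e->counters.sum_var_time / e->counters.calls above.
    stddev = (sum_var_time / calls) ** 0.5
    return 2.0 * Z_THIRD_QUARTILE * stddev
```

Naming the constant after what it is (the third-quartile z-score) would address the readability concern better than a bare `#define MAGIC_0_6745`.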
Re: [WIP PATCH] Index scan offset optimisation using visibility map
Hello.  Tom, thanks a lot for your thorough review.

> What you've done to
> IndexNext() is a complete disaster from a modularity standpoint: it now
> knows all about the interactions between index_getnext, index_getnext_tid,
> and index_fetch_heap.

I was looking at the current IndexOnlyNext implementation as a starting
point - and it knows about index_getnext_tid and index_fetch_heap
already.  At the same time I was trying to keep the patch non-invasive.
The patched IndexNext now only knows about index_getnext_tid and
index_fetch_heap - the same as IndexOnlyNext.  But yes, it probably
could be done better.

> I'm not sure about a nicer way to refactor that, but I'd suggest that
> maybe you need an additional function in indexam.c that hides all this
> knowledge about the internal behavior of an IndexScanDesc.

I'll try to split index_getnext into two functions.  A new one will do
everything index_getnext does except index_fetch_heap.

> Or that is, it knows too much and still not enough,
> because it's flat out wrong for the case that xs_continue_hot is set.
> You can't call index_getnext_tid when that's still true from last time.

Oh.. Yes, a clear error here.

> The PredicateLockPage call also troubles me quite a bit, not only from
> a modularity standpoint but because that implies a somewhat-user-visible
> behavioral change when this optimization activates.  People who are using
> serializable mode are not going to find it to be an improvement if their
> queries fail and need retries a lot more often than they did before.
> I don't know if that problem is bad enough that we should disable skipping
> when serializable mode is active, but it's something to think about.

The current IndexOnlyScan already does that.  And I think a user should
expect such a change in serializable mode.
> You haven't documented the behavior required for tuple-skipping in any
> meaningful fashion, particularly not the expectation that the child plan
> node will still return tuples that just need not contain any valid
> content.

Only nodeLimit could receive such tuples, and they are immediately
discarded.  I'll add a comment about it.

> is a gross hack and probably wrong.  You could use ExecStoreAllNullTuple,
> perhaps.

Oh, nice, I missed it.

> I'm disturbed by the fact that you've not taught the planner about the
> potential cost saving from this, so that it won't have any particular
> reason to pick a regular indexscan over some other plan type when this
> optimization applies.  Maybe there's no practical way to do that, or maybe
> it wouldn't really matter in practice; I've not looked into it.  But not
> doing anything feels like a hack.

I was trying to do it, but the current planner architecture does not
provide a nice way to achieve it, due to the way it handles limit and
offset.  So I think it is better to leave that out for now.

> Setting this back to Waiting on Author.

I'll try to make the required changes in a few days.  Thanks.
Re: PATCH: Unlogged tables re-initialization tests
Dagfinn Ilmari Mannsåker wrote:

> $SIG{__DIE__} gets called even for exceptions that would be caught by a
> surrounding eval block, so this should at the very least be:
>
> $SIG{__DIE__} = sub { BAIL_OUT(@_) unless $^S };
>
> However, that is still wrong, because die() and BAIL_OUT() mean
> different things: die() aborts the current test script, while BAIL_OUT()
> aborts the entire 'prove' run, i.e. all subsequent scripts in the same
> test directory.

Sounds like 'die' is the correct thing, then, and that BAIL_OUT should
be called sparingly ... for example this one in PostgresNode::start
seems like an overreaction:

	BAIL_OUT("node \"$name\" is already running") if defined $self->{_pid};

Yes?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: PATCH: Configurable file mode mask
Greetings,

* David Steele (da...@pgmasters.net) wrote:
> On 3/13/18 11:00 AM, Tom Lane wrote:
> > Stephen Frost writes:
> >> * Michael Paquier (mich...@paquier.xyz) wrote:
> >>> If the problem is parsing, it could as well be more portable to put that
> >>> in the control file, no?
> >
> >> Then we'd need a tool to allow changing it for people who do want to
> >> change it.  There's no reason we should have to have an extra tool for
> >> this- an administrator who chooses to change the privileges on the data
> >> folder should be able to do so and I don't think anyone is going to
> >> thank us for requiring them to use some special tool to do so for
> >> PostgreSQL.
> >
> > FWIW, I took a quick look through this patch and don't have any problem
> > with the approach, which appears to be "use the data directory's
> > startup-time permissions as the status indicator".  I am kinda wondering
> > about this though:
> >
> > +    (These files can confuse pg_ctl.)  If group read
> > +    access is enabled on the data directory and an unprivileged user in the
> > +    PostgreSQL group is performing the backup, then
> > +    postmaster.pid will not be readable and must be
> > +    excluded.
> >
> > If we're allowing group read on the data directory, why should that not
> > include postmaster.pid?  There's nothing terribly security-relevant in
> > there, and being able to find out the postmaster PID seems useful.  I can
> > see the point of restricting server key files, as documented elsewhere,
> > but it's possible to keep those somewhere else so they don't cause
> > problems for simple backup software.
>
> I'm OK with that.  I had a look at the warnings regarding the required
> mode of postmaster.pid in miscinit.c (lines 889-911) and it seems to me
> they still hold true if the mode is 640 instead of 600.
>
> Do you agree, Tom?  Stephen?
>
> If so, I'll make those changes.
I agree that we can still consider an EPERM-error case as being OK even
with the changes proposed, and with the same-user check happening
earlier in checkDataDir(), we won't even get to the point of looking at
the pid file if the userids don't match.

The historical comment about the old datadir permissions can likely just
be removed, perhaps replaced with a bit more commentary above that check
in checkDataDir().  The open() call should also fail if we only have
group-read privileges on the file (0640), but surely the kill() will in
any case.

Thanks!

Stephen
Re: PATCH: Exclude temp relations from base backup
Hi,

On 2/28/18 10:55 AM, David Steele wrote:
> This is a follow-up patch from the exclude unlogged relations discussion
> [1].
>
> The patch excludes temporary relations during a base backup using the
> existing looks_like_temp_rel_name() function for identification.
>
> It shares code to identify database directories from [1], so for now
> that has been duplicated in this patch to make it independent.  I'll
> rebase depending on what gets committed first.

Updated the patch to change die() to BAIL_OUT() and use append_to_file()
as suggested for another test patch [1].

Regards,
-- 
-David
da...@pgmasters.net

[1] https://www.postgresql.org/message-id/6bc5d931-5b00-279f-f65a-26e32de400a6%40pgmasters.net

From c83f3ae09bacb961c511ebe9cb1f9f79bf1e3d29 Mon Sep 17 00:00:00 2001
From: David Steele
Date: Tue, 13 Mar 2018 12:22:24 -0400
Subject: [PATCH 1/1] Exclude temporary relations from base backup.

Exclude temporary relations during a base backup using the existing
looks_like_temp_rel_name() function for identification.

It shares code to identify database directories with [1], so for now
that has been duplicated in this patch to make it independent.  I'll
rebase depending on what gets committed first.

[1] https://www.postgresql.org/message-id/04791bab-cb04-ba43-e9c0-664a4c1ffb2c%40pgmasters.net
---
 doc/src/sgml/protocol.sgml                   |  2 +-
 src/backend/replication/basebackup.c         | 41 +++
 src/backend/storage/file/fd.c                |  3 +-
 src/bin/pg_basebackup/t/010_pg_basebackup.pl | 49 +++-
 src/include/storage/fd.h                     |  1 +
 5 files changed, 92 insertions(+), 4 deletions(-)

diff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 4fd61d7c2d..629a462574 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2565,7 +2565,7 @@ The commands accepted in replication mode are:
         Various temporary files and directories created during the operation
         of the PostgreSQL server, such as any file or directory beginning
-        with pgsql_tmp.
+        with pgsql_tmp and temporary relations.
diff --git a/src/backend/replication/basebackup.c b/src/backend/replication/basebackup.c
index 185f32a5f9..f0c3d13b2b 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -26,6 +26,7 @@
 #include "nodes/pg_list.h"
 #include "pgtar.h"
 #include "pgstat.h"
+#include "port.h"
 #include "postmaster/syslogger.h"
 #include "replication/basebackup.h"
 #include "replication/walsender.h"
@@ -958,6 +959,36 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces,
 	char		pathbuf[MAXPGPATH * 2];
 	struct stat statbuf;
 	int64		size = 0;
+	const char *lastDir;		/* Split last dir from parent path. */
+	bool		isDbDir = false;	/* Does this directory contain relations? */
+
+	/*
+	 * Determine if the current path is a database directory that can
+	 * contain relations.
+	 *
+	 * Start by finding the location of the delimiter between the parent
+	 * path and the current path.
+	 */
+	lastDir = last_dir_separator(path);
+
+	/* Does this path look like a database path (i.e. all digits)? */
+	if (lastDir != NULL &&
+		strspn(lastDir + 1, "0123456789") == strlen(lastDir + 1))
+	{
+		/* Part of path that contains the parent directory. */
+		int			parentPathLen = lastDir - path;
+
+		/*
+		 * Mark path as a database directory if the parent path is either
+		 * $PGDATA/base or a tablespace version path.
+		 */
+		if (strncmp(path, "./base", parentPathLen) == 0 ||
+			(parentPathLen >= (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) &&
+			 strncmp(lastDir - (sizeof(TABLESPACE_VERSION_DIRECTORY) - 1),
+					 TABLESPACE_VERSION_DIRECTORY,
+					 sizeof(TABLESPACE_VERSION_DIRECTORY) - 1) == 0))
+			isDbDir = true;
+	}

 	dir = AllocateDir(path);
 	while ((de = ReadDir(dir, path)) != NULL)
@@ -1007,6 +1038,16 @@ sendDir(const char *path, int basepathlen, bool sizeonly, List *tablespaces,
 		if (excludeFound)
 			continue;

+		/* Exclude temporary relations */
+		if (isDbDir && looks_like_temp_rel_name(de->d_name))
+		{
+			elog(DEBUG2,
+				 "temporary relation file \"%s\" excluded from backup",
+				 de->d_name);
+
+			continue;
+		}
+
 		snprintf(pathbuf, sizeof(pathbuf), "%s/%s", path,
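For readers unfamiliar with the fd.c helper the patch reuses: temporary relation files are named along the lines of t<backendID>_<relfilenode>, optionally followed by a fork suffix (e.g. _fsm, _vm) and/or a segment number. A rough Python approximation of what looks_like_temp_rel_name() accepts (a sketch of the naming convention, not the exact C logic):

```python
import re

# Approximate shape of temp relation file names: "t<digits>_<digits>",
# optionally "_<forkname>" and/or ".<segment>".  This regex is an
# illustration of the convention, not a transcription of fd.c.
TEMP_REL_RE = re.compile(r'^t\d+_\d+(_[a-z]+)?(\.\d+)?$')

def looks_like_temp_rel_name(name):
    return TEMP_REL_RE.match(name) is not None
```

The isDbDir check in the hunk above matters because this pattern should only be applied inside directories that actually hold relation files ($PGDATA/base/<dboid> or a tablespace version path), not to arbitrary files elsewhere in the cluster.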
Re: PATCH: Configurable file mode mask
Hi Michael,

On 3/12/18 3:28 AM, Michael Paquier wrote:
> On Fri, Mar 09, 2018 at 01:51:14PM -0500, David Steele wrote:
>> How about a GUC that enforces one mode or the other on startup?  Default
>> would be 700.  The GUC can be set automatically by initdb based on the
>> -g option.  We had this GUC originally, but since the front-end tools
>> can't read it we abandoned it.  Seems like it would be good as an
>> enforcing mechanism, though.
>
> Hm.  OK.  I can see the whole set of points about that.  Please let me
> think a bit more about that bit.  Do you think that there could be a
> pool of users willing to switch from one mode to another?  Compared to
> your v1, we could indeed have a GUC which enforces a restriction to not
> allow group access, and enabled by default.  As the commit fest is
> running and we don't have a clear picture yet, I am afraid that it may
> be better to move that to v12, and focus on getting patches 1 and 2
> committed.  This will provide a good base for the next move.
>
> There are three places where things are still not correct:
>
> -	if (chmod(location, S_IRWXU) != 0)
> +	current_umask = umask(0);
> +	umask(current_umask);
> +
> +	if (chmod(location, PG_DIR_MODE_DEFAULT & ~current_umask) != 0)
>
> This is in tablespace.c.

I have moved this hunk to 03 and used only PG_DIR_MODE_DEFAULT instead.

> @@ -185,6 +186,9 @@ main(int argc, char **argv)
>  		exit(1);
>  	}
>
> +	/* Set dir/file mode mask */
> +	umask(PG_MODE_MASK_DEFAULT);
> +
> In pg_rewind and pg_resetwal, isn't that also a portion which is not
> necessary without the group access feature?

These seem like a good idea to me with or without patch 03.  Some of our
front-end tools (initdb, pg_upgrade) were setting umask and others
weren't.  I think it's more consistent (and safer) if they all do, at
least if they are writing into PGDATA.

> This is all I have basically for patch 2, which would be good for
> shipping.

Thanks!
I'll attach new patches in a reply to [1] once I have made the changes
Tom requested.

-- 
-David
da...@pgmasters.net

[1] https://www.postgresql.org/message-id/22928.1520953220%40sss.pgh.pa.us
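The chmod hunk quoted in this subthread computes the effective directory mode by masking the default with the process umask, using the classic umask(0)/umask(old) read-then-restore trick. In outline (a Python sketch of the C idiom; the constant names and the 0o750 group-access value are assumptions for illustration):

```python
import os

PG_DIR_MODE_DEFAULT = 0o750   # assumed group-access default; 0o700 otherwise

def effective_dir_mode():
    # umask() both sets and returns the mask, so read it by setting a
    # dummy value and immediately restoring the original -- the same
    # trick as the C hunk's umask(0) / umask(current_umask) pair.
    current_umask = os.umask(0)
    os.umask(current_umask)
    return PG_DIR_MODE_DEFAULT & ~current_umask
```

For example, with a umask of 0o027 the directory is created 0o750, and with the common 0o022 it is also 0o750, since group write is already off in the default.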
Re: [submit code] I develop a tool for pgsql, how can I submit it
2018-03-13 12:19 GMT-03:00 leap:
> I develop a tool for pgsql, I want to submit it on pgsql.
> how can I submit it?

As Laurenz said, a tool doesn't necessarily have to be part of
PostgreSQL.  Sometimes a lot of people ask for a tool, someone writes
it, and the community decides to maintain it.  I'm not sure this is the
case for your tool.  The pro is that the tool will be maintained by a
group of experienced programmers (I'm not saying you can't have this
group outside the PostgreSQL project; you may have enough interest if
people are excited by it).  The con about this direction is that the
tool's development cycle is tied to PostgreSQL's (which means a 1-year
cycle and slow development over time).

Since I do a lot of debugging, it seems very useful to me.  If you
decide to convince people of the usefulness of your tool, you should:
(i) remove some overlap between the tool and core, (ii) clean up your
code to follow the PostgreSQL guidelines, (iii) reuse core functions,
(iv) reuse constants instead of redefining them, and (v) document
everything.  Steps (i), (iii) and (iv) may require some code
rearrangement in PostgreSQL.  Even if you decide to continue development
outside PostgreSQL, those steps would improve your code.

-- 
Euler Taveira                Timbira - http://www.timbira.com.br/
PostgreSQL: Consultoria, Desenvolvimento, Suporte 24x7 e Treinamento
Re: ALTER TABLE ADD COLUMN fast default
On Tue, Mar 13, 2018 at 10:58 PM, Andrew Dunstan wrote:
> On Tue, Mar 13, 2018 at 2:40 PM, Andrew Dunstan wrote:
>>> Going by the commitfest app, the patch still does appear to be waiting
>>> on Author.  Never-the-less, I've made another pass over it and found a
>>> few mistakes and a couple of ways to improve things:
>>
>> working on these. Should have a new patch tomorrow.
>
> Here is a patch that attends to most of these, except that I haven't
> re-indented it.
>
> Your proposed changes to slot_getmissingattrs() wouldn't work, but I
> have rewritten it in a way that avoids the double work you disliked.
>
> I'll rerun the benchmark tests I posted upthread and let you know the
> results.

Here are the benchmark results from the v15 patch.  Fairly similar to
previous results.  I'm going to run some profiling again to see if I can
identify any glaring hotspots.

I do suspect that the "physical tlist" optimization sometimes turns out
not to be one.  It seems perverse to be able to improve a query's
performance by dropping a column.

cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

results.t100r50k.v15 Description: Binary data
results.t100r64.v15 Description: Binary data
Re: [HACKERS] path toward faster partition pruning
Amit Langote wrote:
> I will continue working on improving the comments / cleaning things up and
> post a revised version soon, but until then please look at the attached.

I tried to give this a read.  It looks like pretty neat stuff -- as far
as I can tell, it follows Robert's sketch for how this should work.  The
fact that it's under-commented makes me unable to follow it too closely
though (I felt like adding a few "wtf?" comments here and there), so
it's taking me a bit to follow things in detail.  Please do submit
improved versions as you have them.

I think you're using an old version of pg_bsd_indent.

In particular need of commentary:

* match_clause_to_partition_key() should indicate which params are
  output and what they get

* get_steps_using_prefix already has a comment, but it doesn't really
  explain much.  (I'm not sure why you use the term "tuple" here.  I
  mean, mathematically it probably makes sense, but in the overall
  context it seems just confusing.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: SQL/JSON: functions
On Tue, Mar 13, 2018 at 04:08:01PM +0300, Oleg Bartunov wrote:
> The docs are here
> https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
>
> It's not easy to write docs for SQL/JSON in xml, so I decided to write
> them in a more friendly way.  We'll have time to convert it to the
> postgres format.

If you aim at getting the feature committed first without its
documentation, and getting the docs written after feature freeze using a
dedicated open item or such, that is quite acceptable in my opinion, and
the CF is running short on time.

-- 
Michael
Re: SQL/JSON: functions
On 2018-03-14 07:54:35 +0900, Michael Paquier wrote:
> On Tue, Mar 13, 2018 at 04:08:01PM +0300, Oleg Bartunov wrote:
>> The docs are here
>> https://github.com/obartunov/sqljsondoc/blob/master/README.jsonpath.md
>>
>> It's not easy to write docs for SQL/JSON in xml, so I decided to write
>> them in a more friendly way.  We'll have time to convert it to the
>> postgres format.
>
> If you aim at getting the feature committed first without its
> documentation, and getting the docs written after feature freeze using
> a dedicated open item or such, that is quite acceptable in my opinion,
> and the CF is running short on time.

Given that this patch still uses PG_TRY/CATCH around code paths as wide
as a whole ExecEvalExpr() invocation, and has basically gotten no
review, I don't see this going anywhere for v11.

Greetings,

Andres Freund
Re: [HACKERS] path toward faster partition pruning
By the way, I checked whether patch 0002 (additional tests) had an
effect on coverage, and couldn't detect any changes in terms of
lines/functions.  Were you able to find any bugs in your code thanks to
the new tests that would not have been covered by existing tests?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Re: JIT compiling with LLVM v11
Hi,

On 2018-03-14 10:32:40 +1300, Thomas Munro wrote:
> I decided to try this on a CentOS 7.2 box.  It has LLVM 3.9 in the
> 'epel' package repo, but unfortunately it only has clang 3.4.

That's a bit odd, given llvm and clang really live in the same repo...

> clang: error: unknown argument: '-fexcess-precision=standard'
> clang: error: unknown argument: '-flto=thin'
>
> Ok, so I hacked src/Makefile.global.in to remove -flto=thin.

I think I can actually get rid of that entirely.

> It looks
> like -fexcess-precision-standard is coming from a configure test that
> was run against ${CC}, not against ${CLANG}, so I hacked the generated
> src/Makefile.global to remove that too, just to see if I could get
> past that.

Yea, I'd hoped we could avoid duplicating all the configure tests, but
maybe not :(.

> Then I could build successfully and make check passed.  I did see one
> warning:
>
> In file included from execExpr.c:39:
> ../../../src/include/jit/jit.h:36:3: warning: redefinition of typedef
> 'JitProviderCallbacks' is a C11 feature [-Wtypedef-redefinition]
> } JitProviderCallbacks;
> ^
> ../../../src/include/jit/jit.h:22:37: note: previous definition is here
> typedef struct JitProviderCallbacks JitProviderCallbacks;
> ^

Yep.  Removed the superfluous / redundant second typedef.  Will push a
heavily rebased version in a bit, which will include a fix for this.

Greetings,

Andres Freund
Re: neqjoinsel versus "refresh materialized view concurrently"
On Wed, Mar 14, 2018 at 11:34 AM, Thomas Munro wrote:
> LOG:  duration: 26101.612 ms  plan:
> Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16452 newdata WHERE
> newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16452
> newdata2 WHERE newdata2 IS NOT NULL AND newdata2
> OPERATOR(pg_catalog.*=) newdata AND newdata2.ctid
> OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
> Limit  (cost=0.00..90.52 rows=1 width=28)
>        (actual time=26101.608..26101.608 rows=0 loops=1)
>   ->  Nested Loop Semi Join  (cost=0.00..225220.96 rows=2488 width=28)
>       (actual time=26101.606..26101.606 rows=0 loops=1)
>         Join Filter: ((newdata2.ctid <> newdata.ctid) AND
>                       (newdata.* *= newdata2.*))
>         Rows Removed by Join Filter: 2500
>         ->  Seq Scan on pg_temp_16452 newdata  (cost=0.00..73.00 rows=4975 width=34)
>             (actual time=0.022..15.448 rows=5000 loops=1)
>               Filter: (newdata.* IS NOT NULL)
>         ->  Materialize  (cost=0.00..97.88 rows=4975 width=34)
>             (actual time=0.000..0.500 rows=5000 loops=5000)
>               ->  Seq Scan on pg_temp_16452 newdata2  (cost=0.00..73.00 rows=4975 width=34)
>                   (actual time=0.010..4.033 rows=5000 loops=1)
>                     Filter: (newdata2.* IS NOT NULL)

This plan is chosen because we're looking for just one row (LIMIT 1)
that has equal data but a different ctid.  In this case we're not going
to find one, so we'll pay the full enormous cost of the nested loop, but
the startup cost is estimated as 0 and we think we are going to find a
row straight away.  That's because we don't know that it's unlikely for
there to be a row with the same columns but a different ctid.  There is
a fundamental and complicated estimation problem lurking here, of
course, and I'm not sure what to think about that yet.
Maybe there is a very simple fix for this particular problem:

--- a/src/backend/commands/matview.c
+++ b/src/backend/commands/matview.c
@@ -660,7 +660,7 @@ refresh_by_match_merge(Oid matviewOid, Oid tempOid, Oid relowner,
 					 "(SELECT * FROM %s newdata2 WHERE newdata2 IS NOT NULL "
 					 "AND newdata2 OPERATOR(pg_catalog.*=) newdata "
 					 "AND newdata2.ctid OPERATOR(pg_catalog.<>) "
-					 "newdata.ctid) LIMIT 1",
+					 "newdata.ctid)",
 					 tempname, tempname);
 	if (SPI_execute(querybuf.data, false, 1) != SPI_OK_SELECT)
 		elog(ERROR, "SPI_exec failed: %s", querybuf.data);

That gets me back to the sort-merge plan, but maybe it's too
superficial.

-- 
Thomas Munro
http://www.enterprisedb.com
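The Limit node's 90.52 cost estimate in the plan quoted earlier follows directly from the planner's uniform-spread assumption: the total join cost scaled by the fraction of its output the LIMIT needs. A sketch of that arithmetic (not planner code; the numbers are taken from the plan above):

```python
# Planner arithmetic behind "Limit (cost=0.00..90.52 ...)": with 2488
# matches assumed to be spread uniformly through a 225220.96-cost join,
# the first match is expected after about 1/2488 of the total work.
total_cost = 225220.96
estimated_join_rows = 2488
limit_rows = 1
limit_cost = total_cost * limit_rows / estimated_join_rows   # ~90.52
# When there are actually zero matches, the whole 225220.96 is paid,
# which is why the estimate is off by four orders of magnitude here.
```

This is why the plan looks attractive on paper and catastrophic in practice: the estimate is only valid if matching rows exist and are uncorrelated with position, and here they are both absent and (by construction) perfectly anti-correlated with the ctid inequality.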
Re: neqjoinsel versus "refresh materialized view concurrently"
Thomas Munro writes:
> This looks like an invisible correlation problem.

Yeah --- the planner has no idea that the join rows satisfying newdata.* *= newdata2.* are likely to be exactly the ones not satisfying newdata.ctid <> newdata2.ctid. It's very accidental that we got a good plan before.

I've not looked to see where this query is generated, but I wonder if we could make things better by dropping the LIMIT 1 and instead using an executor count parameter to stop early.

regards, tom lane
JIT compiling with LLVM v12
Hi,

I've pushed a revised and rebased version of my JIT patchset.
The git tree is at
  https://git.postgresql.org/git/users/andresfreund/postgres.git
in the jit branch
  https://git.postgresql.org/gitweb/?p=users/andresfreund/postgres.git;a=shortlog;h=refs/heads/jit

There's nothing hugely exciting, mostly lots of cleanups.

- added some basic EXPLAIN output, displaying JIT options and time spent
  jitting (see todo below)

  JIT:
    Functions: 9
    Generation Time: 4.604
    Inlining: false
    Inlining Time: 0.000
    Optimization: false
    Optimization Time: 0.585
    Emission Time: 12.858

- Fixed bugs around alignment computations in tuple deforming. Wasn't
  able to trigger any bad consequences, but it was clearly wrong.
- Worked a lot on making code more pgindent safe. There's still some
  minor layout damage, but it's mostly ok now. For that I had to add a
  bunch of helpers that make the code shorter.
- Freshly emitted functions now have proper attributes indicating
  architecture, floating point behaviour etc. That's what previously
  prevented the inliner from doing its job without forcing its hand.
  That yields a bit of a speedup.
- reduced size of code a bit by deduplicating code, in particular don't
  "manually" create signatures for function declarations anymore.
  Besides deduplicating, this also ensures code generation time errors
  when function signatures change.
- fixed a number of FIXMEs etc
- added a lot of comments
- portability fixes (OSX, freebsd)

Todo:
- some build issues with old clang versions pointed out by Thomas Munro
- when to take jit_expressions into account (both exec and plan, or just
  the latter)
- EXPLAIN for queries that are JITed should display units. Starting a
  thread about an effort to not duplicate code for that
- more explanations of type & function signature syncing
- GUC docs (including postgresql.conf.sample)

Thanks everyone, particularly Peter in this update, for helping me along!

Regards,

Andres
Re: neqjoinsel versus "refresh materialized view concurrently"
On Wed, Mar 14, 2018 at 8:07 AM, Jeff Janes wrote:
> The following commit has caused a devastating performance regression
> in concurrent refresh of MV:
>
> commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97
> Author: Tom Lane
> Date: Wed Nov 29 22:00:29 2017 -0500
>
>     Fix neqjoinsel's behavior for semi/anti join cases.
>
> The below reproduction goes from taking about 1 second to refresh, to
> taking an amount of time I don't have the patience to measure.
>
> drop table foobar2 cascade;
> create table foobar2 as select * from generate_series(1,20);
> create materialized view foobar3 as select * from foobar2;
> create unique index on foobar3 (generate_series);
> analyze foobar3;
> refresh materialized view CONCURRENTLY foobar3;
>
> When I interrupt the refresh, I get a message including this line:
>
> CONTEXT: SQL statement "SELECT newdata FROM pg_temp_3.pg_temp_16420 newdata
> WHERE newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16420
> newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
> newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1"
>
> So it makes sense that the commit in question could have caused a change
> in the execution plan. Because these are temp tables, I can't easily get
> my hands on them to investigate further.

Ouch. A quadratic join. This looks like an invisible correlation problem.
load 'auto_explain';
set auto_explain.log_min_duration = 0;
set auto_explain.log_analyze = true;
drop table if exists t cascade;
create table t as select generate_series(1, 5000);
create materialized view mv as select * from t;
create unique index on mv(generate_series);
analyze mv;
refresh materialized view concurrently mv;

HEAD:

LOG: duration: 26101.612 ms plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16452 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16452
newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit (cost=0.00..90.52 rows=1 width=28) (actual time=26101.608..26101.608 rows=0 loops=1)
  -> Nested Loop Semi Join (cost=0.00..225220.96 rows=2488 width=28) (actual time=26101.606..26101.606 rows=0 loops=1)
       Join Filter: ((newdata2.ctid <> newdata.ctid) AND (newdata.* *= newdata2.*))
       Rows Removed by Join Filter: 2500
       -> Seq Scan on pg_temp_16452 newdata (cost=0.00..73.00 rows=4975 width=34) (actual time=0.022..15.448 rows=5000 loops=1)
            Filter: (newdata.* IS NOT NULL)
       -> Materialize (cost=0.00..97.88 rows=4975 width=34) (actual time=0.000..0.500 rows=5000 loops=5000)
            -> Seq Scan on pg_temp_16452 newdata2 (cost=0.00..73.00 rows=4975 width=34) (actual time=0.010..4.033 rows=5000 loops=1)
                 Filter: (newdata2.* IS NOT NULL)

And with commit 7ca25b7de6aefa5537e0dbe56541bc41c0464f97 reverted:

LOG: duration: 36.358 ms plan:
Query Text: SELECT newdata FROM pg_temp_3.pg_temp_16470 newdata WHERE
newdata IS NOT NULL AND EXISTS (SELECT * FROM pg_temp_3.pg_temp_16470
newdata2 WHERE newdata2 IS NOT NULL AND newdata2 OPERATOR(pg_catalog.*=)
newdata AND newdata2.ctid OPERATOR(pg_catalog.<>) newdata.ctid) LIMIT 1
Limit (cost=756.95..939.50 rows=1 width=28) (actual time=36.354..36.354 rows=0 loops=1)
  -> Merge Semi Join (cost=756.95..2947.51 rows=12 width=28) (actual time=36.352..36.352 rows=0 loops=1)
       Merge Cond: (newdata.* *= newdata2.*)
       Join Filter: (newdata2.ctid <> newdata.ctid)
       Rows Removed by Join Filter: 5000
       -> Sort (cost=378.48..390.91 rows=4975 width=34) (actual time=9.622..10.300 rows=5000 loops=1)
            Sort Key: newdata.* USING *<
            Sort Method: quicksort Memory: 622kB
            -> Seq Scan on pg_temp_16470 newdata (cost=0.00..73.00 rows=4975 width=34) (actual time=0.021..4.986 rows=5000 loops=1)
                 Filter: (newdata.* IS NOT NULL)
       -> Sort (cost=378.48..390.91 rows=4975 width=34) (actual time=7.378..8.010 rows=5000 loops=1)
            Sort Key: newdata2.* USING *<
            Sort Method: quicksort Memory: 622kB
            -> Seq Scan on pg_temp_16470 newdata2 (cost=0.00..73.00 rows=4975 width=34) (actual time=0.017..3.034 rows=5000 loops=1)
                 Filter: (newdata2.* IS NOT NULL)

-- 
Thomas Munro
http://www.enterprisedb.com
Re: PATCH: Configurable file mode mask
On 03/13/2018 10:40 AM, Stephen Frost wrote:
> * Michael Paquier (mich...@paquier.xyz) wrote:
>> Well, one thing is that the current checks in the postmaster make sure
>> that a data folder is never using anything else than 0700. From a
>> security point of view, making it possible to allow a postmaster to
>> start with 0750 is a step backwards ...
>
> Lastly, the user *is* able to enforce the privileges on the data
> directory if they wish to, using tools such as tripwire which are built
> specifically to provide such checks and to do so regularly instead of
> the extremely ad-hoc check provided by PG.
>
> ... Ultimately, the default which makes sense here isn't a
> one-size-fits-all but is system dependent and the administrator should
> be able to choose what permissions they want to have.

Hear, hear. Returning for a moment again to

https://www.postgresql.org/message-id/8b16cc30-0187-311e-04b5-1f32446d89dd%40anastigmatix.net

we see that a stat() returning mode 0750 on a modern system may not even mean there is any group access at all. In that example, the datadir had these permissions:

  # getfacl .
  # file: .
  # owner: postgres
  # group: postgres
  user::rwx
  user:root:r-x
  group::---
  mask::r-x
  other::---

While PostgreSQL does its stat() and interprets the mode as if it is still on a Unix box from the '80s, two things have changed underneath it: POSIX ACLs and Linux capabilities. Capabilities take the place of the former specialness of root, who now needs to be granted r-x explicitly in the ACL to be able to read stuff there at all, and there is clearly no access for group and no access for other. It would be hard for anybody to call this an insecure configuration. But when you stat() an object with a POSIX ACL, you get the 'mask' value where the 'group' bits used to go, so postgres sees this as 0750, thinks the 5 represents group access, and refuses to start. Purely a mistake.
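The mistake is easy to reproduce in miniature. A sketch of a permission check in the spirit of the one described here (`pg_style_mode_check` is a hypothetical name, not PostgreSQL's actual code) sees only the mode bits that stat() reports, so an ACL mask of r-x sitting in the group position is indistinguishable from real group access:

```python
import stat

def pg_style_mode_check(mode):
    """Accept a data directory only if its stat() mode shows no group or
    world bits set -- the check sees nothing but the traditional bits."""
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0

# Classic 0700 datadir: accepted.
print(pg_style_mode_check(0o700))   # True
# Datadir with the ACL above: stat() reports the ACL *mask* (r-x) where
# the group bits used to be, so the mode reads 0750 even though
# group::--- grants no group access at all.
print(pg_style_mode_check(0o750))   # False -> refuses to start
```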
It's the kind of mistake that is inherent in this sort of check, which tries to draw conclusions from the semantics it assumes, while systems evolve and semantics march along. One hypothetical fix would be to add:

  #ifdef HAS_POSIX_ACLS
  ... check if there's really an ACL here, and the S_IRWXG bits are
  really just the mask, or even try to pass judgment on whether the
  admin's chosen ACL is adequately secure ...
  #endif

but then sooner or later it will still end up making assumptions that aren't true under, say, SELinux, so there's another #ifdef, and where does it end?

On 03/13/2018 11:00 AM, Tom Lane wrote:
> In a larger sense, this fails to explain the operating principle,
> namely what I stated above, that we allow the existing permissions
> on PGDATA to decide what we allow as group permissions.

I admit I've only skimmed the patches, trying to determine what that will mean in practice. In a case like the ACL example above, does this mean that postgres will stat PGDATA, conclude incorrectly that group access is granted, and then, based on that, actually go granting unwanted group access for real on newly-created files and directories? That doesn't seem ideal.

On 03/13/2018 10:45 AM, David Steele wrote:
> As Stephen notes, this can be enforced by the user if they want to,
> and without much effort (and with better tools).

To me, that seems really the key. An --allow-group-access option is nice (but, as we see, misleading if its assumptions are outdated regarding how the filesystem works), but I would plug even harder for a --permission-transparency option, which would just assume that the admin is making security arrangements, through mechanisms that postgres may or may not even understand. The admin can create ACLs with default entries that propagate to newly created objects. SELinux contexts can work in similar ways. The admin controls the umask inherited by the postmaster.
A --permission-transparency option would simply use open and mkdir in the traditional ways, allowing the chosen umask, ACL defaults, SELinux contexts, etc., to do their thing, and would avoid then stepping on the results with explicit chmods (and of course skip the I-refuse-to-start-because-I- misunderstand-your-setup check). It wouldn't be for every casual install, but it would be the best way to stay out of the way of an admin securing a system with modern facilities. A lot of the design effort put into debating what postgres itself should or shouldn't insist on could then, perhaps, go into writing postgres-specific configuration rule packages for some of those better configuration-checking tools, and there it might even be possible to write tests that incorporate knowledge of ACLs, SELinux, etc. -Chap
Re: Additional Statistics Hooks
Mat Arye writes:
> An example of a case a protransform type system would not be able to
> optimize is mathematical operator expressions like bucketing integers by
> decile --- (integer / 10) * 10.

Uh, why not? An estimation function that is specific to integer divide shouldn't have much trouble figuring out that x/10 has one-tenth as many distinct values as x does. I'd certainly rather have that knowledge associated directly with int4div, and the corresponding knowledge about date_trunc associated with that function, and similar knowledge about extension-provided operators provided by the extensions, than try to maintain a hook function that embeds all such knowledge.

> I also think that the point with extended statistics is a good one and
> points to the need for more experimentation/experience which I think
> a C hook is better suited for. Putting in a hook will allow extension
> writers like us to experiment and figure out the kinds of transform on
> statistics that are useful while having a small footprint on the core.

If you're experimenting you might as well just change the source code. A hook is only useful if you're trying to ship something for production, and I doubt that factorizing things this way is a credible production solution.

regards, tom lane
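The point about int4div is easy to check numerically: for integer division by a constant k, the number of distinct values shrinks by roughly a factor of k, which is exactly the kind of fact a per-function estimator could report. A quick illustration in plain Python (not a protransform implementation, just the arithmetic):

```python
k = 10
xs = range(0, 1000)                  # 1000 distinct integers
buckets = {x // k * k for x in xs}   # decile-style bucketing: (integer / 10) * 10

# One-tenth as many distinct values survive the division.
print(len(set(xs)), len(buckets))    # 1000 100
```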
Inquiry regarding the projects under GSOC
Hello Postgres Team,

I am Ankit, a 3rd-year B.Tech student from India. I am really interested in working with your organisation as an intern to rectify issues and bugs, as I love challenges, and the docs state that some features might not be possible to implement, so I would love to try. My inquiry is: what current bug fixes or feature updates are available for interns to work on?

Thanks!!
Re: PATCH: Configurable file mode mask
Greetings Chapman,

* Chapman Flack (c...@anastigmatix.net) wrote:
> On 03/13/2018 10:40 AM, Stephen Frost wrote:
> > ... Ultimately, the default which makes sense here isn't a
> > one-size-fits-all but is system dependent and the administrator should
> > be able to choose what permissions they want to have.
>
> Hear, hear. Returning for a moment again to
>
> https://www.postgresql.org/message-id/8b16cc30-0187-311e-04b5-1f32446d89dd%40anastigmatix.net
>
> we see that a stat() returning mode 0750 on a modern system may not
> even mean there is any group access at all. In that example, the
> datadir had these permissions:

Yes, that's true, but PG has never paid attention to POSIX ACLs or Linux capabilities. Changing it to do so is quite a bit beyond the scope of this particular patch, and I don't see anything in what this patch is doing which would preclude someone from putting in that effort in the future.

> While PostgreSQL does its stat() and interprets the mode as if it is
> still on a Unix box from the '80s, two things have changed underneath
> it: POSIX ACLs and Linux capabilities. Capabilities take the place of
> the former specialness of root, who now needs to be granted r-x
> explicitly in the ACL to be able to read stuff there at all, and there
> is clearly no access to group and no access to other. It would be hard
> for anybody to call this an insecure configuration. But when you stat()
> an object with a POSIX ACL, you get the 'mask' value where the 'group'
> bits used to go, so postgres sees this as 0750, thinks the 5 represents
> group access, and refuses to start. Purely a mistake.

I'll point out that PG does run on quite a few other OS's beyond Linux.

> It's the kind of mistake that is inherent in this sort of check,
> which tries to draw conclusions from the semantics it assumes, while
> systems evolve and semantics march along. One hypothetical fix would
> be to add:
>
> #ifdef HAS_POSIX_ACLS
> ... check if there's really an ACL here, and the S_IRWXG bits are
> really just the mask, or even try to pass judgment on whether the
> admin's chosen ACL is adequately secure ...
> #endif
>
> but then sooner or later it will still end up making assumptions
> that aren't true under, say, SELinux, so there's another #ifdef,
> and where does it end?

I agree with this general concern.

> On 03/13/2018 11:00 AM, Tom Lane wrote:
> > In a larger sense, this fails to explain the operating principle,
> > namely what I stated above, that we allow the existing permissions
> > on PGDATA to decide what we allow as group permissions.
>
> I admit I've only skimmed the patches, trying to determine what
> that will mean in practice. In a case like the ACL example above,
> does this mean that postgres will stat PGDATA, conclude incorrectly
> that group access is granted, and then, based on that, actually go
> granting unwanted group access for real on newly-created files
> and directories?

PG will stat PGDATA and conclude that the system is saying that group access has been granted on PGDATA, and will do the same for subsequent files created later on. This is new in PG, so there isn't any concern about this causing problems in an existing environment: you couldn't have had those ACLs on an existing PGDATA dir in the first place, as you note above. The only issue that remains is that PG doesn't understand or work with POSIX ACLs or Linux capabilities, but that's not anything new.

> On 03/13/2018 10:45 AM, David Steele wrote:
> > As Stephen notes, this can be enforced by the user if they want to,
> > and without much effort (and with better tools).
>
> To me, that seems really the key. An --allow-group-access option is
> nice (but, as we see, misleading if its assumptions are outdated
> regarding how the filesystem works), but I would plug even harder for
> a --permission-transparency option, which would just assume that the
> admin is making security arrangements, through mechanisms that
> postgres may or may not even understand. The admin can create ACLs
> with default entries that propagate to newly created objects.
> SELinux contexts can work in similar ways. The admin controls the
> umask inherited by the postmaster. A --permission-transparency option
> would simply use open and mkdir in the traditional ways, allowing
> the chosen umask, ACL defaults, SELinux contexts, etc., to do their
> thing, and would avoid then stepping on the results with explicit
> chmods (and of course skip the I-refuse-to-start-because-I-
> misunderstand-your-setup check).
>
> It wouldn't be for every casual install, but it would be the best
> way to stay out of the way of an admin securing a system with modern
> facilities.
>
> A lot of the design effort put into debating what postgres itself
> should or shouldn't insist on could then, perhaps, go into writing
> postgres-specific configuration rule packages for some of those
> better configuration-checking tools, and there it might even be
> possible to write tests
Re: [HACKERS] path toward faster partition pruning
Hi Amit,

On 03/13/2018 07:37 AM, Amit Langote wrote:
> I will continue working on improving the comments / cleaning things up
> and post a revised version soon, but until then please look at the
> attached.

Passes check-world. Some minor comments:

0001: Ok

0002: Ok

0003:
* Trailing white space
* pruning.c
  - partkey_datum_from_expr
    * "Add more expression types..." -- Are you planning to add more of
      these? Otherwise change the comment
  - get_partitions_for_null_keys
    * Documentation for method
    * 'break;' missing for _HASH and default case
  - get_partitions_for_keys
    * 'break;'s are outside of the 'case' blocks
    * The 'switch(opstrategy)'s could use some {} blocks
    * 'break;' missing from default
  - perform_pruning_combine_step
    * Documentation for method
* nodeFuncs.c
  - Missing 'break;'s to follow style

0004: Ok

Best regards,
Jesper
[submit code] I develop a tool for pgsql, how can I submit it
Hello! I developed a tool for pgsql, and I want to submit it to the PostgreSQL project. How can I submit it?

Address: https://github.com/leapking/pgcheck
Re: PATCH: Configurable file mode mask
David Steele writes:
> On 3/12/18 3:28 AM, Michael Paquier wrote:
>> In pg_rewind and pg_resetwal, isn't that also a portion which is not
>> necessary without the group access feature?

> These seem like a good idea to me with or without patch 03. Some of our
> front-end tools (initdb, pg_upgrade) were setting umask and others
> weren't. I think it's more consistent (and safer) if they all do, at
> least if they are writing into PGDATA.

+1 ... see a926eb84e for an example of how easy it is to screw up if the process's overall umask is permissive.

regards, tom lane
Re: add queryEnv to ExplainOneQuery_hook
On Wed, Mar 14, 2018 at 05:36:26PM +1300, Thomas Munro wrote: > Hmm. I suppose we could have invented a new extended hook with a > different name and back-patched it so that PG10 would support both. > Then binary compatibility with existing compiled extensions wouldn't > be affected AFAICS, but you could use the new extended hook in (say) > 10.4 or higher. Then for PG11 (or later) we could remove the old hook > and just keep the new one. I suppose that option is still technically > open to us, though I'm not sure of the committers' appetite for messing > with back branches like that. The interactions between both hooks would not be difficult to define: if the original hook is not defined, just do not trigger the second. Still that's too late for v10, so I would rather let it go. New features are not backpatched. -- Michael signature.asc Description: PGP signature
Re: PQHost() undefined behavior if connecting string contains both host and hostaddr types
On Fri, Mar 09, 2018 at 04:42:30PM -0500, Peter Eisentraut wrote: > It seems, however, that PGhost() has always been broken for hostaddr > use. In 9.6 (before the multiple-hosts stuff was introduced), when > connecting to "hostaddr=127.0.0.1", PGhost() returns "/tmp". Urgh. > > I think we should decide what PGhost() is supposed to mean when hostaddr > is in use, and then make a fix for that consistently across all versions. Hm. The only inconsistency I can see here is when "host" is not defined but "hostaddr" is, so why not make PQhost return the value of "hostaddr" in this case? -- Michael signature.asc Description: PGP signature
Re: User defined data types in Logical Replication
On Wed, Mar 7, 2018 at 9:19 PM, Masahiko Sawada wrote:
> On Wed, Mar 7, 2018 at 2:52 AM, Alvaro Herrera wrote:
>> Masahiko Sawada wrote:
>>> On Tue, Mar 6, 2018 at 8:35 AM, Alvaro Herrera wrote:
>>
>>> > Therefore, I'm inclined to make this function raise a warning, then
>>> > return a substitute value (something like "unrecognized type XYZ").
>>> > [...]
>>>
>>> I agree with you about not hiding the actual reason for the error but
>>> if we raise a warning at logicalrep_typmap_gettypname don't we call
>>> slot_store_error_callback recursively?
>>
>> Hmm, now that you mention it, I don't really know. I think it's
>> supposed not to happen, since calling ereport() again opens a new
>> recursion level, but then maybe errcontext doesn't depend on the
>> recursion level ... I haven't checked. This is why the TAP test would
>> be handy :-)
>
> Calling ereport opens a new recursion level. Calling ereport with an
> error doesn't return to the caller, but calling it with a warning does.
> So recursively calling ereport(WARNING) ends up exceeding the errordata
> stack size. So it seems to me that we can set errcontext in
> logicalrep_typmap_gettypname() instead of raising a warning or error.
>
>>> Agreed. Will add a TAP test.
>>
>> Great. This patch waits on that, then.
>
> Okay. I think the most simple and convenient way to reproduce this
> issue is to call an elog(LOG) in the input function of a user-defined
> data type. So I'm thinking to create the test in the src/test/module
> directory.

After more thought, I think that since the errors in logicalrep_typmap_gettypname are not relevant to the actual error (i.e. the type conversion error), it would not be a good idea to show the "could not found data type entry" error message in errcontext. It might be more appropriate to return a substitute value ("unrecognized type" or "unrecognized built-in type") without raising either an error or a warning. Thoughts?
Regarding the regression test, I added a new test module, test_subscription, that creates a new user-defined data type. In a subscription regression test, using test_subscription we make the subscriber call slot_store_error_callback and check whether the subscriber can correctly look up both the remote and local data type strings. One downside of this regression test is that in the failure case the duration of the test will be long (up to 180 sec), because it has to wait for the polling timeout.

Attached is an updated patch with a regression test.

Regards,

-- 
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

fix_slot_store_error_callback_v13.patch
Description: Binary data
planner bug regarding lateral and subquery?
Hi Hackers,

I found a bug, maybe. If an EXPLAIN of the query below succeeds, I think that means the query is executable. However, I got an error when executing the query without EXPLAIN. I guess the planner makes a wrong plan. I share a reproduction procedure and query results on 3b7ab4380440d7b14ee390fabf39f6d87d7491e2.

* Reproduction

create table test (c1 integer, c2 integer, c3 text);
insert into test values (1, 3, 'a');
insert into test values (2, 4, 'b');

explain (costs off)
select subq_1.c0
from test as ref_0,
  lateral (select subq_0.c0 as c0
           from (select ref_0.c2 as c0,
                        (select c1 from test) as c1
                 from test as ref_1
                 where (select c3 from test) is NULL) as subq_0
           right join test as ref_2 on (subq_0.c1 = ref_2.c1)) as subq_1;

select subq_1.c0
from test as ref_0,
  lateral (select subq_0.c0 as c0
           from (select ref_0.c2 as c0,
                        (select c1 from test) as c1
                 from test as ref_1
                 where (select c3 from test) is NULL) as subq_0
           right join test as ref_2 on (subq_0.c1 = ref_2.c1)) as subq_1;

* Result of EXPLAIN: succeeded

              QUERY PLAN
---------------------------------------------
 Nested Loop
   InitPlan 1 (returns $0)
     -> Seq Scan on test
   InitPlan 2 (returns $1)
     -> Seq Scan on test test_1
   -> Seq Scan on test ref_0
   -> Nested Loop Left Join
        Join Filter: ($1 = ref_2.c1)
        -> Seq Scan on test ref_2
        -> Materialize
             -> Result
                  One-Time Filter: ($0 IS NULL)
                  -> Seq Scan on test ref_1

* Result of SELECT: failed

ERROR: more than one row returned by a subquery used as an expression

* The error message came from here: ./src/backend/executor/nodeSubplan.c

	if (found &&
		(subLinkType == EXPR_SUBLINK ||
		 subLinkType == MULTIEXPR_SUBLINK ||
		 subLinkType == ROWCOMPARE_SUBLINK))
		ereport(ERROR,
				(errcode(ERRCODE_CARDINALITY_VIOLATION),
				 errmsg("more than one row returned by a subquery used as an expression")));

Thanks,
Tatsuro Yamada
Re: planner bug regarding lateral and subquery?
On Tuesday, March 13, 2018, Tatsuro Yamada wrote:
> Hi Hackers,
>
> I found a bug, maybe.
> If it is able to get an explain command result from below query
> successfully, I think that it means the query is executable.

There is a difference between being plannable, being compilable, and being able to execute to completion, at runtime, on specific data. You've proven the former, but as the error indicates, specific data causes the complete execution of the query to fail.

I can write "select cola / colb from tbl", and as long as there are no zeros in colb the query will complete, but if there is one you get a divide-by-zero runtime error. This is similar.

David J.
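The distinction being drawn here, that a query can plan or compile successfully yet fail on particular data, is easy to model outside SQL. A minimal sketch (hypothetical `run_query` helper, standing in for executing the divide query above):

```python
def run_query(rows):
    """A 'plannable' query: divide cola by colb for each row.  Whether it
    completes depends entirely on the data, not on the query text."""
    return [a / b for a, b in rows]

print(run_query([(10, 2), (9, 3)]))    # [5.0, 3.0] -- completes fine
try:
    run_query([(10, 2), (1, 0)])       # same "query", different data
except ZeroDivisionError as e:
    print("runtime error:", e)         # fails only when a zero appears
```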
Re: Faster inserts with mostly-monotonically increasing values
On Sun, Mar 11, 2018 at 9:18 PM, Claudio Freire wrote:
> On Sun, Mar 11, 2018 at 2:27 AM, Pavan Deolasee wrote:
>> Yes, I will try that next - it seems like a good idea. So the idea would be:
>> check if the block is still the rightmost block and the insertion-key is
>> greater than the first key in the page. If those conditions are satisfied,
>> then we do a regular binary search within the page to find the correct
>> location. This might add an overhead of binary search when keys are strictly
>> ordered and a single client is inserting the data. If that becomes a
>> concern, we might be able to look for that special case too and optimise for
>> it too.
>
> Yeah, pretty much that's the idea. Beware, if the new item doesn't
> fall in the rightmost place, you still need to check for serialization
> conflicts.

So I've been toying with this idea since yesterday and I am quite puzzled with the results. See the attached patch, which compares the insertion key with the last key inserted by this backend, if the cached block is still the rightmost block in the tree. I initially only compared with the first key in the page, but I tried this version because of the strange performance regression, which I still have no answers for.

For a small number of clients, the patched version does better. But as the number of clients goes up, the patched version significantly underperforms master. I roughly counted the number of times the fastpath is taken and I noticed that almost 98% of inserts take the fastpath. I first thought that the "firstkey" location in the page might be becoming a hot-spot for concurrent processes and hence changed that to track the per-backend last offset and compare against that the next time. But that did not help much.
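The fastpath under discussion can be sketched as: cache the rightmost block, and on the next insert, if that block is still the rightmost and the new key belongs on it, skip the root-to-leaf descent and insert directly after a binary search within the page. A toy model of that check (hypothetical names, nothing like the real nbtree code, and ignoring locking and serialization conflicts entirely):

```python
import bisect

class ToyBtreeRightmost:
    """Toy model of the cached-rightmost-page insert fastpath."""
    def __init__(self):
        self.rightmost_page = []   # sorted keys on the rightmost leaf
        self.fastpath_hits = 0
        self.descents = 0

    def insert(self, key):
        page = self.rightmost_page
        if page and key >= page[0]:
            # Fastpath: cached page is still rightmost and the key
            # belongs on it; a binary search within the page finds the
            # offset without a root-to-leaf descent.
            self.fastpath_hits += 1
            bisect.insort(page, key)
        else:
            # Slow path: would descend from the root here.
            self.descents += 1
            bisect.insort(page, key)

bt = ToyBtreeRightmost()
for k in range(1000):              # monotonically increasing keys
    bt.insert(k)
print(bt.fastpath_hits, bt.descents)   # 999 1 -- nearly every insert fastpathed
```

With monotonically increasing keys almost every insert takes the fastpath, which matches the ~98% figure observed above; the model says nothing about the multi-client contention that the measurements below reveal.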
+---------+-------------------------------+--------------------------------+
| clients | Master - Avg load time in sec | Patched - Avg load time in sec |
+---------+-------------------------------+--------------------------------+
|       1 | 500.0725203                   | 347.632079                     |
+---------+-------------------------------+--------------------------------+
|       2 | 308.4580771                   | 263.9120163                    |
+---------+-------------------------------+--------------------------------+
|       4 | 359.4851779                   | 514.7187444                    |
+---------+-------------------------------+--------------------------------+
|       8 | 476.4062592                   | 780.2540855                    |
+---------+-------------------------------+--------------------------------+

The perf data does not show anything interesting either. I mean, there is a reduction in CPU time spent in btree-related code in the patched version, but the overall execution time to insert the same number of records goes up significantly.

Perf (master):
==============
+ 72.59%  1.81%  postgres  postgres  [.] ExecInsert
+ 47.55%  1.27%  postgres  postgres  [.] ExecInsertIndexTuples
+ 44.24%  0.48%  postgres  postgres  [.] btinsert
- 42.40%  0.87%  postgres  postgres  [.] _bt_doinsert
   - 41.52% _bt_doinsert
      + 21.14% _bt_search
      + 12.57% _bt_insertonpg
      + 2.03% _bt_binsrch
        1.60% _bt_mkscankey
        1.20% LWLockAcquire
      + 1.03% _bt_freestack
        0.67% LWLockRelease
        0.57% _bt_check_unique
   + 0.87% _start
+ 26.03%  0.95%  postgres  postgres  [.] ExecScan
+ 21.14%  0.82%  postgres  postgres  [.] _bt_search
+ 20.70%  1.31%  postgres  postgres  [.] ExecInterpExpr
+ 19.05%  1.14%  postgres  postgres  [.] heap_insert
+ 18.84%  1.16%  postgres  postgres  [.] nextval_internal
+ 18.08%  0.84%  postgres  postgres  [.] ReadBufferExtended
+ 17.24%  2.03%  postgres  postgres  [.] ReadBuffer_common
+ 12.57%  0.59%  postgres  postgres  [.] _bt_insertonpg
+ 11.12%  1.63%  postgres  postgres  [.] XLogInsert
+  9.90%  0.10%  postgres  postgres  [.] _bt_relandgetbuf
+  8.97%  1.16%  postgres  postgres  [.] LWLockAcquire
+  8.42%  2.03%  postgres  postgres  [.] XLogInsertRecord
+  7.26%  1.01%  postgres  postgres  [.] _bt_binsrch
+  7.07%  1.20%  postgres  postgres  [.] RelationGetBufferForTuple
+  6.27%  4.92%  postgres  postgres  [.] _bt_compare
+  5.97%  0.63%  postgres  postgres  [.] read_seq_tuple.isra.3
+  5.70%  4.89%  postgres  postgres  [.] hash_search_with_hash_value
+  5.44%  5.44%  postgres  postgres  [.] LWLockAttemptLock

Perf (Patched):
===============
+ 69.33%  2.36%  postgres  postgres  [.] ExecInsert
+ 35.21%  0.64%  postgres  postgres  [.] ExecInsertIndexTuples
- 32.14%  0.45%  postgres
Re: inserts into partitioned table may cause crash
Fujita-san,

Thanks for the updates and sorry I couldn't reply sooner.

On 2018/03/06 21:26, Etsuro Fujita wrote:
> One thing I notice while working on this is this in ExecInsert/CopyFrom,
> which I moved to ExecPrepareTupleRouting as-is for the former:
>
>     /*
>      * If we're capturing transition tuples, we might need to convert from the
>      * partition rowtype to parent rowtype.
>      */
>     if (mtstate->mt_transition_capture != NULL)
>     {
>         if (resultRelInfo->ri_TrigDesc &&
>             (resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
>              resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
>         {
>             /*
>              * If there are any BEFORE or INSTEAD triggers on the partition,
>              * we'll have to be ready to convert their result back to
>              * tuplestore format.
>              */
>             mtstate->mt_transition_capture->tcs_original_insert_tuple = NULL;
>             mtstate->mt_transition_capture->tcs_map =
>                 TupConvMapForLeaf(proute, rootRelInfo, leaf_part_index);
>         }
>
> Do we need to consider INSTEAD triggers here? The partition is either a
> plain table or a foreign table, so I don't think it can have those
> triggers. Am I missing something?

I think you're right. We don't need to consider INSTEAD OF triggers here if we know we're dealing with a partition, which cannot have those. Just to make sure, a word from Thomas would help.

On 2018/03/12 12:25, Etsuro Fujita wrote:
> (2018/03/09 20:18), Etsuro Fujita wrote:
>> Here are updated patches for PG10 and HEAD.
>>
>> Other changes:
>> * Add regression tests based on your test cases shown upthread
>
> I added a little bit more regression tests and revised comments. Please
> find attached an updated patch.

Thanks for adding the test. I was concerned that ExecMaterializeSlot will be called twice now, first in ExecPrepareTupleRouting and then in ExecInsert, instead of only once in ExecInsert with the existing code, but perhaps it doesn't matter much because most of the work would be done in the 1st call anyway.
Sorry for what may be nitpicking that I should've brought up before, but doesn't ExecPrepareTupleRouting do all the work that's needed for routing a tuple, and hence isn't the name a bit misleading? Maybe ExecPerformTupleRouting?

Btw, I noticed that the patches place ExecPrepareTupleRouting (both the declaration and the definition) at different relative locations within nodeModifyTable.c in the HEAD branch vs. in the 10 branch. It might be a good idea to bring them to the same relative location to avoid hassle when back-patching relevant patches in the future. I did that in the attached updated version (v4) of the patch for HEAD, which does not make any other changes. Although the patch for PG 10 is unchanged, I still changed its file name to contain v4.

Regards,
Amit

*** a/src/backend/commands/copy.c
--- b/src/backend/commands/copy.c
***************
*** 2656,2668 **** CopyFrom(CopyState cstate)
  		if (cstate->transition_capture != NULL)
  		{
  			if (resultRelInfo->ri_TrigDesc &&
! 				(resultRelInfo->ri_TrigDesc->trig_insert_before_row ||
! 				 resultRelInfo->ri_TrigDesc->trig_insert_instead_row))
  			{
  				/*
! 				 * If there are any BEFORE or INSTEAD triggers on the
! 				 * partition, we'll have to be ready to convert their
! 				 * result back to tuplestore format.
  				 */
  				cstate->transition_capture->tcs_original_insert_tuple = NULL;
  				cstate->transition_capture->tcs_map =
--- 2656,2667 ----
  		if (cstate->transition_capture != NULL)
  		{
  			if (resultRelInfo->ri_TrigDesc &&
! 				resultRelInfo->ri_TrigDesc->trig_insert_before_row)
  			{
  				/*
! 				 * If there are any BEFORE triggers on the partition,
! 				 * we'll have to be ready to convert their result back to
! 				 * tuplestore format.
  				 */
  				cstate->transition_capture->tcs_original_insert_tuple = NULL;
  				cstate->transition_capture->tcs_map =
***************
*** 2803,2814 **** CopyFrom(CopyState cstate)
  		 * tuples inserted by an INSERT command.
  		 */
  		processed++;
!
Re: [HACKERS] GUC for cleanup indexes threshold.
On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov wrote:
> On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada wrote:
>>
>> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov wrote:
>> > 2) These parameters are reset during btbulkdelete() and set during
>> > btvacuumcleanup().
>>
>> Can't we set these parameters even during btbulkdelete()? By keeping
>> them up to date, we will be able to avoid unnecessary cleanup vacuums
>> even after an index bulk-delete.
>
> We certainly can update cleanup-related parameters during btbulkdelete().
> However, in this case we would update the B-tree meta-page during each
> VACUUM cycle. That may cause some overhead for non append-only
> workloads. I don't think this overhead would be noticeable, because in
> non append-only scenarios VACUUM typically writes much more information.
> But I would like this patch, oriented to append-only workloads, to be
> as harmless as possible for other workloads.

What overhead are you referring to here? I guess the overhead is only calculating the oldest btpo.xact, and I think that would be harmless.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: [HACKERS] path toward faster partition pruning
On 14 March 2018 at 06:54, Jesper Pedersen wrote:
> * "Add more expression types..." -- Are you planning to add more of
> these? Otherwise change the comment

Run-time pruning will deal with Param types here. The comment there might be just to remind us all that the function must remain generic enough so we can support more node types later. I don't particularly need the comment for this patch, and I'm not quite sure if I should be removing it in this patch. My imagination is not stretching far enough today to think of what we could use beyond Const and Param.

I don't feel strongly about the comment either way. It is just there to let you know that there are more up-and-coming things to do in that spot.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Re: Re: Re: [GSOC 18] Performance Farm Project -- Initialization Project
Hi Dave,

I am willing to use React to re-create the front-end application. Since I plan to use separate front-end and back-end development methods, this means that there will be no HTML files in the Django application. The front-end and back-end applications will interact via a RESTful API. I will use React to rewrite some existing HTML files if needed. Before initializing the React application, I want to learn more about your preferences regarding front-end technology.

- About React: Which version of React (15.x or 16) and react-router (2.x or 4) do you tend to use?
- About the package manager: Do you tend to use yarn or npm?
- About the UI library: I guess you might prefer Bootstrap or Material-UI.

At the same time, I hope you can help me understand the functional requirements of this project more clearly. Here are some of my thoughts on PerfFarm:

- I see this comment in the client folder (in line 15 of the "pgperffarm\client\benchmarks\pgbench.py" file):
  '''
  # TODO allow running custom scripts, not just the default
  '''
  Will PerfFarm have so many test items that a test-item search function, or even a test-item sort function, is expected to be provided?
- What value will be used to determine whether a machine's performance is improving or declining?
- After the user logs in to the site, I think it may be necessary to provide a page for the user to browse or manage his own machines. Is there any function that requires users to log in before they can use it?
- When I registered a machine on BuildFarm, I did not receive the confirmation email immediately, but several days later. In PerfFarm, when a user registers a machine, will the administrator review it before sending a confirmation email?
- I see BuildFarm assigning an animal name to each registered machine. Will PerfFarm also have this interesting feature?

My postgresql.org community account is:
- Username: maleicacid
- Email: cs_maleica...@163.com

I hope to get commit permission for the pgperffarm.git repository.
I am willing to continue coding and complete the project on the existing code. Looking forward to your reply.

Best Regards,
Hongyuan Ma (cs_maleica...@163.com)

At 2018-03-13 03:03:08, "Dave Page" wrote:

Hi

On Mon, Mar 12, 2018 at 9:57 AM, Hongyuan Ma wrote:

Hi Dave,

Thank you for your reminder. This is indeed my negligence. In fact, I have browsed the code in the pgperffarm.git repository, including the client folder and the web folder. However, I found that the web folder has some unfinished HTML files and does not contain model class files. And the project used Django 1.8 without importing the Django REST Framework (when I talked to Mark about the PerfFarm project, he told me he insisted on using Django 1.11 and considered using the REST framework). So I mistakenly thought that the code in the web folder had been shelved.

Nope, not at all. It just wasn't updated to meet the latest PG infrastructure requirements yet (basically just the update to Django 1.11). The rest should be fine as-is.

In the newly initialized Django application, I upgraded its version and adjusted the directory layout to make the project structure clearer. I want to use a separate front-end development approach (the front-end application will use vue.js for programming). I hope you can allow me to use it instead of the old one. I am willing to integrate the auth module into this new application as soon as possible.

I would much prefer to see jQuery + React, purely because there are likely more PostgreSQL developers (particularly from the pgAdmin team) who know those technologies. It is important to consider long-term maintenance as well as ease of initial development with any project. Thanks.

Regards,
Hongyuan Ma (cs_maleica...@163.com)

On 2018-03-12 02:26:43, "Dave Page" wrote:

Hi

Maybe I'm missing something (I've been offline a lot recently for unavoidable reasons), but the perf farm project already has a Django backend initialised and configured to work with community auth, on community infrastructure.
https://git.postgresql.org/gitweb/?p=pgperffarm.git;a=summary On Sunday, March 11, 2018, Hongyuan Ma wrote: Hello, mark I initialized a Django project and imported the Django REST Framework. Its github address is: https://github.com/PGPerfFarm/server-code I created some model classes and also wrote scripts in the dbtools folder to import simulation data for development. I am hesitant to use admin or xadmin as the administrator's backend for the site (I prefer to use xadmin). I also initialized the website's front-end application. Here is its address: https://github.com/PGPerfFarm/front-end-code.git I wrote the header component as shown: I hope this effect can enhance the future user experience:). This application uses vue.js and element-ui, but if you insist on using other js libraries or not using js, please let me
Re: Faster inserts with mostly-monotonically increasing values
On Wed, Mar 14, 2018 at 1:36 AM, Pavan Deolasee wrote:
>
> On Sun, Mar 11, 2018 at 9:18 PM, Claudio Freire wrote:
>>
>> On Sun, Mar 11, 2018 at 2:27 AM, Pavan Deolasee wrote:
>> >
>> > Yes, I will try that next - it seems like a good idea. So the idea would be:
>> > check if the block is still the rightmost block and the insertion-key is
>> > greater than the first key in the page. If those conditions are satisfied,
>> > then we do a regular binary search within the page to find the correct
>> > location. This might add an overhead of binary search when keys are
>> > strictly ordered and a single client is inserting the data. If that
>> > becomes a concern, we might be able to look for that special case too
>> > and optimise for it too.
>>
>> Yeah, pretty much that's the idea. Beware, if the new item doesn't
>> fall in the rightmost place, you still need to check for serialization
>> conflicts.
>
> So I've been toying with this idea since yesterday and I am quite puzzled
> by the results. See the attached patch, which compares the insertion key
> with the last key inserted by this backend, if the cached block is still the
> rightmost block in the tree. I initially only compared with the first key in
> the page, but I tried this version because of the strange performance
> regression, for which I still have no answers.
>
> For a small number of clients, the patched version does better. But as the
> number of clients goes up, the patched version significantly underperforms
> master. I roughly counted the number of times the fastpath is taken and
> noticed that almost 98% of inserts take the fastpath. I first thought that
> the "firstkey" location in the page might be becoming a hot-spot for
> concurrent processes and hence changed that to track the per-backend last
> offset and compare against that the next time. But that did not help much.
> +---------+-------------------------------+--------------------------------+
> | clients | Master - Avg load time in sec | Patched - Avg load time in sec |
> +---------+-------------------------------+--------------------------------+
> |       1 | 500.0725203                   | 347.632079                     |
> |       2 | 308.4580771                   | 263.9120163                    |
> |       4 | 359.4851779                   | 514.7187444                    |
> |       8 | 476.4062592                   | 780.2540855                    |
> +---------+-------------------------------+--------------------------------+
>
> The perf data does not show anything interesting either. I mean there is a
> reduction in CPU time spent in btree related code in the patched version,
> but the overall execution time to insert the same number of records goes up
> significantly.
>
> Perf (master):
> ===============
>
> +   72.59%  1.81%  postgres  postgres  [.] ExecInsert
> +   47.55%  1.27%  postgres  postgres  [.] ExecInsertIndexTuples
> +   44.24%  0.48%  postgres  postgres  [.] btinsert
> -   42.40%  0.87%  postgres  postgres  [.] _bt_doinsert
>    - 41.52% _bt_doinsert
>       + 21.14% _bt_search
>       + 12.57% _bt_insertonpg
>       + 2.03% _bt_binsrch
>         1.60% _bt_mkscankey
>         1.20% LWLockAcquire
>       + 1.03% _bt_freestack
>         0.67% LWLockRelease
>         0.57% _bt_check_unique
>    + 0.87% _start
> +   26.03%  0.95%  postgres  postgres  [.] ExecScan
> +   21.14%  0.82%  postgres  postgres  [.] _bt_search
> +   20.70%  1.31%  postgres  postgres  [.] ExecInterpExpr
> +   19.05%  1.14%  postgres  postgres  [.] heap_insert
> +   18.84%  1.16%  postgres  postgres  [.] nextval_internal
> +   18.08%  0.84%  postgres  postgres  [.] ReadBufferExtended
> +   17.24%  2.03%  postgres  postgres  [.] ReadBuffer_common
> +   12.57%  0.59%  postgres  postgres  [.] _bt_insertonpg
> +   11.12%  1.63%  postgres  postgres  [.] XLogInsert
> +    9.90%  0.10%  postgres  postgres  [.] _bt_relandgetbuf
> +    8.97%  1.16%  postgres  postgres  [.] LWLockAcquire
> +    8.42%  2.03%  postgres  postgres  [.] XLogInsertRecord
> +    7.26%  1.01%  postgres  postgres  [.] _bt_binsrch
> +    7.07%  1.20%  postgres  postgres  [.] RelationGetBufferForTuple
> +    6.27%  4.92%  postgres  postgres  [.] _bt_compare
> +    5.97%  0.63%  postgres  postgres  [.] read_seq_tuple.isra.3
> +    5.70%  4.89%  postgres  postgres  [.] hash_search_with_hash_value
> +    5.44%  5.44%  postgres  postgres  [.]
Re: Faster inserts with mostly-monotonically increasing values
On Wed, Mar 14, 2018 at 10:58 AM, Claudio Freire wrote:
>
> I'm thinking there could be contention on some lock somewhere.
>
> Can you attach the benchmark script you're using so I can try to reproduce it?

I am using a very ad-hoc script.. But here it is.. It assumes a presence of a branch named "btree_rightmost" with the patched code. You will need to make necessary adjustments of course.

Thanks,
Pavan

--
Pavan Deolasee http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

start_options="-c shared_buffers=16GB -c max_wal_size=64GB -c min_wal_size=16GB -c checkpoint_timeout=60min -c bgwriter_delay=100 -c bgwriter_lru_maxpages=100 -c bgwriter_lru_multiplier=3 -c bgwriter_flush_after=256 -c checkpoint_completion_target=0.9"
PGDATA=/data/pgsql

echo "INSERT INTO testtab(b) SELECT generate_series(1,10);" > s1.sql

for b in master btree_rightmost; do
    git checkout $b > /dev/null 2>&1
    make -s clean > /dev/null 2>&1
    ./configure --prefix=$HOME/pg-install/$b
    make -s -j8 install > /dev/null 2>&1
    export PATH=$HOME/pg-install/$b/bin:$PATH

    for c in 1 2 4 8; do
        pg_ctl -D $PGDATA -w stop
        rm -rf $PGDATA
        initdb -D $PGDATA
        pg_ctl -D $PGDATA -o "$start_options" -w start -l logfile.$b
        psql -c "CREATE TABLE testtab (a bigserial UNIQUE, b bigint);" postgres
        echo "$b $c" >> results.txt
        num_txns_per_client=$((1024 / $c))
        time pgbench -n -l -c $c -j 1 -t $num_txns_per_client -f s1.sql postgres >> results.txt
    done
done
Re: add queryEnv to ExplainOneQuery_hook
Michael Paquier writes:
> On Wed, Mar 14, 2018 at 05:36:26PM +1300, Thomas Munro wrote:
>> Hmm. I suppose we could have invented a new extended hook with a
>> different name and back-patched it so that PG10 would support both.
>> Then binary compatibility with existing compiled extensions wouldn't
>> be affected AFAICS, but you could use the new extended hook in (say)
>> 10.4 or higher. Then for PG11 (or later) we could remove the old hook
>> and just keep the new one. I suppose that option is still technically
>> open to us, though I'm not sure of the committers' appetite for messing
>> with back branches like that.

> The interactions between both hooks would not be difficult to define: if
> the original hook is not defined, just do not trigger the second. Still
> that's too late for v10, so I would rather let it go. New features are
> not backpatched.

Yeah. There would be no good way for a v10 extension to know whether the additional hook is available --- it would have to know that at compile time, and it can't --- so I don't see that this is practical.

Ideally we'd have noticed the problem before v10 got out ... so my own takeaway here is that this is a reminder to extension authors that they ought to test their stuff during the beta period.

regards, tom lane
Re: Ambigous Plan - Larger Table on Hash Side
On Tue, Mar 13, 2018 at 4:32 PM, Narendra Pradeep U U wrote:
> Hi,
> Thanks everyone for your suggestions. I would like to add the explain
> analyze of both plans so that we can have a broader picture.
>
> I have a work_mem of 1000 MB.
>
> The plan which we get regularly, with the table being analyzed:
>
> tpch=# explain analyze select b from tab2 left join tab1 on a = b;
>                                  QUERY PLAN
> ----------------------------------------------------------------------
>  Hash Left Join (cost=945515.68..1071064.34 rows=78264 width=4)
>    (actual time=9439.410..20445.620 rows=78264 loops=1)
>    Hash Cond: (tab2.b = tab1.a)
>    -> Seq Scan on tab2 (cost=0.00..1129.64 rows=78264 width=4)
>       (actual time=0.006..5.116 rows=78264 loops=1)
>    -> Hash (cost=442374.30..442374.30 rows=30667630 width=4)
>       (actual time=9133.593..9133.593 rows=30667722 loops=1)
>          Buckets: 33554432 Batches: 2 Memory Usage: 801126kB
>          -> Seq Scan on tab1 (cost=0.00..442374.30 rows=30667630 width=4)
>             (actual time=0.030..3584.652 rows=30667722 loops=1)
>  Planning time: 0.055 ms
>  Execution time: 20472.603 ms
> (8 rows)
>
> I reproduced the other plan by not analyzing the smaller table.
>
> tpch=# explain analyze select b from tab2 left join tab1 on a = b;
>                                  QUERY PLAN
> ----------------------------------------------------------------------
>  Hash Right Join (cost=2102.88..905274.97 rows=78039 width=4)
>    (actual time=15.331..7590.406 rows=78264 loops=1)
>    Hash Cond: (tab1.a = tab2.b)
>    -> Seq Scan on tab1 (cost=0.00..442375.48 rows=30667748 width=4)
>       (actual time=0.046..2697.480 rows=30667722 loops=1)
>    -> Hash (cost=1127.39..1127.39 rows=78039 width=4)
>       (actual time=15.133..15.133 rows=78264 loops=1)
>          Buckets: 131072 Batches: 1 Memory Usage: 3776kB
>          -> Seq Scan on tab2 (cost=0.00..1127.39 rows=78039 width=4)
>             (actual time=0.009..5.516 rows=78264 loops=1)
>  Planning time: 0.053 ms
>  Execution time: 7592.688 ms
> (8 rows)

I am surprised to see the estimates so close to the actual values even without analysing the small table.

> The actual plan seems to be slower.
> The smaller table (tab2) has exactly each row duplicated 8 times and all
> the rows in the larger table (tab1) are distinct. What may be the exact
> reason, and can we fix this?

After analysing the small table, the first plan is chosen as the cheapest. This means that the plan with the smaller table being hashed has a higher estimated cost than the plan with the larger table being hashed. We need to examine that costing to see what went wrong.

--
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company
Re: Changing the autovacuum launcher scheduling; oldest table first algorithm
On Tue, Mar 6, 2018 at 11:27 PM, Alvaro Herrera wrote:
> Hello
>
> I haven't read your respective patches yet, but both these threads
> brought to memory a patch I proposed a few years ago that I never
> completed:
>
> https://www.postgresql.org/message-id/flat/20130124215715.GE4528%40alvh.no-ip.org

Thank you for sharing the thread.

> In that thread I posted a patch to implement a prioritisation scheme for
> autovacuum, based on an equation which was still under discussion when
> I abandoned it. Chris Browne proposed a crazy equation to mix in both
> XID age and fraction of dead tuples; probably that idea is worth
> studying further. I tried to implement that in my patch but I probably
> didn't do it correctly (because, as I recall, it failed to work as
> expected). Nowadays I think we would also consider the multixact freeze
> age, too.
>
> Maybe that's worth giving a quick look in case some of the ideas there
> are useful for the patches now being proposed.

Yeah, that's definitely useful for the patches. I'll look at this and discuss it here.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
Re: planner bug regarding lateral and subquery?
Greetings,

* Tatsuro Yamada (yamada.tats...@lab.ntt.co.jp) wrote:
> I found a bug, maybe.

I don't think so...

> * Result of Select: failed
>
> # select
> subq_1.c0
> from
> test as ref_0,
> lateral (select subq_0.c0 as c0
>          from
>            (select ref_0.c2 as c0,
>                    (select c1 from test) as c1 from test as ref_1
>             where (select c3 from test) is NULL) as subq_0
>          right join test as ref_2
>          on (subq_0.c1 = ref_2.c1)) as subq_1;
>
> ERROR: more than one row returned by a subquery used as an expression

You don't need LATERAL or anything complicated to reach that error; simply do:

=*> select * from test where (select c1 from test) is null;
ERROR: more than one row returned by a subquery used as an expression

The problem there is that the WHERE clause is trying to evaluate an expression, which is "(select c1 from test) is null", and you aren't allowed to have multiple rows returned from that subquery (otherwise, how would we know which row to compare in the expression..?).

If you're actually intending to refer to the 'c3' column from test through the lateral join, you would just refer to it as 'ref_0.c3', as you do in another part of that query.

Thanks!

Stephen
Re: remove pg_class.relhaspkey
On Sat, Mar 10, 2018 at 01:52:56PM +0100, Tomas Vondra wrote:
> I agree with this sentiment - I don't think those flags are particularly
> helpful for client applications, and would vote +1 for removal.

OK, so I can see that we are moving to a consensus here.

> For the other flags we would probably need to test what impact it would
> have (e.g. table with no indexes, many indexes on other tables, and
> something calling get_relation_info a lot). But this patch proposes to
> remove only relhaspkey.

Yes, you are right here. Peter, would you do that within this commit fest or not? As we are half-way through the commit fest, we could also cut the apple in half and just remove relhaspkey for now, as that's a no-brainer. So I would suggest to just do the latter and consider this patch as done. Attached is a rebased patch; there were some conflicts in pg_class.h, by the way.

--
Michael

From 1601b84f9ef2e8d0467b109c0b1d3df435edb59a Mon Sep 17 00:00:00 2001
From: Michael Paquier
Date: Wed, 14 Mar 2018 14:46:43 +0900
Subject: [PATCH] Remove pg_class.relhaspkey

It's not used for anything internally, and it can't be relied on for external uses, so it can just be removed.

Patch by Peter Eisentraut.
--- doc/src/sgml/catalogs.sgml | 9 - src/backend/catalog/heap.c | 1 - src/backend/catalog/index.c | 32 ++- src/backend/commands/vacuum.c | 10 -- src/backend/rewrite/rewriteDefine.c | 1 - src/include/catalog/pg_class.h | 38 ++--- 6 files changed, 20 insertions(+), 71 deletions(-) diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml index a0e6d7062b..30e6741305 100644 --- a/doc/src/sgml/catalogs.sgml +++ b/doc/src/sgml/catalogs.sgml @@ -1848,15 +1848,6 @@ SCRAM-SHA-256$iteration count: - - relhaspkey - bool - - - True if the table has (or once had) a primary key - - - relhasrules bool diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c index cf36ce4add..3d80ff9e5b 100644 --- a/src/backend/catalog/heap.c +++ b/src/backend/catalog/heap.c @@ -798,7 +798,6 @@ InsertPgClassTuple(Relation pg_class_desc, values[Anum_pg_class_relnatts - 1] = Int16GetDatum(rd_rel->relnatts); values[Anum_pg_class_relchecks - 1] = Int16GetDatum(rd_rel->relchecks); values[Anum_pg_class_relhasoids - 1] = BoolGetDatum(rd_rel->relhasoids); - values[Anum_pg_class_relhaspkey - 1] = BoolGetDatum(rd_rel->relhaspkey); values[Anum_pg_class_relhasrules - 1] = BoolGetDatum(rd_rel->relhasrules); values[Anum_pg_class_relhastriggers - 1] = BoolGetDatum(rd_rel->relhastriggers); values[Anum_pg_class_relrowsecurity - 1] = BoolGetDatum(rd_rel->relrowsecurity); diff --git a/src/backend/catalog/index.c b/src/backend/catalog/index.c index 431bc31969..9e2dd0e729 100644 --- a/src/backend/catalog/index.c +++ b/src/backend/catalog/index.c @@ -125,7 +125,7 @@ static void UpdateIndexRelation(Oid indexoid, Oid heapoid, bool isvalid, bool isready); static void index_update_stats(Relation rel, - bool hasindex, bool isprimary, + bool hasindex, double reltuples); static void IndexCheckExclusion(Relation heapRelation, Relation indexRelation, @@ -1162,7 +1162,6 @@ index_create(Relation heapRelation, */ index_update_stats(heapRelation, true, - isprimary, -1.0); /* Make the above update visible */ 
CommandCounterIncrement(); @@ -1364,21 +1363,6 @@ index_constraint_create(Relation heapRelation, InvalidOid, conOid, indexRelationId, true); } - /* - * If needed, mark the table as having a primary key. We assume it can't - * have been so marked already, so no need to clear the flag in the other - * case. - * - * Note: this might better be done by callers. We do it here to avoid - * exposing index_update_stats() globally, but that wouldn't be necessary - * if relhaspkey went away. - */ - if (mark_as_primary) - index_update_stats(heapRelation, - true, - true, - -1.0); - /* * If needed, mark the index as primary and/or deferred in pg_index. * @@ -2041,7 +2025,6 @@ FormIndexDatum(IndexInfo *indexInfo, * to ensure we can do all the necessary work in just one update. * * hasindex: set relhasindex to this value - * isprimary: if true, set relhaspkey true; else no change * reltuples: if >= 0, set reltuples to this value; else no change * * If reltuples >= 0, relpages and relallvisible are also updated (using @@ -2058,7 +2041,6 @@ FormIndexDatum(IndexInfo *indexInfo, static void index_update_stats(Relation rel, bool hasindex, - bool isprimary, double reltuples) { Oid relid = RelationGetRelid(rel); @@ -2088,7 +2070,7 @@ index_update_stats(Relation rel, * It is safe to use a non-transactional update even though our * transaction could still fail
Re: Fixes for missing schema qualifications
On Tue, Mar 13, 2018 at 07:30:23PM -0700, David G. Johnston wrote:
> On Tue, Mar 13, 2018 at 6:50 PM, Michael Paquier wrote:
>> To simplify users' lives, we
>> could also recommend that users just issue an ALTER FUNCTION SET
>> search_path to fix the problem for all functions; that's easier to
>> digest.
>
> I'm unclear as to what scope you are suggesting the above advice (and
> option #1) apply to. All pg_catalog/information_schema functions, or all
> functions including those created by users?

I am suggesting that in order to keep the release notes of the next minor versions simple. We would still patch information_schema.sql internally with the changes based on operator(pg_catalog.foo) for all functions, so that new deployments get the right call.

--
Michael
Re: add queryEnv to ExplainOneQuery_hook
On Wed, Mar 14, 2018 at 2:57 PM, Jim Finnerty wrote:
> Passing NULL in place of queryEnv in PG10 causes a failure in the
> installcheck tests portals and plpgsql.
>
> For PG10, if you want both an ExplainOneQuery hook and a clean run of
> installcheck, is there a better workaround than to (a) pass NULL in place
> of queryEnv, and (b) comment out the portals and plpgsql tests?

Hi Jim,

I can't think of a good way right now. It's unfortunate that we couldn't back-patch 4d41b2e0 because 10 was out the door; but perhaps you can?

Hmm. I suppose we could have invented a new extended hook with a different name and back-patched it so that PG10 would support both. Then binary compatibility with existing compiled extensions wouldn't be affected AFAICS, but you could use the new extended hook in (say) 10.4 or higher. Then for PG11 (or later) we could remove the old hook and just keep the new one. I suppose that option is still technically open to us, though I'm not sure of the committers' appetite for messing with back branches like that.

--
Thomas Munro http://www.enterprisedb.com
Re: planner bug regarding lateral and subquery?
Hi Stephen,

On 2018/03/14 12:36, Stephen Frost wrote:
> Greetings,
>
> * Tatsuro Yamada (yamada.tats...@lab.ntt.co.jp) wrote:
>> I found a bug, maybe.
>
> I don't think so...
>
>> * Result of Select: failed
>>
>> # select
>> subq_1.c0
>> from
>> test as ref_0,
>> lateral (select subq_0.c0 as c0
>>          from
>>            (select ref_0.c2 as c0,
>>                    (select c1 from test) as c1 from test as ref_1
>>             where (select c3 from test) is NULL) as subq_0
>>          right join test as ref_2
>>          on (subq_0.c1 = ref_2.c1)) as subq_1;
>>
>> ERROR: more than one row returned by a subquery used as an expression
>
> You don't need LATERAL or anything complicated to reach that error; simply do:
>
> =*> select * from test where (select c1 from test) is null;
> ERROR: more than one row returned by a subquery used as an expression
>
> The problem there is that the WHERE clause is trying to evaluate an
> expression, which is "(select c1 from test) is null", and you aren't
> allowed to have multiple rows returned from that subquery (otherwise, how
> would we know which row to compare in the expression..?).
>
> If you're actually intending to refer to the 'c3' column from test through
> the lateral join, you would just refer to it as 'ref_0.c3', as you do in
> another part of that query.

Thanks for your reply. The query is not useful for me; it's just a test query for the planner, generated by sqlsmith. :)

My question is: was it possible to handle the error earlier than the executor phase? I expected it to be checked in the parsing or planning phase.

Thanks,
Tatsuro Yamada
Re: PATCH: Configurable file mode mask
On Tue, Mar 13, 2018 at 01:28:17PM -0400, Tom Lane wrote:
> David Steele writes:
>> On 3/12/18 3:28 AM, Michael Paquier wrote:
>>> In pg_rewind and pg_resetwal, isn't that also a portion which is not
>>> necessary without the group access feature?
>
>> These seem like a good idea to me with or without patch 03. Some of our
>> front-end tools (initdb, pg_upgrade) were setting umask and others
>> weren't. I think it's more consistent (and safer) if they all do, at
>> least if they are writing into PGDATA.
>
> +1 ... see a926eb84e for an example of how easy it is to screw up if
> the process's overall umask is permissive.

Okay. A suggestion that I have here would be to split those extra calls into a separate patch. That's a useful self-contained improvement.

--
Michael
Re: PATCH: Configurable file mode mask
On Tue, Mar 13, 2018 at 12:19:07PM -0400, David Steele wrote:
> I'll attach new patches in a reply to [1] once I have made the changes
> Tom requested.

Cool, thanks for your patience. Looking forward to seeing those. I'll spend time on 0003 at the same time. Let's also put the additional umask calls for pg_rewind and pg_resetwal into a separate patch.

--
Michael