Re: parallel append vs. simple UNION ALL

2018-03-07 Thread Rajkumar Raghuwanshi
On Thu, Mar 8, 2018 at 12:27 AM, Robert Haas  wrote:

> New patches attached, fixing all 3 of the issues you reported:
>

Thanks. The new patches applied cleanly on head and fix all reported issues.

Thanks & Regards,
Rajkumar Raghuwanshi
QMG, EnterpriseDB Corporation


Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-07 Thread Ashutosh Bapat
On Wed, Mar 7, 2018 at 8:07 PM, Jeevan Chalke
 wrote:
Here are some more review comments, especially on the
try_partitionwise_grouping() function. BTW, the name of that function
doesn't go in sync with enable_partitionwise_aggregation (which, in turn,
is in sync with the enable_fooagg GUCs), but it does go in sync with
create_grouping_paths(). It looks like we have already conflated
aggregation and grouping, e.g. enable_hashagg may affect path creation
when there is no aggregation involved, i.e. only grouping is involved,
but create_grouping_paths() will create paths even when there is no
grouping involved. Generally the function names use grouping to mean both
aggregation and grouping, whereas the GUCs use aggregation to mean
both of those. So, the naming in this patch looks consistent with the
current naming conventions.

+grouped_rel->part_scheme = input_rel->part_scheme;
+grouped_rel->nparts = nparts;
+grouped_rel->boundinfo = input_rel->boundinfo;
+grouped_rel->part_rels = part_rels;

You need to set the part_exprs which will provide the partition keys for this
partitioned relation. I think we should include all the part_exprs of
input_rel which are part of the GROUP BY clause. Since any other expressions in
part_exprs are not part of the GROUP BY clause, they cannot appear in the
targetlist without an aggregate on top, so they can't be part of the partition
keys of the grouped relation.
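
To illustrate that with a hypothetical table (names made up for the example):
for a relation partitioned by (a, b), only the partition keys that appear in
the GROUP BY can survive as partition keys of the grouped rel; the others can
only appear under an aggregate:

CREATE TABLE meas (a int, b int, v numeric) PARTITION BY RANGE (a, b);

-- a is in the GROUP BY, so it can remain a partition key of the grouped rel
SELECT a, sum(v) FROM meas GROUP BY a;

-- b is a partition key but is not grouped on: it can only show up under an
-- aggregate such as max(b), so it cannot key the grouped relation
SELECT a, max(b), sum(v) FROM meas GROUP BY a;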

In create_grouping_paths() we fetch both partial as well as fully grouped rel
for given input relation. But in case of partial aggregation, we don't need
fully grouped rel since we are not computing full aggregates for the children.
Since fetch_upper_rel() creates a relation when one doesn't exist, we are
unnecessarily creating fully grouped rels in this case. For thousands of
partitions that's a lot of memory wasted.

I see a similar issue with create_grouping_paths() when we are computing only
full aggregates (either because partial aggregation is not possible or because
parallelism is not possible). In that case, we unconditionally create partially
grouped rels. That too would waste a lot of memory.

I think that partially_grouped_rel, when required, is partitioned irrespective
of whether we do full aggregation per partition or not. So, we should have its
part_rels and other partitioning properties set. I agree that right now we
won't use this information anywhere. It may be useful in case we devise a way
to use partially_grouped_rel directly, without using grouped_rel, for planning
beyond grouping, which seems unlikely.

+
+/*
+ * Parallel aggregation requires partial target, so compute it here
+ * and translate all vars. For partial aggregation, we need it
+ * anyways.
+ */
+partial_target = make_partial_grouping_target(root, target,
+  havingQual);

Don't we have this available in partially_grouped_rel?

That shows one asymmetry that Robert's refactoring has introduced. We don't set
reltarget of grouped_rel but set reltarget of partially_grouped_rel. If
reltarget of grouped_rel changes across paths (the reason why we have not set
it in grouped_rel), shouldn't reltarget of partially grouped paths change
accordingly?

+
+/*
+ * group_by_has_partkey
+ *
+ * Returns true, if all the partition keys of the given relation are part of
+ * the GROUP BY clauses, false otherwise.
+ */
+static bool
+group_by_has_partkey(RelOptInfo *input_rel, PathTarget *target,
+ List *groupClause)

We could modify this function to return the list of part_exprs which are part
of the GROUP BY clause and use that as the partition keys of the grouped_rel if
required. If the GROUP BY doesn't cover all the partition keys, the function
would return an empty list.

Right now, in case of full aggregation, partially_grouped_rel is populated with
the partial paths created by adding partial aggregation to the partial paths of
input relation. But we are not trying to create partial paths by (parallel)
appending the (non)partial paths from the child partially_grouped_rel. Have we
thought about that? Would such paths have different shapes from the ones that
we create now and will they be better?

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 22:16 GMT+01:00 David Fetter :

> On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote:
> > 2018-03-07 21:31 GMT+01:00 Daniel Verite :
> >
> > > David Fetter wrote:
> > >
> > > > We have some inconsistency here in that fewer table formats are
> > > > supported, but I think asciidoc, etc., do this correctly via
> > > > invocations like:
> > > >
> > > >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
> > >
> > > -A is equivalent to -P format=unaligned, so in the above
> > > invocation, it cancels the effect of -P format=asciidoc.
> > > Anyway -P format=name on the command line
> > > is the same as "\pset format name" as a
> > > metacommand, so it works for all formats.
> > >
> > > Some formats (unaligned, html)  have corresponding
> > > command-line options (-A, -H), and others don't.
> > > In this patch, -C is used so that csv would be in the
> > > category of formats that can be switched on with the simpler
> > > invocation on the command line.
> > > If we don't like that, we can leave out -C for future use
> > > and let users write -P format=csv.
> > > That's not the best choice from my POV though, as csv
> > > is a primary choice to export tabular data.
> > >
> >
> > -C could be wanted for certificates or something similar. I like csv, but I
> > am not sure it is important enough to get a short option (the list of free
> > letters will only get shorter)
>
> +1 for not using up a single-letter option for this.
>

Is there some rule that long options alone are disallowed? As this
software becomes more mature, we will not be able to find "intelligent" short
options for lots of tasks.

Regards

Pavel


>
> Best,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: INOUT parameters in procedures

2018-03-07 Thread Pavel Stehule
Hi

2018-03-08 1:53 GMT+01:00 Peter Eisentraut :

> On 3/6/18 04:22, Pavel Stehule wrote:
> > why just OUT variables are disallowed?
> >
> > The oracle initializes these values to NULL - we can do same?
>
> The problem is function call resolution.  If we see a call like
>
> CALL foo(a, b, c);
>
> the this could be foo() with zero input and three output parameters, or
> with one input parameter and two output parameters, etc.  We have no
> code to deal with that right now.
>

That looks like a flaw in this concept. The rules for overloading
procedures should be modified so that this collision cannot
arise.

When I use a procedure from PL/pgSQL, it is clear: I place variables at the
OUT positions. But when I call a procedure from the top level, I'll pass fake
parameters to get some result.

CREATE OR REPLACE PROCEDURE proc(IN a, OUT x, OUT y)
AS $$
BEGIN
  x := a * 10;
  y := a + 10;
END;
$$ LANGUAGE plpgsql;

CALL proc(10) -- makes sense

but because OUT-only variables are not possible, the definition must
be changed to CREATE OR REPLACE PROCEDURE proc(IN a, INOUT x, INOUT y)

and CALL proc(10, NULL, NULL) -- looks a little bit scary

I understand that this is not an easy solution (and it can be a topic for other
releases), but I think it is solvable - it needs a deeper change in the
part where a routine is selected by its signature. Right now, that algorithm
doesn't take OUT params into account.

This enhancement can be interesting for some purposes (and again it can help
with migration from Oracle - although these techniques are usually used
inside system libraries):

a) taking more info from proc when it is required

PROCEDURE foo(a int);
PROCEDURE foo(a int, OUT detail text)

b) possible to directly specify expected result type

PROCEDURE from_json(a json, OUT int);
PROCEDURE from_json(a json, OUT date);
PROCEDURE from_json(a json, OUT text);

It is clear that in environments where variables are not available, these
procedures cannot be called due to the possible ambiguity.
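
To spell out the ambiguity with the overloads from (a) above - a purely
hypothetical sketch, assuming OUT-only parameters in procedures were accepted
as discussed:

CREATE PROCEDURE foo(a int)
LANGUAGE plpgsql AS $$ BEGIN NULL; END; $$;

CREATE PROCEDURE foo(a int, OUT detail text)
LANGUAGE plpgsql AS $$ BEGIN detail := 'more info'; END; $$;

-- at the top level the OUT argument is not written, so this call
-- matches both variants and cannot be resolved
CALL foo(1);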

This point can be closed now, I accept technical limits.




>
> > Minimally, this message is not too friendly; there should be a hint - "only
> > INOUT is supported" - but better to support OUT too - from the top level, OUT
> > variables should not be passed; from PL they should be required.
>
> Added a hint.
>

ok


>
> > I wrote recursive procedure. The call finished by exception. Why?
>
> Fixed. (memory context issue)
>

tested, it is ok now


>
> I added your example as a test case.
>
> > This issue can be detected in compile time, maybe?
> >
> > postgres=# create or replace procedure p(x int,inout a int, inout b
> numeric)
> > as $$
> > begin raise notice 'xxx % %', a, b;if (x > 1) then
> >   a := x / 10;
> >   b := x / 2; call p(b::int, a, 10); <--- can be detected in compile
> time?
> > end if;
> > end;
> > $$ language plpgsql;
>
> Function resolution doesn't happen at compile time.  That would require
> significant work in PL/pgSQL (possible perhaps, but major work).  Right
> now, we do parse analysis at first execution.
>

ok, understand

looks well

all tests passed,
code is well commented,
there are tests

   if (argmodes && (argmodes[i] == PROARGMODE_INOUT ||
argmodes[i] == PROARGMODE_OUT))
+   {
+   Param  *param;

Because PROARGMODE_OUT is disallowed, this check is a little bit messy.
Please add some comment.

Regards

Pavel




>
> --
> Peter Eisentraut  http://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Pavan Deolasee
On Thu, Mar 8, 2018 at 12:31 PM, Amit Kapila 
wrote:

> On Thu, Mar 8, 2018 at 11:04 AM, Pavan Deolasee
>  wrote:
> >
> > On Tue, Feb 13, 2018 at 12:41 PM, amul sul  wrote:
> >>
> >> Thanks for the confirmation, updated patch attached.
> >>
> >
> > I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch
> does not
> > do anything to deal with the fact that t_ctid may no longer point to
> itself
> > to mark end of the chain. I just can't see how that would work.
> >
>
> I think it is not that patch doesn't care about the end of the chain.
>  For example, ctid pointing to itself is used to ensure that for
> deleted rows, nothing more needs to be done like below check in the
> ExecUpdate/ExecDelete code path.
>

Yeah, but it only looks for places where it needs to detect deleted tuples
and thus wants to throw an error. I am worried about other places where it
is assumed that the ctid points to a valid looking tid, self or otherwise.
I see no such places being either updated or commented.

Now maybe there is no danger because of other protections in place, but it
looks hazardous.


>
> I have not tested it, but it seems this could be problematic; however, I feel
> we could deal with such cases by checking for an invalid block id in the
> above if check.  Another such case is in EvalPlanQualFetch.
>
>
Right.


>
> > What happens if a partition key update deletes a row, but the operation
> is
> > aborted? Do we need any special handling for that case?
> >
>
> If the transaction is aborted then future updates would update the
> ctid to a new row, do you see any problem with it?
>

I don't know. Maybe there is none. But it needs to be explained why it's not
a problem.


>
> > I am actually worried that we're tinkering with ip_blkid to handle one
> > corner case of detecting partition key update. This is going to change
> > on-disk format and probably need more careful attention. Are we certain
> that
> > we would never require update-chain following when partition keys are
> > updated?
> >
>
> I think we should never need update-chain following when the row is
> moved from one partition to another partition, otherwise, we don't
> change anything on the tuple.
>

I am not sure I follow. I understand that it's probably a tough problem to
follow the update chain from one partition to another. But why do you think we
would never need that? What if someone wants to improve on the restriction
this patch is imposing and actually implement partition key UPDATEs the way
we do for regular tables, i.e. instead of throwing an error, we actually
update/delete the row in the new partition?


>
> > If so, can we think about some other mechanism which actually even
> > leaves behind ? I am not saying we should do
> that,
> > but it warrants a thought.
> >
>
> Oh, this would be a much bigger disk-format change and need much more
> thought; where will we store the new partition information?
>

Yeah, but the disk format will probably change just once. Or maybe
this can be done locally to a partitioned table without requiring any disk
format changes? Like adding a nullable hidden column in each partition to
store the forward pointer?

Thanks,
Pavan


Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Amit Khandekar
On 8 March 2018 at 12:34, Amit Kapila  wrote:
> On Thu, Mar 8, 2018 at 11:57 AM, Amit Khandekar  
> wrote:
>> On 8 March 2018 at 09:15, Pavan Deolasee  wrote:
>>> For example, with your patches applied:
>>>
>>> CREATE TABLE pa_target (key integer, val text)
>>> PARTITION BY LIST (key);
>>> CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1);
>>> CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2);
>>> INSERT INTO pa_target VALUES (1, 'initial1');
>>>
>>> session1:
>>> BEGIN;
>>> UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1;
>>> UPDATE 1
>>> postgres=# SELECT * FROM pa_target ;
>>>  key | val
>>> -+-
>>>1 | initial1 updated by update1
>>> (1 row)
>>>
>>> session2:
>>> UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE
>>> key = 1
>>> 
>>>
>>> session1:
>>> postgres=# COMMIT;
>>> COMMIT
>>>
>>> 
>>>
>>> postgres=# SELECT * FROM pa_target ;
>>>  key | val
>>> -+-
>>>2 | initial1 updated by update2
>>> (1 row)
>>>
>>> Ouch. The committed updates by session1 are overwritten by session2. This
>>> clearly violates the rules that rest of the system obeys and is not
>>> acceptable IMHO.
>>>
>>> Clearly, ExecUpdate() while moving rows between partitions is missing out on
>>> re-constructing the to-be-updated tuple, based on the latest tuple in the
>>> update chain. Instead, it's simply deleting the latest tuple and inserting a
>>> new tuple in the new partition based on the old tuple. That's simply wrong.
>>
>> You are right. This needs to be fixed. This is a different issue than
>> the particular one that is being worked upon in this thread, and both
>> these issues have different fixes.
>>
>
> I also think that this is a bug in the original patch and won't be
> directly related to the patch being discussed.

Yes. Will submit a patch for this in a separate thread.





-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: PATCH: psql tab completion for SELECT

2018-03-07 Thread Edmund Horner
Some updates on patch status:
  - "version-dependent-tab-completion-1.patch" by Tom Lane was committed in 
722408bcd.
  - "psql-tab-completion-savepoint-v1.patch" by Edmund Horner is probably not 
needed.
  - "psql-select-tab-completion-v6.patch" (the latest) is still under 
development/review.

Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Amit Kapila
On Thu, Mar 8, 2018 at 11:57 AM, Amit Khandekar  wrote:
> On 8 March 2018 at 09:15, Pavan Deolasee  wrote:
>> For example, with your patches applied:
>>
>> CREATE TABLE pa_target (key integer, val text)
>> PARTITION BY LIST (key);
>> CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1);
>> CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2);
>> INSERT INTO pa_target VALUES (1, 'initial1');
>>
>> session1:
>> BEGIN;
>> UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1;
>> UPDATE 1
>> postgres=# SELECT * FROM pa_target ;
>>  key | val
>> -+-
>>1 | initial1 updated by update1
>> (1 row)
>>
>> session2:
>> UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE
>> key = 1
>> 
>>
>> session1:
>> postgres=# COMMIT;
>> COMMIT
>>
>> 
>>
>> postgres=# SELECT * FROM pa_target ;
>>  key | val
>> -+-
>>2 | initial1 updated by update2
>> (1 row)
>>
>> Ouch. The committed updates by session1 are overwritten by session2. This
>> clearly violates the rules that rest of the system obeys and is not
>> acceptable IMHO.
>>
>> Clearly, ExecUpdate() while moving rows between partitions is missing out on
>> re-constructing the to-be-updated tuple, based on the latest tuple in the
>> update chain. Instead, it's simply deleting the latest tuple and inserting a
>> new tuple in the new partition based on the old tuple. That's simply wrong.
>
> You are right. This needs to be fixed. This is a different issue than
> the particular one that is being worked upon in this thread, and both
> these issues have different fixes.
>

I also think that this is a bug in the original patch and won't be
directly related to the patch being discussed.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Craig Ringer
On 8 March 2018 at 12:34, Tom Lane  wrote:

> Craig Ringer  writes:
> > As I understand it, because we allow multiple Pg instances on a system,
> we
> > identify the small sysv shmem segment we use by the postmaster's pid. If
> > you remove the DirLockFile (postmaster.pid) you remove the interlock
> > against starting a new postmaster. It'll think it's a new independent
> > instance on the same host, make a new shmem segment and go merrily on its
> > way mangling data horribly.
>
> Yeah.  If we realized that the old shmem segment was associated with this
> data directory, we could check for processes still attached to it ... but
> the lock file is exactly where that association is kept.
>
> > It'd be nice if the OS offered us some support here. Something like
> opening
> > a lockfile in exclusive lock mode, then inheriting the FD and lock on all
> > children, with each child inheriting the lock. So the exclusive lock
> > wouldn't get released until all FDs associated with it are released. But
> > AFAIK nothing like that is present, let alone portable.
>
> I've wondered about that too.  An inheritable lock on the data directory
> itself would be ideal, but that doesn't exist anywhere AFAIK.  We could
> imagine taking a BSD-style-flock(2) lock on some lock file, on systems that
> have that; but not all do, so it couldn't be our only protection.  Another
> problem is that since it'd have to be an ordinary file, there'd still be a
> risk of cowboy DBAs removing the lock file.  Maybe we could use pg_control
> as the lock file?
>

"The error mentioned that there was another active database using this
pg_control file, so I deleted it, and now my database won't start."

Like our discussion about pg_resetxlog, there's a limit to how much
operator error we can guard against. More importantly we'd have to be
_very_ sure it was correct or our attempts to protect the user would risk
creating a major outage.

That said, flock(2) with LOCK_NB seems to have just the semantics we want:

"
   A single file may not simultaneously have both shared and exclusive
locks.

   Locks  created by flock() are associated with an open file
description (see open(2)).  This means that duplicate file descriptors
(created by, for example, fork(2) or dup(2)) refer to the same lock,
   and this lock may be modified or released using any of these file
descriptors.  Furthermore, the lock is released either by an explicit
LOCK_UN operation on any of these duplicate file  descriptors,
   or when all such file descriptors have been closed.
"


Worth looking into, but really important not to make a mistake and make
things worse.


-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Amit Kapila
On Thu, Mar 8, 2018 at 11:04 AM, Pavan Deolasee
 wrote:
>
> On Tue, Feb 13, 2018 at 12:41 PM, amul sul  wrote:
>>
>> Thanks for the confirmation, updated patch attached.
>>
>
> I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch does not
> do anything to deal with the fact that t_ctid may no longer point to itself
> to mark end of the chain. I just can't see how that would work.
>

I think it is not that the patch doesn't care about the end of the chain.
For example, ctid pointing to itself is used to ensure that for
deleted rows, nothing more needs to be done, like the below check in the
ExecUpdate/ExecDelete code path.
if (!ItemPointerEquals(tupleid, &hufd.ctid))
{
..
}
..

It will deal with such cases by checking for an invalid block id before these
checks.  So, we should be fine in such cases.


> But if it
> does, it needs good amount of comments explaining why and most likely
> updating comments at other places where chain following is done. For
> example, how is this code in heap_get_latest_tid() still valid? Aren't we
> setting "ctid" to some invalid value here?
>
> 2302 /*
> 2303  * If there's a valid t_ctid link, follow it, else we're done.
> 2304  */
> 2305 if ((tp.t_data->t_infomask & HEAP_XMAX_INVALID) ||
> 2306 HeapTupleHeaderIsOnlyLocked(tp.t_data) ||
> 2307 ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid))
> 2308 {
> 2309 UnlockReleaseBuffer(buffer);
> 2310 break;
> 2311 }
> 2312
> 2313 ctid = tp.t_data->t_ctid;
>

I have not tested it, but it seems this could be problematic; however, I feel
we could deal with such cases by checking for an invalid block id in the
above if check.  Another such case is in EvalPlanQualFetch.

> This is just one example. I am almost certain there are many such cases that
> will require careful attention.
>

Right, I think we should be able to detect and fix such cases.

> What happens if a partition key update deletes a row, but the operation is
> aborted? Do we need any special handling for that case?
>

If the transaction is aborted then future updates would update the
ctid to a new row, do you see any problem with it?

> I am actually worried that we're tinkering with ip_blkid to handle one
> corner case of detecting partition key update. This is going to change
> on-disk format and probably need more careful attention. Are we certain that
> we would never require update-chain following when partition keys are
> updated?
>

I think we should never need update-chain following when the row is
moved from one partition to another partition, otherwise, we don't
change anything on the tuple.

> If so, can we think about some other mechanism which actually even
> leaves behind ? I am not saying we should do that,
> but it warrants a thought.
>

Oh, this would be a much bigger disk-format change and need much more
thought; where will we store the new partition information?

>  Maybe it was discussed somewhere else and ruled
> out.

There were a couple of other options discussed in the original thread
"UPDATE of partition key".  One of them was to have an additional bit
on the tuple, but we found reusing ctid a better approach.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Amit Khandekar
On 8 March 2018 at 09:15, Pavan Deolasee  wrote:
> For example, with your patches applied:
>
> CREATE TABLE pa_target (key integer, val text)
> PARTITION BY LIST (key);
> CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1);
> CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2);
> INSERT INTO pa_target VALUES (1, 'initial1');
>
> session1:
> BEGIN;
> UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1;
> UPDATE 1
> postgres=# SELECT * FROM pa_target ;
>  key | val
> -+-
>1 | initial1 updated by update1
> (1 row)
>
> session2:
> UPDATE pa_target SET val = val || ' updated by update2', key = key + 1 WHERE
> key = 1
> 
>
> session1:
> postgres=# COMMIT;
> COMMIT
>
> 
>
> postgres=# SELECT * FROM pa_target ;
>  key | val
> -+-
>2 | initial1 updated by update2
> (1 row)
>
> Ouch. The committed updates by session1 are overwritten by session2. This
> clearly violates the rules that rest of the system obeys and is not
> acceptable IMHO.
>
> Clearly, ExecUpdate() while moving rows between partitions is missing out on
> re-constructing the to-be-updated tuple, based on the latest tuple in the
> update chain. Instead, it's simply deleting the latest tuple and inserting a
> new tuple in the new partition based on the old tuple. That's simply wrong.

You are right. This needs to be fixed. This is a different issue than
the particular one that is being worked upon in this thread, and both
these issues have different fixes.

Like you said, the tuple needs to be reconstructed when ExecDelete()
finds that the row has been updated by another transaction. We should
send back this information from ExecDelete() (I think the tupleid
parameter gets updated in this case), and then in ExecUpdate() we
should goto lreplace, so that the row is fetched back similarly to
how it happens when heap_update() knows that the tuple was updated.

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company



Re: ALTER TABLE ADD COLUMN fast default

2018-03-07 Thread Andrew Dunstan
On Tue, Mar 6, 2018 at 8:15 PM, David Rowley
 wrote:
> On 6 March 2018 at 22:40, David Rowley  wrote:
>> Okay, it looks like the patch should disable physical tlists when
>> there's a missing column the same way as we do for dropped columns.
>> Patch attached.
>
> Please disregard the previous patch in favour of the attached.
>


OK, nice work, thanks. We're making good progress. Two of the cases I
was testing have gone from worse than master to better than master.
Here are the results I got, all against Tomas' 100-col 64-row table,
using pgbench -T 100:

 select sum(c1000) from t;
 fastdef tps = 4724.988619
 master  tps = 1590.843085
 
 select c1000 from t;
 fastdef tps = 5093.667203
 master  tps = 2437.613368
 
 select sum(c1000) from (select c1000 from t offset 0) x;
 fastdef tps = 3315.900091
 master  tps = 2067.574581
 
 select * from t;
 fastdef tps = 107.145811
 master  tps = 150.207957
 
 select sum(c1), count(c500), avg(c1000) from t;
 fastdef tps = 2304.636410
 master  tps = 1409.791975
 
 select sum(c10) from t;
 fastdef tps = 4332.625917
 master  tps = 2208.757119

"select * from t" used to be about a wash, but with this patch it's
got worse. The last two queries were worse and are now better, so
that's a win. I'm going to do a test to see if I can find the
break-even point between the second query and the fourth. If it turns
out to be at quite a large number of columns selected then I think it
might be something we can live with.
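
For the break-even search, something like this could generate the intermediate
queries to feed to pgbench (a rough sketch, assuming Tomas' columns are named
c1 .. c1000 as in the queries above):

-- emit "select c1, ..., cN from t;" for a given N; vary N to find the crossover
SELECT 'select ' || string_agg(format('c%s', i), ', ' ORDER BY i) || ' from t;'
FROM generate_series(1, 250) AS i;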

cheers

andrew


-- 
Andrew Dunstanhttps://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-07 Thread Jeff Janes
On Sun, Mar 4, 2018 at 3:18 PM, David Gould  wrote:

> On Sun, 4 Mar 2018 07:49:46 -0800
> Jeff Janes  wrote:
>
> > On Wed, Jan 17, 2018 at 4:49 PM, David Gould  wrote:
> ...
> >
> > Maybe a well-timed crash caused n_dead_tup to get reset to zero and that
> is
> > why autovac is not kicking in?  What are the pg_stat_user_table number
> and
> > the state of the visibility map for your massively bloated table, if you
> > still have them?
>
> ...
>


> The main pain points are that when reltuples gets inflated there is no way
> to fix it, auto vacuum stops looking at the table and hand run ANALYZE
> can't
> reset the reltuples. The only cure is VACUUM FULL, but that is not really
> practical without unacceptable amounts of downtime.
>

But why won't an ordinary manual VACUUM (not FULL) fix it?  That seems like
a critical thing to figure out.

As for preventing it in the first place, based on your description of your
hardware and operations, I was going to say you need to increase the max
number of autovac workers, but then I remembered you from "Autovacuum slows
down with large numbers of tables. More workers makes it slower" (
https://www.postgresql.org/message-id/20151030133252.3033.4249%40wrigleys.postgresql.org).
So you are probably still suffering from that?  Your patch from then seemed
to be pretty invasive and so controversial.  I had a trivial but fairly
effective patch at the time, but it is now less trivial because of how shared
catalogs are dealt with (commit 15739393e4c3b64b9038d75) and I haven't
rebased it over that issue.

Cheers,

Jeff


Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Pavan Deolasee
On Tue, Feb 13, 2018 at 12:41 PM, amul sul  wrote:

>
> Thanks for the confirmation, updated patch attached.
>
>
I am actually very surprised that 0001-Invalidate-ip_blkid-v5.patch does
not do anything to deal with the fact that t_ctid may no longer point to
itself to mark end of the chain. I just can't see how that would work. But
if it does, it needs good amount of comments explaining why and most likely
updating comments at other places where chain following is done. For
example, how is this code in heap_get_latest_tid() still valid? Aren't we
setting "ctid" to some invalid value here?

2302 /*
2303  * If there's a valid t_ctid link, follow it, else we're done.
2304  */
2305 if ((tp.t_data->t_infomask & HEAP_XMAX_INVALID) ||
2306 HeapTupleHeaderIsOnlyLocked(tp.t_data) ||
2307 ItemPointerEquals(&tp.t_self, &tp.t_data->t_ctid))
2308 {
2309 UnlockReleaseBuffer(buffer);
2310 break;
2311 }
2312
2313 ctid = tp.t_data->t_ctid;

This is just one example. I am almost certain there are many such cases
that will require careful attention.

What happens if a partition key update deletes a row, but the operation is
aborted? Do we need any special handling for that case?

I am actually worried that we're tinkering with ip_blkid to handle one
corner case of detecting partition key update. This is going to change
on-disk format and probably need more careful attention. Are we certain
that we would never require update-chain following when partition keys are
updated? If so, can we think about some other mechanism which actually even
leaves behind ? I am not saying we should do that,
but it warrants a thought. Maybe it was discussed somewhere else and ruled
out. I happened to notice this patch because of the bug I encountered.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Testbed for predtest.c ... and some arguable bugs therein

2018-03-07 Thread Tom Lane
In the thread about being able to prove constraint exclusion from
an IN clause mentioning a NULL,
https://www.postgresql.org/message-id/flat/3bad48fc-f257-c445-feeb-8a2b2fb62...@lab.ntt.co.jp
I expressed concern about whether there were existing bugs in predtest.c
given the lack of clarity of the comments around recent changes to it.
I also noticed that coverage.postgresql.org shows we don't have great
test coverage for it.  This led me to feel that it'd be worth having
a testbed that would allow directly exercising the predtest code, rather
than having to construct queries whose plans would change depending on
the outcome of a predtest proof.

A bit of hacking later, I have the attached.  The set of test cases it
includes at the moment were mostly developed with an eye to getting to
full code coverage of predtest.c, but we could add more later.  What's
really interesting is that it proves that the "weak refutation" logic,
i.e. predicate_refuted_by() with clause_is_check = true, is not
self-consistent.  Now as far as I can tell, we are not using that
option anywhere yet, so this is just a latent problem not a live one.
Also, it's not very clear what semantics we'd be needing if we did have
a use for the case.  Presumably the starting assumption is "clause does
not return false", but do we need to prove "predicate returns false",
or just "predicate does not return true"?  I'm not sure, TBH, but if
it's the former then we're going to have results like "X does not refute
NOT X", which seems weird and is certainly not how that code acts now.
So I made the testbed assume that we're supposed to prove "predicate does
not return true", and the conclusion is that the code mostly behaves
that way, but there are cases where it incorrectly claims a proof, and
more cases where it fails to prove relationships it perhaps could.

I'm not sure that that's worth fixing right now.  Instead I'm tempted
to revert the addition of the clause_is_check argument to
predicate_refuted_by, on the grounds that it's both broken and currently
unnecessary.

Anyway, barring objections, I'd like to push this, and then we can use
it to carry a test for the null-in-IN fix whenever that lands.

regards, tom lane

diff --git a/src/test/modules/test_predtest/.gitignore b/src/test/modules/test_predtest/.gitignore
index ...5dcb3ff .
*** a/src/test/modules/test_predtest/.gitignore
--- b/src/test/modules/test_predtest/.gitignore
***
*** 0 
--- 1,4 
+ # Generated subdirectories
+ /log/
+ /results/
+ /tmp_check/
diff --git a/src/test/modules/test_predtest/Makefile b/src/test/modules/test_predtest/Makefile
index ...1b50fa3 .
*** a/src/test/modules/test_predtest/Makefile
--- b/src/test/modules/test_predtest/Makefile
***
*** 0 
--- 1,21 
+ # src/test/modules/test_predtest/Makefile
+ 
+ MODULE_big = test_predtest
+ OBJS = test_predtest.o $(WIN32RES)
+ PGFILEDESC = "test_predtest - test code for optimizer/util/predtest.c"
+ 
+ EXTENSION = test_predtest
+ DATA = test_predtest--1.0.sql
+ 
+ REGRESS = test_predtest
+ 
+ ifdef USE_PGXS
+ PG_CONFIG = pg_config
+ PGXS := $(shell $(PG_CONFIG) --pgxs)
+ include $(PGXS)
+ else
+ subdir = src/test/modules/test_predtest
+ top_builddir = ../../../..
+ include $(top_builddir)/src/Makefile.global
+ include $(top_srcdir)/contrib/contrib-global.mk
+ endif
diff --git a/src/test/modules/test_predtest/README b/src/test/modules/test_predtest/README
index ...2c9bec0 .
*** a/src/test/modules/test_predtest/README
--- b/src/test/modules/test_predtest/README
***
*** 0 
--- 1,28 
+ test_predtest is a module for checking the correctness of the optimizer's
+ predicate-proof logic, in src/backend/optimizer/util/predtest.c.
+ 
+ The module provides a function that allows direct application of
+ predtest.c's exposed functions, predicate_implied_by() and
+ predicate_refuted_by(), to arbitrary boolean expressions, with direct
+ inspection of the results.  This could be done indirectly by checking
+ planner results, but it can be difficult to construct end-to-end test
+ cases that prove that the expected results were obtained.
+ 
+ In general, the use of this function is like
+ 	select * from test_predtest('query string')
+ where the query string must be a SELECT returning two boolean
+ columns, for example
+ 
+ 	select * from test_predtest($$
+ 	select x, not x
+ 	from (values (false), (true), (null)) as v(x)
+ 	$$);
+ 
+ The function parses and plans the given query, and then applies the
+ predtest.c code to the two boolean expressions in the SELECT list, to see
+ if the first expression can be proven or refuted by the second.  It also
+ executes the query, and checks the resulting rows to see whether any
+ claimed implication or refutation relationship actually holds.  If the
+ query is designed to exercise the expressions on a full set of possible
+ input values, as in the example above, then this provides a mechanical
+ cross-check as to whether the p

Re: [HACKERS] Another oddity in handling of WCO constraints in postgres_fdw

2018-03-07 Thread Ashutosh Bapat
On Wed, Mar 7, 2018 at 8:55 AM, Stephen Frost  wrote:
> Greetings Robert, Ashutosh, Arthur, Etsuro, all,
>
> * Arthur Zakirov (a.zaki...@postgrespro.ru) wrote:
>> On Tue, Mar 06, 2018 at 08:09:50PM +0900, Etsuro Fujita wrote:
>> > Agreed.  I added a comment to that function.  I think that that comment in
>> > combination with changes to the FDW docs in the patch would help FDW 
>> > authors
>> > understand why that is needed.  Please find attached an updated version of
>> > the patch.
>>
>> Thank you.
>>
>> All tests pass, the documentation builds. There was the suggestion [1]
>> of different approach. But the patch fix the issue in much more simple
>> way.
>>
>> Marked as "Ready for Commiter".
>>
>> 1 - 
>> https://www.postgresql.org/message-id/20171005.200631.134118679.horiguchi.kyotaro%40lab.ntt.co.jp
>
> Thanks,  I've looked through this patch and thread again and continue to
> feel that this is both a good and sensible improvment and that the patch
> is in pretty good shape.
>
> The remaining question is if the subsequent discussion has swayed the
> opinion of Robert and Ashutosh.  If we can get agreement that these
> semantics are acceptable and an improvement over the status quo then I'm
> happy to try and drive this patch to commit.
>
> Robert, Ashutosh?

If there is a local constraint on the foreign table, we don't check
it. So, a row that was inserted through this foreign table may not
show up when selected from the foreign table. Applying the same logic to a
WCO on a view over the foreign table, it should be fine if a row
inserted through the view doesn't show up in the view. Whoever
created the view knew that there is a foreign table underneath.

Stephen said [1] that the view is local and, irrespective of what's
underneath it, should obey the WCO. That seems to be a fair point when
considered alone, but with the above context it doesn't look as
fair.

Etsuro said [2] that WCO constraints cannot be implemented on the foreign
server while normal check constraints can be, and for that he provides
an example in [3]. But I think that example is going in the wrong
direction. For local constraints to be enforced, we use remote
constraints; for local WCO we need to use remote WCO. That means we
create many foreign tables pointing to the same table on the foreign
server through many views, but it's not impossible.
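
For what it's worth, a minimal sketch of that approach with made-up object
names, so the check is enforced where the data actually lands:

-- on the foreign server
CREATE TABLE base_t (a int);
CREATE VIEW checked_v AS SELECT * FROM base_t WHERE a > 0
  WITH CHECK OPTION;

-- on the local server, point the foreign table at the view, not the table,
-- and put the local WCO view on top of it
CREATE FOREIGN TABLE ft (a int)
  SERVER remote_srv OPTIONS (table_name 'checked_v');
CREATE VIEW local_v AS SELECT * FROM ft WHERE a > 0
  WITH CHECK OPTION;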


[1] 
https://www.postgresql.org/message-id/20180117130021.GC2416%40tamriel.snowman.net
[2] https://www.postgresql.org/message-id/5A058F21.2040201%40lab.ntt.co.jp
[3] 
https://www.postgresql.org/message-id/a3955a1d-ad07-5b0a-7618-b6ef5ff0e1c5%40lab.ntt.co.jp

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Tom Lane
Craig Ringer  writes:
> As I understand it, because we allow multiple Pg instances on a system, we
> identify the small sysv shmem segment we use by the postmaster's pid. If
> you remove the DirLockFile (postmaster.pid) you remove the interlock
> against starting a new postmaster. It'll think it's a new independent
> instance on the same host, make a new shmem segment and go merrily on its
> way mangling data horribly.

Yeah.  If we realized that the old shmem segment was associated with this
data directory, we could check for processes still attached to it ... but
the lock file is exactly where that association is kept.

> It'd be nice if the OS offered us some support here. Something like opening
> a lockfile in exclusive lock mode, then inheriting the FD and lock on all
> children, with each child inheriting the lock. So the exclusive lock
> wouldn't get released until all FDs associated with it are released. But
> AFAIK nothing like that is present, let alone portable.

I've wondered about that too.  An inheritable lock on the data directory
itself would be ideal, but that doesn't exist anywhere AFAIK.  We could
imagine taking a BSD-style-flock(2) lock on some lock file, on systems that
have that; but not all do, so it couldn't be our only protection.  Another
problem is that since it'd have to be an ordinary file, there'd still be a
risk of cowboy DBAs removing the lock file.  Maybe we could use pg_control
as the lock file?

regards, tom lane



Re: PATCH: psql tab completion for SELECT

2018-03-07 Thread Edmund Horner
New patch that fixes a little bug with appending NULL addons to schema queries.


psql-select-tab-completion-v6.patch
Description: Binary data


Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Tom Lane
Kyotaro HORIGUCHI  writes:
> At Thu, 8 Mar 2018 00:28:04 +, "Tsunakawa, Takayuki" 
>  wrote in 
> <0A3221C70F24FB45833433255569204D1F8FF0D9@G01JPEXMBYT05>
>> Yes.  We are now facing the problem of too much memory use by PostgreSQL, 
>> where about some applications randomly access about 200,000 tables.  It is 
>> estimated based on a small experiment that each backend will use several to 
>> ten GBs of local memory for CacheMemoryContext.  The total memory use will 
>> become over 1 TB when the expected maximum connections are used.
>> 
>> I haven't looked at this patch, but does it evict all kinds of entries in 
>> CacheMemoryContext, ie. relcache, plancache, etc?

> This works only for syscaches, which could bloat with entries for
> nonexistent objects.

> Plan cache is an utterly different thing. It is abandoned at the
> end of a transaction or such like.

When I was at Salesforce, we had *substantial* problems with plancache
bloat.  The driving factor there was plans associated with plpgsql
functions, which Salesforce had a huge number of.  In an environment
like that, there would be substantial value in being able to prune
both the plancache and plpgsql's function cache.  (Note that neither
of those things are "abandoned at the end of a transaction".)

> Relcache is not based on catcache and is out of the scope of this
> patch since it doesn't get bloated with nonexistent entries. It
> uses dynahash and we could introduce a similar feature to it if
> we are willing to cap relcache size.

I think if the case of concern is an application with 200,000 tables,
it's just nonsense to claim that relcache size isn't an issue.

In short, it's not really apparent to me that negative syscache entries
are the major problem of this kind.  I'm afraid that you're drawing very
large conclusions from a specific workload.  Maybe we could fix that
workload some other way.

regards, tom lane



Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Craig Ringer
On 8 March 2018 at 10:18, Andres Freund  wrote:

>
>
> On March 7, 2018 5:51:29 PM PST, Craig Ringer 
> wrote:
> >My favourite remains an organisation that kept "fixing" an issue by
> >kill
> >-9'ing the postmaster and removing postmaster.pid to make it start up
> >again. Without killing all the leftover backends. Of course, the system
> >kept getting more unstable and broken, so they did it more and more
> >often.
> >They were working on scripting it when they gave up and asked for help.
>
> Maybe I'm missing something, but that ought to not work. The shmem segment
> that we keep around would be a conflict, no?
>
>
As I understand it, because we allow multiple Pg instances on a system, we
identify the small sysv shmem segment we use by the postmaster's pid. If
you remove the DirLockFile (postmaster.pid) you remove the interlock
against starting a new postmaster. It'll think it's a new independent
instance on the same host, make a new shmem segment and go merrily on its
way mangling data horribly.

See CreateLockFile(). Also 7e2a18a9161 . In
particular src/backend/utils/init/miscinit.c +938,

if (isDDLock)
{
  
if (PGSharedMemoryIsInUse(id1, id2))
ereport(FATAL,
(errcode(ERRCODE_LOCK_FILE_EXISTS),
 errmsg("pre-existing shared memory block "
"(key %lu, ID %lu) is still in use",
id1, id2),
 errhint("If you're sure there are no old "
 "server processes still running,
remove "
 "the shared memory block "
 "or just delete the file \"%s\".",
 filename)));
   
}

I still think that error is a bit optimistic, and should really say "make
very sure there are no 'postgres' processes associated with this data
directory, then ...'


It'd be nice if the OS offered us some support here. Something like opening
a lockfile in exclusive lock mode, then inheriting the FD and lock on all
children, with each child inheriting the lock. So the exclusive lock
wouldn't get released until all FDs associated with it are released. But
AFAIK nothing like that is present, let alone portable.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Kyotaro HORIGUCHI
Hello,

At Thu, 8 Mar 2018 00:28:04 +, "Tsunakawa, Takayuki" 
 wrote in 
<0A3221C70F24FB45833433255569204D1F8FF0D9@G01JPEXMBYT05>
> From: Alvaro Herrera [mailto:alvhe...@alvh.no-ip.org]
> > The thing that comes to mind when reading this patch is that some time ago
> > we made fun of other database software, "they are so complicated to 
> > configure,
> > they have some magical settings that few people understand how to set".
> > Postgres was so much better because it was simple to set up, no magic crap.
> > But now it becomes apparent that that only was so because Postgres sucked,
> > ie., we hadn't yet gotten to the point where we
> > *needed* to introduce settings like that.  Now we finally are?
> 
> Yes.  We are now facing the problem of too much memory use by PostgreSQL, 
> where some applications randomly access about 200,000 tables.  It is
> estimated based on a small experiment that each backend will use several to 
> ten GBs of local memory for CacheMemoryContext.  The total memory use will 
> become over 1 TB when the expected maximum connections are used.
> 
> I haven't looked at this patch, but does it evict all kinds of entries in 
> CacheMemoryContext, ie. relcache, plancache, etc?

This works only for syscaches, which could bloat with entries for
nonexistent objects.

Plan cache is an utterly different thing. It is abandoned at the
end of a transaction or such like.

Relcache is not based on catcache and is out of the scope of this
patch since it doesn't get bloated with nonexistent entries. It
uses dynahash and we could introduce a similar feature to it if
we are willing to cap relcache size.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: [HACKERS] Restrict concurrent update/delete with UPDATE of partition key

2018-03-07 Thread Pavan Deolasee
On Wed, Feb 28, 2018 at 12:38 PM, Rajkumar Raghuwanshi <
rajkumar.raghuwan...@enterprisedb.com> wrote:

> On Wed, Feb 14, 2018 at 5:44 PM, Amit Kapila 
> wrote:
>
>> +# Concurrency error from GetTupleForTrigger
>> +# Concurrency error from ExecLockRows
>>
>> I think you don't need to mention above sentences in spec files.
>> Apart from that, your patch looks good to me.  I have marked it as
>> Ready For Committer.
>>
>
> I too have tested this feature with isolation framework and this look good
> to me.
>
>
It looks to me that we are trying to fix only one issue here with
concurrent updates. What happens if a non-partition key is first updated
and then a second session updates the partition key?

For example, with your patches applied:

CREATE TABLE pa_target (key integer, val text)
PARTITION BY LIST (key);
CREATE TABLE part1 PARTITION OF pa_target FOR VALUES IN (1);
CREATE TABLE part2 PARTITION OF pa_target FOR VALUES IN (2);
INSERT INTO pa_target VALUES (1, 'initial1');

session1:
BEGIN;
UPDATE pa_target SET val = val || ' updated by update1' WHERE key = 1;
UPDATE 1
postgres=# SELECT * FROM pa_target ;
 key | val
-+-
   1 | initial1 updated by update1
(1 row)

session2:
UPDATE pa_target SET val = val || ' updated by update2', key = key + 1
WHERE key = 1


session1:
postgres=# COMMIT;
COMMIT



postgres=# SELECT * FROM pa_target ;
 key | val
-+-
   2 | initial1 updated by update2
(1 row)

Ouch. The committed updates by session1 are overwritten by session2. This
clearly violates the rules that rest of the system obeys and is not
acceptable IMHO.

Clearly, ExecUpdate() while moving rows between partitions is missing out
on re-constructing the to-be-updated tuple, based on the latest tuple in
the update chain. Instead, it's simply deleting the latest tuple and
inserting a new tuple in the new partition based on the old tuple. That's
simply wrong.

I haven't really thought carefully to see if this should be a separate
patch, but it warrants attention. We should at least think through all
different concurrency aspects of partition key updates and think about a
holistic solution, instead of fixing one problem at a time. This probably
also shows that isolation tests for partition key updates are either
missing (I haven't checked) or they need more work.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-07 Thread Robert Haas
On Fri, Mar 2, 2018 at 5:17 PM, Tom Lane  wrote:
> (1) do we really want to go over to treating ANALYZE's tuple density
> result as gospel, contradicting the entire thrust of the 2011 discussion?
>
>> This tables reltuples is 18 times the actual row count. It will never 
>> converge
>> because with 5953 pages analyze can only adjust reltuples by 0.0006 each 
>> time.
>
> But by the same token, analyze only looked at 0.0006 of the pages.  It's
> nice that for you, that's enough to get a robust estimate of the density
> everywhere; but I have a nasty feeling that that won't hold good for
> everybody.  The whole motivation of the 2011 discussion, and the issue
> that is also seen in some other nearby discussions, is that the density
> can vary wildly.

I think that viewing ANALYZE's result as fairly authoritative is
probably a good idea.  If ANALYZE looked at only 0.0006 of the pages,
that's because we decided that 0.0006 of the pages were all it needed
to look at in order to come up with good estimates.  Having decided
that, we shouldn't turn around and decide that they are 99.94% bunk.

Now, it could be that there are data sets out there were the number of
tuples per page varies widely between different parts of the table,
and therefore sampling 0.0006 of the pages can return quite different
estimates depending on which ones we happen to pick.  However, that's
a lot like saying that 0.0006 of the pages isn't really enough, and
maybe the solution is to sample more.  Still, it doesn't seem
unreasonable to do some kind of smoothing, where we set the new
estimate = (newly computed estimate * X) + (previous estimate * (1 -
X)) where X might be 0.25 or whatever; perhaps X might even be
configurable.
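
For instance, with X = 0.25, a stored reltuples of 1,000,000 and a fresh
ANALYZE estimate of 400,000 would blend to 0.25 * 400,000 + 0.75 * 1,000,000 =
850,000, so a single unlucky sample only moves the stored value part of the
way toward the new reading.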

One thing to keep in mind is that VACUUM will, in many workloads, tend
to scan the same parts of the table over and over again.  For example,
consider a database of chess players which is regularly updated with
new ratings information.  The records for active players will be
updated frequently, but the ratings for deceased players will rarely
change.  Living but inactive players may occasionally become active
again, or may be updated occasionally for one reason or another.  So,
VACUUM will keep scanning the pages that contain records for active
players but will rarely or never be asked to scan the pages for dead
players.  If it so happens that these different groups of players have
rows of varying width -- perhaps we store more detailed data about
newer players but don't have full records for older ones -- then the
overall tuple density estimate will come to resemble more and more the
density of the rows that are actively being updated, rather than the
overall density of the whole table.

Even if all the tuples are the same width, it might happen in some
workload that typically inserts a record, updates it N times, and then
leaves it fixed after that. Suppose we can fit 100 tuples into a page.
On pages where all of the records have reached their final state, there
will be 100 tuples.  But on pages where updates are still happening
there will -- after VACUUM -- be fewer than 100 tuples per page,
because some fraction of the tuples on the page were dead row
versions.  That's why we were vacuuming them.  Suppose typically half
the tuples on each page are getting removed by VACUUM.  Then, over
time, as the table grows, if only VACUUM is ever run, our estimate of
tuples per page will converge to 50, but in reality, as the table
grows, the real number is converging to 100.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 6:43 PM, Michael Paquier  wrote:
> On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote:
>> OK, seems like I'm on the short end of that vote.  I propose to push the
>> GUC-crosschecking patch I posted yesterday, but not the default-value
>> change, and instead push Kyotaro-san's initdb change.  Should we back-patch
>> these things to v10 where the problem appeared?
>
> I would vote for a backpatch.  If anybody happens to run initdb on v10
> and gets max_connections to 10, that would immediately cause a failure.
> We could also wait for sombody to actually complain about that, but a
> bit of prevention does not hurt to ease future user experience on this
> released version.

In theory, back-patching the GUC-crosschecking patch could cause the
cluster to fail to restart after the upgrade.  It's pretty unlikely.
We have to postulate someone with, say, default values but for
max_connections=12.  But it's not impossible.  I would be inclined to
back-patch the increase in the max_connections fallback from 10 to 20
because that fixes a real, if unlikely, failure mode, but treat the
GUC cross-checking stuff as a master-only improvement.  Although it's
unlikely to hurt many people, there's no real upside.  Nobody is going
to say "boy, it's a good thing they tidied that GUC cross-checking in
the latest major release -- that really saved my bacon!".  Nothing is
really broken as things stand.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [patch] BUG #15005: ANALYZE can make pg_class.reltuples inaccurate.

2018-03-07 Thread David Gould
On Tue, 06 Mar 2018 11:16:04 -0500
Tom Lane  wrote:

> so that we can decide whether this bug is bad enough to justify
> back-patching a behavioral change.  I remain concerned that the proposed
> fix is too simplistic and will have some unforeseen side-effects, so
> I'd really rather just put it in HEAD and let it go through a beta test
> cycle before it gets loosed on the world.

It happens to us fairly regularly and causes lots of problems. However,
I'm agreeable to putting it in head for now; my client can build from
source to pick up this patch until that ships, but doesn't want to maintain
their own fork forever. That said, if it does get through beta I'd hope we
could back-patch at that time.

-dg

-- 
David Gould   da...@sonic.net
If simplicity worked, the world would be overrun with insects.



Re: Some message fixes

2018-03-07 Thread Kyotaro HORIGUCHI
At Wed, 7 Mar 2018 07:10:34 -0300, Alvaro Herrera  
wrote in <20180307101034.l7z7kqwqfkjg6c2p@alvherre.pgsql>
> Kyotaro HORIGUCHI wrote:
> 
> > 1. some messages are missing partitioned table/index
..
> I *think* the idea here is that a partitioned table is a table, so there
> is no need to say "foo is not a table or partitioned table".  We only
> mention partitioned tables when we want to make a distinction between
> those that are partitioned and those that aren't.

I agree with the direction, and the code seems to follow it. I
should have looked wider.

Thanks.

> > 2. GUC comment for autovacuum_work_mem has a typo, "max num of
> >parallel workers *than* can be..". (Third attached)
> 
> Yeah, pushed, though it's parallel_max_workers not autovacuum_work_mem.

Ugg. Inconsistency within a line.. My fingers may have decided to
do something different from my brain.

Anyway, thank you for committing.

regards,

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Andres Freund


On March 7, 2018 5:51:29 PM PST, Craig Ringer  wrote:
>My favourite remains an organisation that kept "fixing" an issue by
>kill
>-9'ing the postmaster and removing postmaster.pid to make it start up
>again. Without killing all the leftover backends. Of course, the system
>kept getting more unstable and broken, so they did it more and more
>often.
>They were working on scripting it when they gave up and asked for help.

Maybe I'm missing something, but that ought to not work. The shmem segment that 
we keep around would be a conflict, no?

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: FOR EACH ROW triggers on partitioned tables

2018-03-07 Thread Alvaro Herrera
Alvaro Herrera wrote:

> I reserve the right to revise this further, as I'm going to spend a
> couple of hours looking at it this afternoon, particularly to see how
> concurrent DDL behaves, but I don't see anything obviously wrong with
> it.

I do now.  TLDR; I'm afraid this cute idea crashed and burned, so I'm
back to the idea of just cloning the pg_trigger row for each partition.

The reason for the failure is pg_trigger->tgqual, which is an expression
tree.  In most cases, when set, that expression will contain references
to columns of the table, in the form of a varattno.  But this varattno
references the column number of the partitioned table; and if the
partition happens to require some attribute mapping, we're screwed
because there is no way to construct that without forming the
partitioned table's tuple descriptor.  But we can't do that without
grabbing a lock on the partitioned table; and we can't do that because
we would incur the deadlock risk Robert was talking about.

An example that causes the problem is:

create table parted_irreg (fd int, a int, fd2 int, b text) partition by range (b);
alter table parted_irreg drop column fd, drop column fd2;
create table parted1_irreg (b text, fd int, a int);
alter table parted1_irreg drop column fd;
alter table parted_irreg attach partition parted1_irreg for values from ('') to ('');
create trigger parted_trig after insert on parted_irreg for each row when (new.a % 1 = 0) execute procedure trigger_notice_irreg();
insert into parted_irreg values (1, 'aardvark');
insert into parted1_irreg values ('aardwolf', 2);
drop table parted_irreg;
drop function trigger_notice_irreg();

Both inserts fail thusly:

ERROR:  attribute 2 of type parted1_irreg has been dropped

Now, I can fix the first failure by taking advantage of
ResultRelInfo->ri_PartitionRoot during trigger execution; it's easy and
trouble-free to call map_variable_attnos() using that relation.  But in
the second insert, ri_PartitionRoot is null (because of inserting into
the partition directly), so we have no relation to refer to for
map_variable_attnos().  I think it gets worse: if you have a three-level
partitioning scheme, and define the trigger in the second one, there is
no relation either.

Another option I think would be to always keep in the trigger descriptor
("somehow"), an open Relation on which the trigger is defined.  But this
has all sorts of problems also, so I'm not doing that.

I guess another option is to store a column map somewhere.


So, unless someone has a brilliant idea on how to construct a column
mapping from partitioned table to partition, I'm going back to the
design I was proposing earlier, ie., creating individual pg_trigger rows
for each partition that are essentially adjusted copies of the ones for
the partitioned table.  The only missing thing in that one was having
ALTER TABLE ENABLE/DISABLE for a trigger on the partitioned table
cascade to the partitions; I'll see about that.
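
To illustrate the kind of cascading behavior being described (a sketch
only -- the trigger name below matches the example above, nothing here is
part of the patch yet):

  -- disabling on the parent would be expected to also disable the
  -- per-partition clone created for parted1_irreg
  ALTER TABLE parted_irreg DISABLE TRIGGER parted_trig;
  ALTER TABLE parted_irreg ENABLE TRIGGER parted_trig;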

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pgstat_report_activity() and parallel CREATE INDEX (was: Parallel index creation & pg_stat_activity)

2018-03-07 Thread Peter Geoghegan
On Thu, Mar 1, 2018 at 2:47 PM, Peter Geoghegan  wrote:
> No. Just an oversight. Looks like _bt_parallel_build_main() should
> call pgstat_report_activity(), just like ParallelQueryMain().
>
> I'll come up with a patch soon.

Attached patch has parallel CREATE INDEX propagate debug_query_string
to workers. Workers go on to use this string as their own
debug_query_string, as well as registering it using
pgstat_report_activity(). Parallel CREATE INDEX pg_stat_activity
entries will have a query text, too, which addresses Phil's complaint.
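
Roughly, the worker side (the part of the patch not quoted below) boils
down to the following sketch, assuming the existing
_bt_parallel_build_main() entry point:

  /* In _bt_parallel_build_main(), before doing any real work */
  char       *sharedquery;

  /* Look up the query text the leader stored in the DSM table of contents */
  sharedquery = shm_toc_lookup(toc, PARALLEL_KEY_QUERY_TEXT, false);
  debug_query_string = sharedquery;

  /* Make the worker show up in pg_stat_activity with the leader's query */
  pgstat_report_activity(STATE_RUNNING, debug_query_string);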

I wasn't 100% sure if I should actually show the leader's
debug_query_string within worker pg_stat_activity entries, since that
isn't what parallel query uses (the QueryDesc/Estate query string in
shared memory is *restored* as the debug_query_string for a parallel
query worker, though). I eventually decided that this
debug_query_string approach was okay. There is nothing remotely like a
QueryDesc or EState passed to btbuild(). I can imagine specific issues
with what I've done, such as a CREATE EXTENSION command that contains
a CREATE INDEX, and yet shows a CREATE EXTENSION in pg_stat_activity
for parallel workers. However, that's no different to what you'd see
with a serial index build. Existing tcop/postgres.c
pgstat_report_activity() calls are aligned with setting
debug_query_string.

-- 
Peter Geoghegan
From 1c380b337be86bb3c69cf70b39da1d3dd1fba18a Mon Sep 17 00:00:00 2001
From: Peter Geoghegan 
Date: Wed, 7 Mar 2018 14:22:56 -0800
Subject: [PATCH] Report query text in parallel CREATE INDEX workers.

Commit 4c728f38297 established that parallel query should propagate a
debug_query_string to worker processes, and display query text in worker
pg_stat_activity entries.  Parallel CREATE INDEX should follow this
precedent.

This fixes an oversight in commit 9da0cc35.

Author: Peter Geoghegan
Reported-By: Phil Florent
Discussion: https://postgr.es/m/cah2-wzkryapcqohbjkudkfni-hgfny31yjcbm-yp5ho-71i...@mail.gmail.com
---
 src/backend/access/nbtree/nbtsort.c | 30 +-
 1 file changed, 25 insertions(+), 5 deletions(-)

diff --git a/src/backend/access/nbtree/nbtsort.c b/src/backend/access/nbtree/nbtsort.c
index f0c276b52a..098e0ce1be 100644
--- a/src/backend/access/nbtree/nbtsort.c
+++ b/src/backend/access/nbtree/nbtsort.c
@@ -86,6 +86,7 @@
 #define PARALLEL_KEY_BTREE_SHARED		UINT64CONST(0xA001)
 #define PARALLEL_KEY_TUPLESORT			UINT64CONST(0xA002)
 #define PARALLEL_KEY_TUPLESORT_SPOOL2	UINT64CONST(0xA003)
+#define PARALLEL_KEY_QUERY_TEXT			UINT64CONST(0xA004)
 
 /*
  * DISABLE_LEADER_PARTICIPATION disables the leader's participation in
@@ -1195,6 +1196,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 	BTSpool*btspool = buildstate->spool;
 	BTLeader   *btleader = (BTLeader *) palloc0(sizeof(BTLeader));
 	bool		leaderparticipates = true;
+	char	   *sharedquery;
+	int			querylen;
 
 #ifdef DISABLE_LEADER_PARTICIPATION
 	leaderparticipates = false;
@@ -1223,9 +1226,8 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 		snapshot = RegisterSnapshot(GetTransactionSnapshot());
 
 	/*
-	 * Estimate size for at least two keys -- our own
-	 * PARALLEL_KEY_BTREE_SHARED workspace, and PARALLEL_KEY_TUPLESORT
-	 * tuplesort workspace
+	 * Estimate size for our own PARALLEL_KEY_BTREE_SHARED workspace, and
+	 * PARALLEL_KEY_TUPLESORT tuplesort workspace
 	 */
 	estbtshared = _bt_parallel_estimate_shared(snapshot);
 	shm_toc_estimate_chunk(&pcxt->estimator, estbtshared);
@@ -1234,7 +1236,7 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 
 	/*
 	 * Unique case requires a second spool, and so we may have to account for
-	 * a third shared workspace -- PARALLEL_KEY_TUPLESORT_SPOOL2
+	 * another shared workspace for that -- PARALLEL_KEY_TUPLESORT_SPOOL2
 	 */
 	if (!btspool->isunique)
 		shm_toc_estimate_keys(&pcxt->estimator, 2);
@@ -1244,6 +1246,11 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 		shm_toc_estimate_keys(&pcxt->estimator, 3);
 	}
 
+	/* Finally, estimate PARALLEL_KEY_QUERY_TEXT space */
+	querylen = strlen(debug_query_string);
+	shm_toc_estimate_chunk(&pcxt->estimator, querylen + 1);
+	shm_toc_estimate_keys(&pcxt->estimator, 1);
+
 	/* Everyone's had a chance to ask for space, so now create the DSM */
 	InitializeParallelDSM(pcxt);
 
@@ -1293,6 +1300,11 @@ _bt_begin_parallel(BTBuildState *buildstate, bool isconcurrent, int request)
 		shm_toc_insert(pcxt->toc, PARALLEL_KEY_TUPLESORT_SPOOL2, sharedsort2);
 	}
 
+	/* Store query string for workers */
+	sharedquery = (char *) shm_toc_allocate(pcxt->toc, querylen + 1);
+	memcpy(sharedquery, debug_query_string, querylen + 1);
+	shm_toc_insert(pcxt->toc, PARALLEL_KEY_QUERY_TEXT, sharedquery);
+
 	/* Launch workers, saving status for leader/caller */
 	LaunchParallelWorkers(pcxt);
 	btleader->pcxt = pcxt;
@@ -1459,6 +1471,7 @@ _bt_leader

Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Craig Ringer
On 8 March 2018 at 04:58, Robert Haas  wrote:

> On Wed, Feb 28, 2018 at 8:03 PM, Craig Ringer 
> wrote:
> > A huge +1 from me for the idea. I can't even count the number of black
> box
> > "WTF did you DO?!?" servers I've looked at, where bizarre behaviour has
> > turned out to be down to the user doing something very silly and not
> saying
> > anything about it.
>
> +1 from me, too.
>
>
My favourite remains an organisation that kept "fixing" an issue by
kill -9'ing the postmaster and removing postmaster.pid to make it start up
again. Without killing all the leftover backends. Of course, the system
kept getting more unstable and broken, so they did it more and more often.
They were working on scripting it when they gave up and asked for help.

The data recovery effort on that one was truly exciting. I remember looking
at bash history and having to take a short break to figure out how on earth
to communicate what was going on.

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: PATCH: Configurable file mode mask

2018-03-07 Thread Michael Paquier
On Wed, Mar 07, 2018 at 10:56:32AM -0500, David Steele wrote:
> On 3/6/18 10:04 PM, Michael Paquier wrote:
>> Seems like you forgot to re-add the chmod calls in initdb.c.
> 
> Hmmm, I thought we were talking about moving the position of umask().
> 
> I don't think the chmod() calls are needed because umask() is being set.
>  The tests show that the config files have the proper permissions after
> initdb, so this just looks like redundant code to me.

Let's discuss that on a separate thread then, there could be something
we are missing, but I agree that those should not be needed.  Looking at
the code history, those calls have been around since the beginning of
PG-times.

> Another way to do this would be to make the function generic and
> stipulate that the postmaster must be shut down before running the
> function.  We could check postmaster.pid permissions as a separate
> test.

Yeah, that looks like a sensible approach as this could be run
post-initdb or after taking a backup.  This way we would avoid other
similar behaviors in the future...  Still postmaster.pid is an
exception.

>> sub clean_rewind_test
>> {
>> +   ok (check_pg_data_perm($node_master->data_dir(), 0));
>> +
>> $node_master->teardown_node  if defined $node_master;
>> I have also a pending patch for pg_rewind which adds read-only files in
>> the data folders of a new test, so this would cause this part to blow
>> up.  Testing that for all the test sets does not bring additional value
>> as well, and doing it at cleanup phase is also confusing.  So could you
>> move that check into only one test's script?  You could remove the umask
>> call in 003_extrafiles.pl as well this way, and reduce the patch diffs.
> 
> I think I would rather have a way to skip the permission test rather
> than disable it for most of the tests.  pg_rewind does more writes into
> PGDATA that anything other than a backend.  Good coverage seems like a
> plus.

All the tests basically run the same scenario, with minimal variations,
so let's do the test once in the test touching the largest number of
files and call it good.

>> Patch 2 is getting close for a committer lookup I think, still need to
>> look at patch 3 in details..  And from patch 3:
>>  # Expected permission
>> -my $expected_file_perm = 0600;
>> -my $expected_dir_perm = 0700;
>> +my $expected_file_perm = $allow_group ? 0640 : 0600;
>> +my $expected_dir_perm = $allow_group ? 0750 : 0700;
>> Or $expected_file_perm and $expected_dir_perm could just be used as
>> arguments.
> 
> This gets back to the check_pg_data_perm() discussion above.
> 
> I'll hold off on another set of patches until I hear back from you.
> There were only trivial changes as noted above.

OK, let's keep things simple here and assume that the grouping extension
is not available yet.  So no extra parameters should be needed.

I have begun to read through patch 3.

-if (stat_buf.st_mode & (S_IRWXG | S_IRWXO))
+if (stat_buf.st_mode & PG_MODE_MASK_ALLOW_GROUP)
 ereport(FATAL,
 (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
- errmsg("data directory \"%s\" has group or world access",
+ errmsg("data directory \"%s\" has invalid permissions",
 DataDir),
- errdetail("Permissions should be u=rwx (0700).")));
+ errdetail("Permissions should be u=rwx (0700) or u=rwx,g=rx 
(0750).")));
Hm.  This relaxes the checks and concerns me a lot.  What if users want
to keep strict permission all the time and rely on the existing check to
be sure that this gets never changed post-initdb?  For such users, we
may want to track if cluster has been initialized with group access, in
which case tracking that in the control file would be more adapted. 
Then the startup checks should use this configuration.  Another idea
would be a startup option.  So, I cannot believe that all users would
like to see such checks relaxed.  Some of my users would surely complain
about such sanity checks being relaxed unconditionally, so making this
configurable would be fine, and the current approach is not acceptable
in my opinion.
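
A rough sketch of what such a configurable check could look like (not the
patch under review; "data_dir_group_access" is a hypothetical flag that
could come from pg_control or a startup option):

  mode_t    forbidden = data_dir_group_access ?
      (S_IWGRP | S_IRWXO) :   /* allow g=rx, forbid group write and world */
      (S_IRWXG | S_IRWXO);    /* keep the strict 0700 behavior */

  if (stat_buf.st_mode & forbidden)
      ereport(FATAL,
              (errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
               errmsg("data directory \"%s\" has invalid permissions",
                      DataDir)));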

Patch 2 introduces some standards regarding file permissions as those
are copied all over the code tree, which is definitely good in my
opinion. I am rather reserved about patch 3 per what I am seeing now.
Looking in-depth at the security-related requirements would take more
time than I have now.
--
Michael


signature.asc
Description: PGP signature


Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation

2018-03-07 Thread Craig Ringer
On 8 March 2018 at 07:32, Tom Lane  wrote:

> Robert Haas  writes:
> > On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer 
> wrote:
> >> So I can't say it's definitely impossible. It seems astonishingly
> unlikely,
> >> but that's not always good enough.
>
> > Race conditions tend to happen a lot more often than one might think.
>
> Just to back that up --- we've seen cases where people could repeatably
> hit race-condition windows that are just an instruction or two wide.
> The first one I came to in an idle archive search is
> https://www.postgresql.org/message-id/15543.1130714273%40sss.pgh.pa.us
> I vaguely recall others but don't feel like digging harder right now.
>
>
That's astonishing.

I guess if you repeat something enough times...

The reason I'm less concerned about this one is that you have to crash in
exactly the wrong place, *during* a badly timed point in a race. But
the downside is that the result would be an unusable logical slot.

The simplest solution is probably just to mark the slot dirty while we hold
the spinlock, at the same time we advance its restart lsn. Any checkpoint
will then CheckPointReplicationSlots() and flush it. We don't
remove/recycle xlog segments until after that's done in CheckPointGuts() so
it's guaranteed that the slot's new state will be on disk and we can never
have a stale restart_lsn pointing into truncated-away WAL.
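
A sketch of that approach, reusing the existing candidate_restart_lsn
bookkeeping in LogicalConfirmReceivedLocation (not a tested patch, just
the shape of it):

  SpinLockAcquire(&MyReplicationSlot->mutex);
  MyReplicationSlot->data.restart_lsn = MyReplicationSlot->candidate_restart_lsn;
  MyReplicationSlot->candidate_restart_lsn = InvalidXLogRecPtr;
  MyReplicationSlot->candidate_restart_valid = InvalidXLogRecPtr;
  SpinLockRelease(&MyReplicationSlot->mutex);

  /*
   * ReplicationSlotMarkDirty() takes the spinlock itself, so call it right
   * after releasing it; any later checkpoint will then flush the new
   * restart_lsn via CheckPointReplicationSlots() before WAL is recycled.
   */
  ReplicationSlotMarkDirty();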

-- 
 Craig Ringer   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Andres Freund


On March 7, 2018 5:40:18 PM PST, Peter Geoghegan  wrote:
>On Wed, Mar 7, 2018 at 5:16 PM, Tomas Vondra
> wrote:
>> FWIW that's usually written to the system log. Does dmesg say
>something
>> about the kill?
>
>While it would be nice to confirm that it was indeed the OOM killer,
>either way the crash happened because SIGKILL was sent to a parallel
>worker. There is no reason to suspect a bug.

Not impossible there's a leak somewhere though.

Andres
-- 
Sent from my Android device with K-9 Mail. Please excuse my brevity.



Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Peter Geoghegan
On Wed, Mar 7, 2018 at 5:16 PM, Tomas Vondra
 wrote:
> FWIW that's usually written to the system log. Does dmesg say something
> about the kill?

While it would be nice to confirm that it was indeed the OOM killer,
either way the crash happened because SIGKILL was sent to a parallel
worker. There is no reason to suspect a bug.

-- 
Peter Geoghegan



archive_command

2018-03-07 Thread Rui DeSousa

Hi, 

I've been encouraged to submit this code, as there has been talk in the past
about a simple pgcopy command to use with the archive_command.  Currently
there is no good solution in most base systems without introducing a
dedicated third-party Postgres tool.  The most commonly used base-system
solution today is rsync, which doesn't fsync the file after completing the
transfer, leaving nothing good to recommend.

I'm not sure what would need to be done to introduce this into a given
commitfest.  I would also like to know whether the command interface is
acceptable and/or whether other features should be added.  It is currently
very simple: it just reads from standard input and saves the data to the
indicated file.

Please let me know how to proceed.

Thanks,
Rui   

 
Example of local compressed archive:

archive_command = 'xz -c %p | fwrite /mnt/server/archivedir/%f'


Example of remote archive via a shell script:

#!/usr/bin/env bash

.
.
. 

# SSH Command and options
SSH_CMD="ssh -o ServerAliveInterval=20 $ARCH_SERVER"
STS=3

OUTPUT=$(cat $XLOGFILE | $SSH_CMD "(mkdir -p $ARCH_DIR && fwrite $ARCH_DIR/$WALFILE) 2>&1")

echo ${PIPESTATUS[@]} | grep -qE '^[0 ]+$'
if [ $? == 0 ]; then 
  STS=0
fi

exit $STS




fwrite code:


#include <err.h>
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/stat.h>


#define BUFSIZE 131072

int
main(int argc, char *argv[])
{
    int fd, r, w;
    char *buf;
    char *name;
    struct stat fstat;

    if (argc != 2) {
        fprintf(stderr, "usage: fwrite [file]\n");
        exit(1);
    }

    if ((buf = malloc(BUFSIZE)) == NULL)
        err(1, "malloc");

    ++argv;
    if ((name = (char *) malloc(strlen(*argv) + 8)) == NULL)
        err(1, "malloc");

    /* Write to a unique temporary file first; rename into place at the end. */
    strcat(strcpy(name, *argv), ".XXXXXX");

    if ((fd = mkstemp(name)) < 0)
        err(1, "mkstemp");

    while ((r = read(STDIN_FILENO, buf, BUFSIZE)) > 0)
        if ((w = write(fd, buf, r)) == -1) {
            unlink(name);
            err(1, "write");
        }

    if (r < 0)
        err(1, "read");

    /* Refuse to archive an empty file. */
    if (lseek(fd, 0, SEEK_CUR) <= 0) {
        unlink(name);
        errx(1, "zero byte file!");
    }

    /* Make sure the data is durably on disk before exposing the file. */
    if (fsync(fd) != 0)
        err(1, "fsync");

    if (close(fd) != 0)
        err(1, "close");

    /* Only move into place if the target does not already exist. */
    if (access(*argv, F_OK) < 0 && errno == ENOENT) {
        if (rename(name, *argv) != 0)
            err(1, "rename");
    } else {
        unlink(name);
        errx(1, "file exists already!");
    }

    free(name);
    exit(0);
}



Re: Add default role 'pg_access_server_files'

2018-03-07 Thread Michael Paquier
On Thu, Mar 08, 2018 at 10:15:11AM +0900, Michael Paquier wrote:
> Other than that the patch looks in pretty good shape to me.

The regression tests of file_fdw are blowing up because of an error
string that patch 2 changes.
--
Michael


signature.asc
Description: PGP signature


Re: [HACKERS] Parallel tuplesort (for parallel B-Tree index creation)

2018-03-07 Thread Tomas Vondra

On 03/07/2018 03:21 PM, Robert Haas wrote:
> On Wed, Mar 7, 2018 at 8:59 AM, Prabhat Sahu
> <prabhat.s...@enterprisedb.com> wrote:
> 
> 2018-03-07 19:24:44.263 IST [54400] LOG:  background worker
> "parallel worker" (PID 54482) was terminated by signal 9: Killed
> 
> 
> That looks like the background worker got killed by the OOM killer.  How
> much memory do you have in the machine where this occurred?
>  

FWIW that's usually written to the system log. Does dmesg say something
about the kill?

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Add default role 'pg_access_server_files'

2018-03-07 Thread Michael Paquier
On Wed, Mar 07, 2018 at 07:00:03AM -0500, Stephen Frost wrote:
> * Michael Paquier (mich...@paquier.xyz) wrote:
>> First, could it be possible to do a split for 1) and 2)?
> 
> Done, because it was less work than arguing about it, but I'm not
> convinced that we really need to split out patches to this level of
> granularity.  Perhaps something to consider for the future.

One patch should defend one idea; this makes the git history easier to
understand, in my opinion.

> Consider a case like postgresql.conf residing outside of the data
> directory.  For an application to be able to read that with
> pg_read_file() is very straight-forward and applications already exist
> which do.  Forcing that application to go through COPY would require
> creating a TEMP table and then coming up with a COPY command that
> doesn't actually *do* what COPY is meant to do- that is, parse the file.
> By default, you'd get errors from such a COPY command as it would think
> there's extra columns defined in the "copy-format" or "csv" or
> what-have-you file.

Hm, OK...

>> Could you update the documentation of pg_rewind please?  It seems to me
>> that with your patch then superuser rights are not necessary mandatory,
> 
> This will require actual testing to be done before I'd make such a
> change.  I'll see if I can do that soon, but I also wouldn't complain if
> someone else wanted to actually go through and set that up and test that
> it works.

That's what I did, manually, using the attached SQL script with your
patch 1 applied.  You can check for the list of functions used by
pg_rewind in libpq_fetch.c where you just need to grant execution access
to those functions for a login user and you are good to go:
pg_ls_dir(text, boolean, boolean)
pg_stat_file(text, boolean)
pg_read_binary_file(text)
pg_read_binary_file(text, bigint, bigint, boolean)

So getting this documented properly would be nice.  Of course feel free
to test this by yourself as you wish.
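
For instance, something like this would be enough (a sketch; the role
name is hypothetical):

  CREATE ROLE rewind_user LOGIN;
  GRANT EXECUTE ON FUNCTION pg_ls_dir(text, boolean, boolean) TO rewind_user;
  GRANT EXECUTE ON FUNCTION pg_stat_file(text, boolean) TO rewind_user;
  GRANT EXECUTE ON FUNCTION pg_read_binary_file(text) TO rewind_user;
  GRANT EXECUTE ON FUNCTION pg_read_binary_file(text, bigint, bigint, boolean) TO rewind_user;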

Could you send separate files for each patch by the way?  This eases
testing, and I don't recall that git am has a way to enforce only a
subset of patches to be applied based on one file, though my git-fu is
limited in this area.

+  read or which program is run.  In principle regular users could be allowed to
   change the other options, but that's not supported at present.
Well, the parsing of file_fdw complains if "program" or "filename" is
not defined, so a user has to be in either pg_read_server_files to use
"filename" or in pg_execute_server_program to use "program", so I am not
sure that the last sentence of this paragraph makes much sense from the
beginning.

-Return the contents of a file.
+Return the contents of a file.  Restricted to superusers by
default, but other users can be granted EXECUTE to run the function.

You should make those paragraphs span multiple lines.  EXECUTE
should use  markup, and I think that you should use "EXECUTE
privilege to run the function" on those doc portions.  That's all for
the nits.

Other than that the patch looks in pretty good shape to me.
--
Michael


rewind_grant.sql
Description: application/sql


signature.asc
Description: PGP signature


Re: INOUT parameters in procedures

2018-03-07 Thread Peter Eisentraut
On 3/6/18 04:22, Pavel Stehule wrote:
> why just OUT variables are disallowed?
> 
> The oracle initializes these values to NULL - we can do same?

The problem is function call resolution.  If we see a call like

CALL foo(a, b, c);

then this could be foo() with zero input and three output parameters, or
with one input parameter and two output parameters, etc.  We have no
code to deal with that right now.
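
To illustrate (hypothetical declarations -- neither is accepted today,
which is exactly the point), both of these could plausibly be meant by
that CALL:

  CREATE PROCEDURE foo(OUT x int, OUT y int, OUT z int)
  LANGUAGE plpgsql AS $$ BEGIN x := 1; y := 2; z := 3; END $$;

  CREATE PROCEDURE foo(IN a int, OUT y int, OUT z int)
  LANGUAGE plpgsql AS $$ BEGIN y := a; z := a * 2; END $$;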

> Minimally this message is not too friendly, there should be hint - "only
> INOUT is suported" - but better support OUT too - from TOP OUT variables
> should not be passed. from PL should be required.

Added a hint.

> I wrote recursive procedure. The call finished by exception. Why?

Fixed. (memory context issue)

I added your example as a test case.

> This issue can be detected in compile time, maybe?
> 
> postgres=# create or replace procedure p(x int,inout a int, inout b numeric)
> as $$
> begin raise notice 'xxx % %', a, b;if (x > 1) then
>   a := x / 10;
>   b := x / 2; call p(b::int, a, 10); <--- can be detected in compile time?
> end if;
> end;
> $$ language plpgsql;

Function resolution doesn't happen at compile time.  That would require
significant work in PL/pgSQL (possible perhaps, but major work).  Right
now, we do parse analysis at first execution.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 3c5ed2faab30dfcde34dfd58877e45a7f6477237 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Wed, 7 Mar 2018 19:15:35 -0500
Subject: [PATCH v3] Support INOUT parameters in procedures

In a top-level CALL, the values of INOUT parameters will be returned as
a result row.  In PL/pgSQL, the values are assigned back to the input
parameters.  In other languages, the same convention as for return a
record from a function is used.  That does not require any code changes
in the PL implementations.
---
 doc/src/sgml/plperl.sgml   |  14 +++
 doc/src/sgml/plpgsql.sgml  |  16 +++
 doc/src/sgml/plpython.sgml |  11 ++
 doc/src/sgml/pltcl.sgml|  12 ++
 doc/src/sgml/ref/create_procedure.sgml |   5 +-
 src/backend/catalog/pg_proc.c  |   3 +-
 src/backend/commands/functioncmds.c|  51 +++--
 src/backend/tcop/utility.c |   3 +-
 src/backend/utils/fmgr/funcapi.c   |  11 +-
 src/include/commands/defrem.h  |   3 +-
 src/include/funcapi.h  |   3 +-
 src/pl/plperl/expected/plperl_call.out |  25 +
 src/pl/plperl/sql/plperl_call.sql  |  22 
 src/pl/plpgsql/src/expected/plpgsql_call.out   |  89 +++
 .../plpgsql/src/expected/plpgsql_transaction.out   |   2 +-
 src/pl/plpgsql/src/pl_comp.c   |  10 +-
 src/pl/plpgsql/src/pl_exec.c   | 125 -
 src/pl/plpgsql/src/pl_funcs.c  |  25 +
 src/pl/plpgsql/src/pl_gram.y   |  38 +--
 src/pl/plpgsql/src/pl_scanner.c|   1 +
 src/pl/plpgsql/src/plpgsql.h   |  12 ++
 src/pl/plpgsql/src/sql/plpgsql_call.sql|  83 ++
 src/pl/plpython/expected/plpython_call.out |  23 
 src/pl/plpython/plpy_exec.c|  24 ++--
 src/pl/plpython/sql/plpython_call.sql  |  20 
 src/pl/tcl/expected/pltcl_call.out |  26 +
 src/pl/tcl/sql/pltcl_call.sql  |  23 
 src/test/regress/expected/create_procedure.out |   1 +
 28 files changed, 632 insertions(+), 49 deletions(-)

diff --git a/doc/src/sgml/plperl.sgml b/doc/src/sgml/plperl.sgml
index cff7a847de..9295c03db9 100644
--- a/doc/src/sgml/plperl.sgml
+++ b/doc/src/sgml/plperl.sgml
@@ -278,6 +278,20 @@ PL/Perl Functions and Arguments
hash will be returned as null values.
   
 
+  
+   Similarly, output parameters of procedures can be returned as a hash
+   reference:
+
+
+CREATE PROCEDURE perl_triple(INOUT a integer, INOUT b integer) AS $$
+my ($a, $b) = @_;
+return {a => $a * 3, b => $b * 3};
+$$ LANGUAGE plperl;
+
+CALL perl_triple(5, 10);
+
+  
+
   
 PL/Perl functions can also return sets of either scalar or
 composite types.  Usually you'll want to return rows one at a
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index c1e3c6a19d..6c25116538 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -1870,6 +1870,22 @@ Returning From a Procedure
  then NULL must be returned.  Returning any other value
  will result in an error.
 
+
+
+ If a procedure has output parameters, then the output values can be
+ assigned to the parameters as if they were variables.  For example:
+
+CREATE PROCEDURE triple(INOUT x int)
+LANGUAGE plpgsql
+AS $$
+BEGIN
+x :=

Re: [HACKERS] Subscription code improvements

2018-03-07 Thread Masahiko Sawada
On Wed, Mar 7, 2018 at 11:13 PM, Alvaro Herrera  wrote:
> 0001:
>
> there are a bunch of other messages of the same ilk in the file.  I
> don't like how the current messages are worded; maybe Peter or Petr had
> some reason why they're like that, but I would have split out the reason
> for not starting or stopping into errdetail.  Something like
>
> errmsg("logical replication apply worker for subscription \"%s\" will not 
> start", ...)
> errdetail("Subscription has been disabled during startup.")
>
> But I think we should change all these messages in unison rather than
> only one of them.
>
> Now, your patch changes "will not start" to "will stop".  But is that
> correct?  ISTM that this happens when a worker is starting and
> determines that it is not needed.  So "will not start" is more correct.
> "Will stop" would be correct if the worker had been running for some
> time and suddenly decided to terminate, but that doesn't seem to be the
> case, unless I misread the code.
>
> Your patch also changes "disabled during startup" to just "disabled".
> Is that a useful change?  ISTM that if the subscription had been
> disabled prior to startup, then the worker would not have started at
> all, so we would not be seeing this message in the first place.  Again,
> maybe I am misreading the code?  Please explain.

I think you're not misreading the code. I agree with you. "will not
start" and "disabled during startup" would be more appropriate here.
But Petr might have another opinion on this. Also, I noticed I had
overlooked one change from the v1 patch proposed by Petr. Attached is a
new patch incorporating your review comment and the hunk I overlooked.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center


0001-Improve-messaging-during-logical-replication-worker-_v3.patch
Description: Binary data


Re: [HACKERS] Lazy hash table for XidInMVCCSnapshot (helps Zipfian a bit)

2018-03-07 Thread Tomas Vondra
On 03/06/2018 06:23 AM, Yura Sokolov wrote:
> 05.03.2018 18:00, Tom Lane пишет:
>> Tomas Vondra  writes:
>>> Snapshots are static (we don't really add new XIDs into existing ones,
>>> right?), so why don't we simply sort the XIDs and then use bsearch to
>>> lookup values? That should fix the linear search, without need for any
>>> local hash table.
>>
>> +1 for looking into that, since it would avoid adding any complication
>> to snapshot copying.  In principle, with enough XIDs in the snap, an
>> O(1) hash probe would be quicker than an O(log N) bsearch ... but I'm
>> dubious that we are often in the range where that would matter.
>> We do need to worry about the cost of snapshot copying, too.
> 
> Sorting - is the first thing I've tried. Unfortunately, sorting
> itself eats too much cpu. Filling hash table is much faster.
> 

I've been interested in the sort-based approach, so I've spent a bit of
time hacking on it (patch attached). It's much less invasive compared to
the hash-table, but Yura is right the hashtable gives better results.

I've tried to measure the benefits using the same script I shared on
Tuesday, but I kept getting really strange numbers. In fact, I've been
unable to even reproduce the results I shared back then. And after a bit
of head-scratching I think the script is useless - it can't possibly
generate snapshots with many XIDs because all the update threads sleep
for exactly the same time. And first one to sleep is first one to wake
up and commit, so most of the other XIDs are above xmax (and thus not
included in the snapshot). I have no idea why I got the numbers :-/

But with this insight, I quickly modified the script to also consume
XIDs by another thread (which simply calls txid_current). With that I'm
getting snapshots with as many XIDs as there are UPDATE clients (this
time I actually checked that using gdb).

And for a 60-second run the tps results look like this (see the attached
chart, showing the same data):

   writers    master    hash    sort
   ----------------------------------
        16      1068    1057    1070
        32      1005     995    1033
        64      1064    1042    1117
       128      1058    1021    1051
       256       977    1004     928
       512       773     935     808
       768       576     815     670
      1000       413     752     573

The sort certainly does improve performance compared to master, but it's
only about half of the hashtable improvement.

I don't know how much this simple workload resembles the YCSB benchmark,
but I seem to recall it's touching individual rows. In which case it's
likely worse due to the pg_qsort being more expensive than building the
hash table.

So I agree with Yura the sort is not a viable alternative to the hash
table, in this case.

> Can I rely on snapshots being static? May be then there is no need
> in separate raw representation and hash table. I may try to fill hash
> table directly in GetSnapshotData instead of lazy filling. Though it
> will increase time under lock, so it is debatable and should be
> carefully measured.
> 

Yes, I think you can rely on snapshots not being modified later. That's
pretty much the definition of a snapshot.

You may do that in GetSnapshotData, but you certainly can't do that in
the for loop there. Not only we don't want to add more work there, but
you don't know the number of XIDs in that loop. And we don't want to
build hash tables for small number of XIDs.

One reason against building the hash table in GetSnapshotData is that
we'd build it even when the snapshot is never queried. Or when it is
queried, but we only need to check xmin/xmax.
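
Something like the following shape is what I'd expect instead -- only a
sketch, with hypothetical names (SNAPSHOT_HASH_THRESHOLD, xiphash,
xip_hash_build/lookup), to show the lazy construction on first probe:

  static bool
  XidInSnapshotXip(TransactionId xid, Snapshot snapshot)
  {
      uint32      i;

      /* small snapshots: the existing linear search is cheapest */
      if (snapshot->xcnt < SNAPSHOT_HASH_THRESHOLD)
      {
          for (i = 0; i < snapshot->xcnt; i++)
              if (TransactionIdEquals(snapshot->xip[i], xid))
                  return true;
          return false;
      }

      /* large snapshots: build the hash table only when first probed */
      if (snapshot->xiphash == NULL)
          snapshot->xiphash = xip_hash_build(snapshot->xip, snapshot->xcnt);

      return xip_hash_lookup(snapshot->xiphash, xid);
  }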

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/storage/ipc/procarray.c b/src/backend/storage/ipc/procarray.c
index afe1c03..b2dae85 100644
--- a/src/backend/storage/ipc/procarray.c
+++ b/src/backend/storage/ipc/procarray.c
@@ -1758,6 +1758,8 @@ GetSnapshotData(Snapshot snapshot)
 	snapshot->active_count = 0;
 	snapshot->regd_count = 0;
 	snapshot->copied = false;
+	snapshot->xip_sorted = false;
+	snapshot->subxip_sorted = false;
 
 	if (old_snapshot_threshold < 0)
 	{
diff --git a/src/backend/utils/time/tqual.c b/src/backend/utils/time/tqual.c
index f7c4c91..f155267 100644
--- a/src/backend/utils/time/tqual.c
+++ b/src/backend/utils/time/tqual.c
@@ -1461,6 +1461,45 @@ HeapTupleIsSurelyDead(HeapTuple htup, TransactionId OldestXmin)
 	return TransactionIdPrecedes(HeapTupleHeaderGetRawXmax(tuple), OldestXmin);
 }
 
+static int
+cmp_transaction_id(const void *a, const void *b)
+{
+	return memcmp(a, b, sizeof(TransactionId));
+}
+
+/*
+ * XidInXip searches xid in xip array.
+ *
+ * When xcnt is smaller than SnapshotSortThreshold, then the array is unsorted
+ * and we can simply do a linear search. If there are more elements, the array
+ * is sorte

RE: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Tsunakawa, Takayuki
From: Alvaro Herrera [mailto:alvhe...@alvh.no-ip.org]
> The thing that comes to mind when reading this patch is that some time ago
> we made fun of other database software, "they are so complicated to configure,
> they have some magical settings that few people understand how to set".
> Postgres was so much better because it was simple to set up, no magic crap.
> But now it becomes apparent that that only was so because Postgres sucked,
> ie., we hadn't yet gotten to the point where we
> *needed* to introduce settings like that.  Now we finally are?

Yes.  We are now facing the problem of too much memory use by PostgreSQL, where
some applications randomly access about 200,000 tables.  It is estimated
based on a small experiment that each backend will use several to ten GBs of 
local memory for CacheMemoryContext.  The total memory use will become over 1 
TB when the expected maximum connections are used.
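(For instance, assuming roughly 5 GB of CacheMemoryContext per backend, a
little over 200 concurrent connections would already exceed 1 TB.)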

I haven't looked at this patch, but does it evict all kinds of entries in 
CacheMemoryContext, ie. relcache, plancache, etc?

Regards
Takayuki Tsunakawa








Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Michael Paquier
On Wed, Mar 07, 2018 at 06:39:32PM -0500, Tom Lane wrote:
> OK, seems like I'm on the short end of that vote.  I propose to push the
> GUC-crosschecking patch I posted yesterday, but not the default-value
> change, and instead push Kyotaro-san's initdb change.  Should we back-patch
> these things to v10 where the problem appeared?

I would vote for a backpatch.  If anybody happens to run initdb on v10
and gets max_connections to 10, that would immediately cause a failure.
We could also wait for somebody to actually complain about that, but a
bit of prevention does not hurt to ease future user experience on this
released version.
--
Michael


signature.asc
Description: PGP signature


Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Tom Lane
Robert Haas  writes:
> On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost  wrote:
>> Changing the defaults to go back down strikes me as an entirely wrong
>> approach after we've had a release with the higher defaults without
>> seriously compelling arguments against, and I don't agree that we've had
>> such a case made here.

> +1.  I don't see any real downside of increasing the minimum value of
> max_connections to 20.  I wasn't particularly a fan of raising
> max_wal_senders to 10, but a lot of other people were, and so far
> nobody's reported any problems related to that setting (that I know
> about).

OK, seems like I'm on the short end of that vote.  I propose to push the
GUC-crosschecking patch I posted yesterday, but not the default-value
change, and instead push Kyotaro-san's initdb change.  Should we back-patch
these things to v10 where the problem appeared?

regards, tom lane



Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation

2018-03-07 Thread Tom Lane
Robert Haas  writes:
> On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer  wrote:
>> So I can't say it's definitely impossible. It seems astonishingly unlikely,
>> but that's not always good enough.

> Race conditions tend to happen a lot more often than one might think.

Just to back that up --- we've seen cases where people could repeatably
hit race-condition windows that are just an instruction or two wide.
The first one I came to in an idle archive search is
https://www.postgresql.org/message-id/15543.1130714273%40sss.pgh.pa.us
I vaguely recall others but don't feel like digging harder right now.

regards, tom lane



Re: faster testing with symlink installs

2018-03-07 Thread Tom Lane
Robert Haas  writes:
> On Wed, Feb 28, 2018 at 9:34 PM, Peter Eisentraut
>  wrote:
>> Except ... this doesn't actually work.  find_my_exec() resolves symlinks
>> to find the actual program installation, and so for example the
>> installed initdb will look for postgres in src/bin/initdb/.  I would
>> like to revert this behavior, because it seems to do more harm than
>> good.  The original commit message indicates that this would allow
>> symlinking a program to somewhere outside of the installation tree and
>> still allow it to work and find its friends.  But that could just as
>> well be done with a shell script.

> My initial gut feeling is that changing this would hurt more people
> than it would help.

I agree.  My recollection is that we expended substantial sweat to make
that type of setup work, and I do not think it was for idle reasons.
The fact that the behavior is very old doesn't mean it was a bad choice.
(Also, the fact that the commit message didn't explain the reasoning in
detail is not much of an argument; we didn't go in for that back then.)

regards, tom lane



Re: Two-phase update of restart_lsn in LogicalConfirmReceivedLocation

2018-03-07 Thread Robert Haas
On Thu, Mar 1, 2018 at 2:03 AM, Craig Ringer  wrote:
> So I can't say it's definitely impossible. It seems astonishingly unlikely,
> but that's not always good enough.

Race conditions tend to happen a lot more often than one might think.
If there's a theoretical opportunity for this to go wrong, it would
probably be a good idea to fix it.  Not that I'm volunteering.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: unused includes in test_decoding

2018-03-07 Thread Robert Haas
On Wed, Feb 28, 2018 at 6:59 PM, Euler Taveira  wrote:
> This is a cosmetic patch that removes some unused includes in
> test_decoding. It seems to be like this since day 1 (9.4).

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: public schema default ACL

2018-03-07 Thread David G. Johnston
On Wed, Mar 7, 2018 at 2:48 PM, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> On 3/7/18 10:05, Stephen Frost wrote:
> > I liken this to a well-known and well-trodden feature for auto creating
> > user home directories on Unix.
>
> I don't think likening schemas to home directories is really addressing
> the most typical use cases.  Database contents are for the most part
> carefully constructed in a collaborative way.


Databases intended to be deployed to production (hopefully) are, but not
necessarily those intended to evaluate PostgreSQL's capabilities.


> The fix is probably to not let them do that.  What is
> being discussed here instead is to let them do whatever they want in
> their own non-shared spaces.  That addresses the security concern, but
> it doesn't support the way people actually work right now.
>

Maybe not the majority of users, but the way DBAs work today is already
inherently secure (i.e., not using public) and requires a non-trivial
amount of DBA work (i.e., creating groups and users) to make happen.  They
are not the target audience.

The target user profile for this discussion is one who does:

sudo apt install postgresql-10
sudo -u postgres createuser myosusername
psql myosusername postgres
> CREATE TABLE test_table (id serial primary key);
> insert into test_table default values;
> select * from test_table;

We want to avoid having the create table fail now whereas it worked before
we removed create permissions on public from PUBLIC.

Now, I'd argue that people aren't bothering to "createuser" in the above
but simply skipping to "psql" and then to "sudo -U postgres psql" when they
get the error that "user myosusername" doesn't exist...once they start
creating new users I'd agree that they likely benefit more from us being
conservative and "do only what I say" as opposed to being helpful and doing
more stuff in the name of usability.

I still feel like I want to mull this over more but auto-creating schemas
strikes me as being "spooky action at a distance".

David J.


Re: faster testing with symlink installs

2018-03-07 Thread Robert Haas
On Wed, Feb 28, 2018 at 9:34 PM, Peter Eisentraut
 wrote:
> Except ... this doesn't actually work.  find_my_exec() resolves symlinks
> to find the actual program installation, and so for example the
> installed initdb will look for postgres in src/bin/initdb/.  I would
> like to revert this behavior, because it seems to do more harm than
> good.  The original commit message indicates that this would allow
> symlinking a program to somewhere outside of the installation tree and
> still allow it to work and find its friends.  But that could just as
> well be done with a shell script.
>
> Reverting this behavior would also allow Homebrew-like installations to
> work better.  The actual installation is in a versioned directory like
> /usr/local/Cellar/postgresql/10.1/... but symlinked to
> /usr/local/opt/postgresql/.  But because of the symlink resolution,
> calling /usr/local/opt/postgresql/bin/pg_config returns paths under
> Cellar, which then causes breakage of software built against that path
> on postgresql upgrades the change the version number.

My initial gut feeling is that changing this would hurt more people
than it would help.  It seems entirely reasonable to install
PostgreSQL in, say, /opt/local/postgresql, and then symlink psql and
pg_ctl into /opt/local/bin or /usr/bin or wherever you like to find
your binaries.  I don't think we want to break that.  It's true that
it will work if you symlink everything, as in your Homebrew example,
but you might not.  TBH I find that Homebrew example pretty odd.  I
would understand installing each major release in a version directory,
but putting every point release in a different versioned directory
seems like a bad plan.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: public schema default ACL

2018-03-07 Thread Peter Eisentraut
On 3/7/18 10:05, Stephen Frost wrote:
> I liken this to a well-known and well-trodden feature for auto creating
> user home directories on Unix.

I don't think likening schemas to home directories is really addressing
the most typical use cases.  Database contents are for the most part
carefully constructed in a collaborative way.  If your organization has
three DBAs foo, bar, and baz, it's quite unlikely that they will want to
create or look at objects in schemas named foo, bar, or baz.  More
likely, they will be interested in the schemas myapp or myotherapp.  Or
they don't care about schemas and will want the database to behave as if
there wasn't a schema layer between the database and the tables.

The existing structures are not bad.  They work for a lot of users.  The
problem is just that by default everyone can do whatever they want in a
shared space.  The fix is probably to not let them do that.  What is
being discussed here instead is to let them do whatever they want in
their own non-shared spaces.  That addresses the security concern, but
it doesn't support the way people actually work right now.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] SERIALIZABLE with parallel query

2018-03-07 Thread Robert Haas
On Wed, Feb 28, 2018 at 11:35 PM, Thomas Munro
 wrote:
> On Mon, Feb 26, 2018 at 6:37 PM, Thomas Munro
>  wrote:
>> I've now broken it into two patches.
>
> Rebased.

+SerializableXactHandle
+ShareSerializableXact(void)
+{
+Assert(!IsParallelWorker());
+
+return MySerializableXact;
+}

Uh, how's that OK?  There's no rule that you can't create a
ParallelContext in a worker.  Parallel query currently doesn't, so it
probably won't happen, but burying an assertion to that effect in the
predicate locking code doesn't seem nice.

Is "sxact" really the best (i.e. clearest) name we can come up with
for the lock tranche?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: csv format for psql

2018-03-07 Thread David Fetter
On Wed, Mar 07, 2018 at 09:37:26PM +0100, Pavel Stehule wrote:
> 2018-03-07 21:31 GMT+01:00 Daniel Verite :
> 
> > David Fetter wrote:
> >
> > > We have some inconsistency here in that fewer table formats are
> > > supported, but I think asciidoc, etc., do this correctly via
> > > invocations like:
> > >
> > >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
> >
> > -A is equivalent to -P format=unaligned, so in the above
> > invocation, it cancels the effect of -P format=asciidoc.
> > Anyway -P format=name on the command line
> > is the same as "\pset format name" as a
> > metacommand, so it works for all formats.
> >
> > Some formats (unaligned, html)  have corresponding
> > command-line options (-A, -H), and others don't.
> > In this patch, -C is used so that csv would be in the
> > category of formats that can be switched on with the simpler
> > invocation on the command line.
> > If we don't like that, we can leave out -C for future use
> > and let users write -P format=csv.
> > That's not the best choice from my POV though, as csv
> > is a primary choice to export tabular data.
> >
> 
> -C could be wanted for certificates or something similar. I like csv, but I am
> not sure it is so important that it should get a short option (the list of free
> characters will only get shorter).

+1 for not using up a single-letter option for this.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: RFC: Add 'taint' field to pg_control.

2018-03-07 Thread Robert Haas
On Wed, Feb 28, 2018 at 8:03 PM, Craig Ringer  wrote:
> A huge +1 from me for the idea. I can't even count the number of black box
> "WTF did you DO?!?" servers I've looked at, where bizarre behaviour has
> turned out to be down to the user doing something very silly and not saying
> anything about it.

+1 from me, too.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Server won't start with fallback setting by initdb.

2018-03-07 Thread Robert Haas
On Tue, Mar 6, 2018 at 10:51 PM, Stephen Frost  wrote:
> Changing the defaults to go back down strikes me as an entirely wrong
> approach after we've had a release with the higher defaults without
> seriously compelling arguments against, and I don't agree that we've had
> such a case made here.  If this discussion had happened before v10 was
> released, I'd be much more open to going with the suggestion of '5', but
> forcing users to update their configs for working environments because
> we've decided that the default of 10 was too high is just pedantry, in
> my opinion.

+1.  I don't see any real downside of increasing the minimum value of
max_connections to 20.  I wasn't particularly a fan of raising
max_wal_senders to 10, but a lot of other people were, and so far
nobody's reported any problems related to that setting (that I know
about).

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 21:31 GMT+01:00 Daniel Verite :

> David Fetter wrote:
>
> > We have some inconsistency here in that fewer table formats are
> > supported, but I think asciidoc, etc., do this correctly via
> > invocations like:
> >
> >psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
>
> -A is equivalent to -P format=unaligned, so in the above
> invocation, it cancels the effect of -P format=asciidoc.
> Anyway -P format=name on the command line
> is the same as "\pset format name" as a
> metacommand, so it works for all formats.
>
> Some formats (unaligned, html)  have corresponding
> command-line options (-A, -H), and others don't.
> In this patch, -C is used so that csv would be in the
> category of formats that can be switched on with the simpler
> invocation on the command line.
> If we don't like that, we can leave out -C for future use
> and let users write -P format=csv.
> That's not the best choice from my POV though, as csv
> is a primary choice to export tabular data.
>

-C could be wanted for certificates or something similar. I like csv, but I am
not sure it is so important that it should get a short option (the list of free
characters will only get shorter).

Regards

Pavel



>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: FOR EACH ROW triggers on partitioned tables

2018-03-07 Thread Thomas Munro
On Thu, Mar 8, 2018 at 6:17 AM, Alvaro Herrera  wrote:
> Here's another version of this patch.  It is virtually identical to the
> previous one, except for a small doc update and whitespace changes.

What is this test for?

+create trigger failed after update on parted_trig
+  referencing old table as old_table
+  for each statement execute procedure trigger_nothing();

It doesn't fail as you apparently expected.  Perhaps it was supposed
to be "for each row" so you could hit your new error with
errdetail("Triggers on partitioned tables cannot have transition
tables.")?

-- 
Thomas Munro
http://www.enterprisedb.com



Re: csv format for psql

2018-03-07 Thread Daniel Verite
David Fetter wrote:

> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
> 
>psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

-A is equivalent to -P format=unaligned, so in the above
invocation, it cancels the effect of -P format=asciidoc.
Anyway -P format=name on the command line
is the same as "\pset format name" as a
metacommand, so it works for all formats.

Some formats (unaligned, html)  have corresponding
command-line options (-A, -H), and others don't.
In this patch, -C is used so that csv would be in the
category of formats that can be switched on with the simpler
invocation on the command line.
If we don't like that, we can leave out -C for future use
and let users write -P format=csv.
That's not the best choice from my POV though, as csv
is a primary choice to export tabular data.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 6:01 AM, Alvaro Herrera  wrote:
> The thing that comes to mind when reading this patch is that some time
> ago we made fun of other database software, "they are so complicated to
> configure, they have some magical settings that few people understand
> how to set".  Postgres was so much better because it was simple to set
> up, no magic crap.  But now it becomes apparent that that only was so
> because Postgres sucked, ie., we hadn't yet gotten to the point where we
> *needed* to introduce settings like that.  Now we finally are?
>
> I have to admit being a little disappointed about that outcome.

I think your disappointment is a little excessive.  I am not convinced
of the need either for this to have any GUCs at all, but if it makes
other people happy to have them, then I think it's worth accepting
that as the price of getting the feature into the tree.  These are
scarcely the first GUCs we have that are hard to tune.  work_mem is a
terrible knob, and there are probably very few people who know
how to set ssl_ecdh_curve to anything other than the default, and
what's geqo_selection_bias good for, anyway?  I'm not sure what makes
the settings we're adding here any different.  Most people will ignore
them, and a few people who really care can change the values.

> I wonder if this is just because we refuse to acknowledge the notion of
> a connection pooler.  If we did, and the pooler told us "here, this
> session is being given back to us by the application, we'll keep it
> around until the next app comes along", could we clean the oldest
> inactive cache entries at that point?  Currently they use DISCARD for
> that.  Though this does nothing to fix hypothetical cache bloat for
> pg_dump in bug #14936.

We could certainly clean the oldest inactive cache entries at that
point, but there's no guarantee that would be the right thing to do.
If the working set across all applications is small enough that you
can keep them all in the caches all the time, then you should do that,
for maximum performance.  If not, DISCARD ALL should probably flush
everything that the last application needed and the next application
won't.  But without some configuration knob, you have zero way of
knowing how concerned the user is about saving memory in this place
vs. improving performance by reducing catalog scans.  Even with such a
knob it's a little difficult to say which things actually ought to be
thrown away.

I think this is a related problem, but a different one.  I also think
we ought to have built-in connection pooling.  :-)

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: planner failure with ProjectSet + aggregation + parallel query

2018-03-07 Thread Robert Haas
On Mon, Mar 5, 2018 at 10:38 AM, Robert Haas  wrote:
> While trying to track down a bug today, I found a different bug.
>
> As of 6946280cded903b6f5269fcce105f8ab1d455d33:
>
> rhaas=# create table foo (a int);
> CREATE TABLE
> rhaas=# set min_parallel_table_scan_size = 0;
> SET
> rhaas=# set parallel_setup_cost = 0;
> SET
> rhaas=# set parallel_tuple_cost = 0;
> SET
> rhaas=# select generate_series(1, a) from foo group by a;
> ERROR:  ORDER/GROUP BY expression not found in targetlist
>
> Without the SET commands, or without the GROUP BY, or without the SRF,
> it successfully constructs a plan.

I am able to reproduce this on commit
69f4b9c85f168ae006929eec44fc44d569e846b9 (Move targetlist SRF handling
from expression evaluation to new executor node) with the following
modification for a GUC rename:

create table foo (a int);
--set min_parallel_table_scan_size = 0;
set min_parallel_relation_size = 0;
set parallel_setup_cost = 0;
set parallel_tuple_cost = 0;
select generate_series(1, a) from foo group by a;

But on the previous commit I can't reproduce it.  So it looks to me
like that's when it got broken.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: csv format for psql

2018-03-07 Thread Daniel Verite
David Fetter wrote:

> This seems pretty specialized.  If we're adding something new, how about 
> 
>psql --format=csv -o foo.csv -c 'TABLE foo'

It's a bit easier to memorize than -P format=csv,
but psql doesn't have any long option that does not have a short
form with a single letter, and both -F and -f are already taken,
unlike -C, which isn't used so far.

> Or we could stick with:
> 
>psql -P format=csv -o foo.csv -c 'TABLE foo'

That already works as of the current patch, just
like with the other formats.
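
For illustration, a session with the patched psql could look like this
(a sketch only; the confirmation message and the exact output are assumed
from the patch's description: comma separator, RFC 4180 quoting, header row
unless tuples_only is on):

  => \pset format csv
  Output format is csv.
  => SELECT 1 AS id, 'hello, world' AS msg;
  id,msg
  1,"hello, world"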

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite



Re: csv format for psql

2018-03-07 Thread Daniel Verite
 I wrote:

> a better idea would be to have a new \pset fieldsep_csv

PFA a v3 patch that implements that, along with
regression tests this time.

Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index bfdf859..8a0e7a1 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -2557,6 +2557,19 @@ lo_import 152801
   
 
   
+  fieldsep_csv
+  
+  
+  Specifies the field separator to be used in the csv format.
+  When the separator appears in a field value, that field
+  is output inside double quotes according to the csv rules.
+  To set a tab as field separator, type \pset fieldsep_csv
+  '\t'. The default is a comma.
+  
+  
+  
+
+  
   fieldsep_zero
   
   
@@ -2585,7 +2598,7 @@ lo_import 152801
   
   
   Sets the output format to one of unaligned,
-  aligned, wrapped,
+  aligned, csv, 
wrapped,
   html, asciidoc,
   latex (uses tabular),
   latex-longtable, or
@@ -2597,14 +2610,22 @@ lo_import 152801
   unaligned format writes all columns of a 
row on one
   line, separated by the currently active field separator. This
   is useful for creating output that might be intended to be read
-  in by other programs (for example, tab-separated or comma-separated
-  format).
+  in by other programs.
   
 
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
 
+ csv format writes columns separated by
+ commas, applying the quoting rules described in RFC-4180.
+ Alternative separators can be selected with \pset 
fieldsep_csv.
+ The output is compatible with the CSV format of the 
COPY
+ command. The header with column names is output unless the
+ tuples_only parameter is on.
+ Title and footers are not printed.
+ 
+
   wrapped format is like 
aligned but wraps
   wide data values across lines to make the output fit in the target
   column width.  The target width is determined as described under
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318..c543b1f 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1960,8 +1960,8 @@ exec_command_pset(PsqlScanState scan_state, bool 
active_branch)
 
int i;
static const char *const my_list[] = {
-   "border", "columns", "expanded", "fieldsep", 
"fieldsep_zero",
-   "footer", "format", "linestyle", "null",
+   "border", "columns", "expanded", "fieldsep", 
"fieldsep_csv",
+   "fieldsep_zero", "footer", "format", 
"linestyle", "null",
"numericlocale", "pager", "pager_min_lines",
"recordsep", "recordsep_zero",
"tableattr", "title", "tuples_only",
@@ -3603,6 +3603,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3674,9 +3677,11 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
popt->topt.format = PRINT_LATEX_LONGTABLE;
else if (pg_strncasecmp("troff-ms", value, vallen) == 0)
popt->topt.format = PRINT_TROFF_MS;
+   else if (pg_strncasecmp("csv", value, vallen) == 0)
+   popt->topt.format = PRINT_CSV;
else
{
-   psql_error("\\pset: allowed formats are unaligned, 
aligned, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n");
+   psql_error("\\pset: allowed formats are unaligned, 
aligned, csv, wrapped, html, asciidoc, latex, latex-longtable, troff-ms\n");
return false;
}
}
@@ -3804,6 +3809,15 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
}
}
 
+   else if (strcmp(param, "fieldsep_csv") == 0)
+   {
+   if (value)
+   {
+   free(popt->topt.fieldSepCsv);
+   popt->topt.fieldSepCsv = pg_strdup(value);
+   }
+   }
+
else if (strcmp(param, "fieldsep_zero") == 0)
{
free(popt->topt.fieldSep.separator);
@@ -3959,6 +3973,13 @@ printP

Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 20:25 GMT+01:00 David Fetter :

> On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
> >
> > >>  psql --csv -c 'TABLE foo' > foo.csv
> > >>
> > >>With a -c to introduce the command.
> > >
> > >This seems pretty specialized.  If we're adding something new, how about
> > >
> > >   psql --format=csv -o foo.csv -c 'TABLE foo'
> > >
> > >Or we could stick with:
> > >
> > >   psql -P format=csv -o foo.csv -c 'TABLE foo'
> >
> > Currently "-P format=csv" uses the unaligned formating separators, i.e.
> '|'
> > is used. I was suggesting that a special long option could switch several
> > variables to some specific values, i.e.
> >
> >   --csv
> >
> > Would be equivalent to something like:
> >
> >   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...
>
> We have some inconsistency here in that fewer table formats are
> supported, but I think asciidoc, etc., do this correctly via
> invocations like:
>
> psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'
>
> > I.e. really generate some csv from the data in just one option, not many.
> >
> > But this is obviously debatable.
>
> I suspect we'll get requests for an all-JSON option, HTML tables,
> etc., assuming we don't have them already.
>
> I'm hoping we can have that all in one framework.  I get that setting
> each of tuples_only, fieldsep, recordsep, etc.  might be a bit of a
> lift for some users, but it's not clear how we'd make a sane default
> that made choices among those correct for enough users.
>
> For example, do we know that we want tuples_only behavior by default?
> A lot of people's CSV tools assume a header row.
>

I am in favor of a default header - it can be disabled by the -t option

Pavel


>
> Best,
> David.
> --
> David Fetter  http://fetter.org/
> Phone: +1 415 235 3778
>
> Remember to vote!
> Consider donating to Postgres: http://www.postgresql.org/about/donate
>


Re: csv format for psql

2018-03-07 Thread David Fetter
On Wed, Mar 07, 2018 at 08:04:05PM +0100, Fabien COELHO wrote:
> 
> >>  psql --csv -c 'TABLE foo' > foo.csv
> >>
> >>With a -c to introduce the command.
> >
> >This seems pretty specialized.  If we're adding something new, how about
> >
> >   psql --format=csv -o foo.csv -c 'TABLE foo'
> >
> >Or we could stick with:
> >
> >   psql -P format=csv -o foo.csv -c 'TABLE foo'
> 
> Currently "-P format=csv" uses the unaligned formating separators, i.e. '|'
> is used. I was suggesting that a special long option could switch several
> variables to some specific values, i.e.
> 
>   --csv
> 
> Would be equivalent to something like:
> 
>   -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

We have some inconsistency here in that fewer table formats are
supported, but I think asciidoc, etc., do this correctly via
invocations like:

psql -P format=asciidoc -o foo.adoc -AtXc 'TABLE foo'

> I.e. really generate some csv from the data in just one option, not many.
> 
> But this is obviously debatable.

I suspect we'll get requests for an all-JSON option, HTML tables,
etc., assuming we don't have them already.

I'm hoping we can have that all in one framework.  I get that setting
each of tuples_only, fieldsep, recordsep, etc.  might be a bit of a
lift for some users, but it's not clear how we'd make a sane default
that made choices among those correct for enough users.

For example, do we know that we want tuples_only behavior by default?
A lot of people's CSV tools assume a header row.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Fix missing spaces in docs

2018-03-07 Thread Fabrízio de Royes Mello
Hi all,

The attached patch just fix missing spaces in documentation of CREATE
SERVER and CREATE USER MAPPING.

Regards,

-- 
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
>> Timbira: http://www.timbira.com.br
>> Blog: http://fabriziomello.github.io
>> Linkedin: http://br.linkedin.com/in/fabriziomello
>> Twitter: http://twitter.com/fabriziomello
>> Github: http://github.com/fabriziomello
diff --git a/doc/src/sgml/ref/create_server.sgml b/doc/src/sgml/ref/create_server.sgml
index eb4ca89..af0a7a0 100644
--- a/doc/src/sgml/ref/create_server.sgml
+++ b/doc/src/sgml/ref/create_server.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE SERVER [IF NOT EXISTS] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
+CREATE SERVER [ IF NOT EXISTS ] server_name [ TYPE 'server_type' ] [ VERSION 'server_version' ]
 FOREIGN DATA WRAPPER fdw_name
 [ OPTIONS ( option 'value' [, ... ] ) ]
 
diff --git a/doc/src/sgml/ref/create_user_mapping.sgml b/doc/src/sgml/ref/create_user_mapping.sgml
index c2f5278..9719a4f 100644
--- a/doc/src/sgml/ref/create_user_mapping.sgml
+++ b/doc/src/sgml/ref/create_user_mapping.sgml
@@ -21,7 +21,7 @@ PostgreSQL documentation
 
  
 
-CREATE USER MAPPING [IF NOT EXISTS] FOR { user_name | USER | CURRENT_USER | PUBLIC }
+CREATE USER MAPPING [ IF NOT EXISTS ] FOR { user_name | USER | CURRENT_USER | PUBLIC }
 SERVER server_name
 [ OPTIONS ( option 'value' [ , ... ] ) ]
 


Re: csv format for psql

2018-03-07 Thread Fabien COELHO



  psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.


This seems pretty specialized.  If we're adding something new, how about

   psql --format=csv -o foo.csv -c 'TABLE foo'

Or we could stick with:

   psql -P format=csv -o foo.csv -c 'TABLE foo'


Currently "-P format=csv" uses the unaligned formating separators, i.e. 
'|' is used. I was suggesting that a special long option could switch 
several variables to some specific values, i.e.


  --csv

Would be equivalent to something like:

  -P format=csv -P fieldsep=, -P recordsep=\n (?) -P tuples_only=on ...

I.e. really generate some csv from the data in just one option, not many.

But this is obviously debatable.

--
Fabien.



Re: parallel append vs. simple UNION ALL

2018-03-07 Thread Robert Haas
On Wed, Mar 7, 2018 at 5:36 AM, Rajkumar Raghuwanshi
 wrote:
> With 0001 applied on PG-head, I got reference leak warning and later a
> server crash.
> this crash is reproducible with enable_parallel_append=off also.
> below is the test case to reproduce this.

New patches attached, fixing all 3 of the issues you reported:

0001 is a new patch to fix the incorrect parallel safety marks on
upper relations.  I don't know of a visible effect of this patch by
itself, but there might be one.

0002 is the same as the old 0001, but I made a fix in
SS_charge_for_initplans() which fixed your most recent crash report.
Either this or the previous change also fixed the crash you saw when
using tab-completion.  Also, I added some test cases based on your
failing examples.

0003-0005 are the same as the old 0002-0004.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


0005-Consider-Parallel-Append-as-a-way-to-implement-a-uni.patch
Description: Binary data


0004-Generate-a-separate-upper-relation-for-each-stage-of.patch
Description: Binary data


0003-Rewrite-recurse_union_children-to-iterate-rather-tha.patch
Description: Binary data


0002-Let-Parallel-Append-over-simple-UNION-ALL-have-parti.patch
Description: Binary data


0001-Correctly-assess-parallel-safety-of-tlists-when-SRFs.patch
Description: Binary data


Re: pgbench's expression parsing & negative numbers

2018-03-07 Thread Fabien COELHO


Hello Andres,


working on overflow correctness in pg I noticed that pgbench isn't quite
there.


Indeed.

I assume it won't matter terribly often, but the way it parses integers 
makes it incorrect for, at least, the negativemost number. [...] but 
that unfortunately means that the sign is not included in the call to 
strtoint64.


The strtoint64 function is indeed a mess. It does not really report errors 
(more precisely, an error message is printed out, but the execution goes 
on nevertheless...).



Which in turn means you can't correctly parse PG_INT64_MIN,
because that's not representable as a positive number on a two's
complement machine (i.e. everywhere).  Preliminary testing shows that
that can relatively easily be fixed by just prefixing [-+]? to the relevant
exprscan.l rules.


I'm not sure I get it, because then "1 -2" would be scanned as "int 
signed_int", which requires adding some fine rules to the parser to handle 
that as an addition, and I am unclear about the priority handling,
eg "1 -2 * 3". But I agree that it can be made to work by transferring
the conversion to the parser and maybe some careful tweaking there.


But it might also not be a bad idea to simply defer
parsing of the number until exprparse.y has its hand on the number?

There's plenty other things wrong with overflow handling too, especially
evalFunc() basically just doesn't care.


Indeed.

Some overflow issues are easy to fix with the existing pg_*_s64_overflow 
macros that you provided. It should also handle double2int64 overflows.


I have tried to trigger errors with a -fsanitize=signed-integer-overflow 
compilation, without much success, but ISTM that you suggested that 
pgbench does not work with that in another thread. Could you be more 
precise?



I'll come up with a patch for that sometime soon.


ISTM that you have not sent any patch on the subject, otherwise I would 
have reviewed it. Maybe I could do one some time later, unless you think 
that I should not.


--
Fabien.



Re: csv format for psql

2018-03-07 Thread David Fetter
On Wed, Mar 07, 2018 at 07:40:49PM +0100, Fabien COELHO wrote:
> 
> Hello Pavel,
> 
> >>psql --csv 'TABLE Stuff;' > stuff.csv
> >
> >There is command -c and it should be used. The --csv option should not
> >have a parameter. I don't like the idea of having more options for query
> >execution.
> 
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
> 
>   psql --csv -c 'TABLE foo' > foo.csv
> 
> With a -c to introduce the command.

This seems pretty specialized.  If we're adding something new, how about 

psql --format=csv -o foo.csv -c 'TABLE foo'

Or we could stick with:

psql -P format=csv -o foo.csv -c 'TABLE foo'

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: csv format for psql

2018-03-07 Thread Pavel Stehule
2018-03-07 19:40 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
> psql --csv 'TABLE Stuff;' > stuff.csv
>>>
>>
>> There is command -c and it should be used. The --csv option should not
>> have a parameter. I don't like the idea of having more options for query
>> execution.
>>
>
> Yes, I agree and that is indeed what I meant, sorry for the typo. The
> cleaner example would be something like:
>
>   psql --csv -c 'TABLE foo' > foo.csv
>
> With a -c to introduce the command.
>

OK :) it makes sense now

Regards

Pavel

>
> --
> Fabien.
>


Re: csv format for psql

2018-03-07 Thread Fabien COELHO


Hello Pavel,


psql --csv 'TABLE Stuff;' > stuff.csv


There is command -c and it should be used. The --csv option should not
have a parameter. I don't like the idea of having more options for query
execution.


Yes, I agree and that is indeed what I meant, sorry for the typo. The 
cleaner example would be something like:


  psql --csv -c 'TABLE foo' > foo.csv

With a -c to introduce the command.

--
Fabien.



Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2018-03-07 Thread Matheus de Oliveira
On Mar 3, 2018 at 19:32, "Peter Eisentraut" <
peter.eisentr...@2ndquadrant.com> wrote:

On 2/20/18 10:10, Matheus de Oliveira wrote:
> Besides that, there is a another change in this patch on current ALTER
> CONSTRAINT about deferrability options. Previously, if the user did
> ALTER CONSTRAINT without specifying an option on deferrable or
> initdeferred, it was implied the default options, so this:
>
> ALTER TABLE tbl
> ALTER CONSTRAINT con_name;
>
> Was equivalent to:
>
> ALTER TABLE tbl
> ALTER CONSTRAINT con_name NOT DEFERRABLE INITIALLY IMMEDIATE;

Oh, that seems wrong.  Probably, it shouldn't even accept that syntax
with an empty options list, let alone reset options that are not
mentioned.


Yeah, it felt really weird when I noticed it. And I just noticed while
reading the source.
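
To spell out the surprise with a concrete example (my own sketch of the
behaviour described above, not taken from the patch; table names are made up):

  CREATE TABLE parent (id int PRIMARY KEY);
  CREATE TABLE child (
      id int,
      CONSTRAINT child_fk FOREIGN KEY (id) REFERENCES parent (id)
          DEFERRABLE INITIALLY DEFERRED
  );

  -- reads like a no-op, but currently it silently resets the constraint
  -- to NOT DEFERRABLE INITIALLY IMMEDIATE
  ALTER TABLE child ALTER CONSTRAINT child_fk;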

Can you prepare a separate patch for this issue?


I can do that, no problem. It'll take a while though, I'm on a trip and will
be home around March 20th.

You think this should be applied to all versions that support ALTER
CONSTRAINT, right?

Thanks.

Best regards,


Re: [PATCH] Add support for ON UPDATE/DELETE actions on ALTER CONSTRAINT

2018-03-07 Thread Matheus de Oliveira
On Mar 2, 2018 at 08:15, "Andres Freund"  wrote:

Hi,


On 2018-02-20 12:10:22 -0300, Matheus de Oliveira wrote:
> I attached a patch to add support for changing ON UPDATE/DELETE actions of
> a constraint using ALTER TABLE ... ALTER CONSTRAINT.

This patch has been submitted to the last commitfest for v11 and is not
a trivial patch. As we don't accept such patches this late, it should be
moved to the next fest.  Any arguments against?


Sorry. My bad.

I'm OK with sending this to the next one.

Best regards,


Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Andres Freund
On 2018-03-07 14:48:48 -0300, Alvaro Herrera wrote:
> Oh, I wasn't suggesting to throw away the whole cache at that point;
> only that that is a convenient point to do whatever cleanup we want to do.

But why is that better than doing so continuously?


> What I'm not clear about is exactly what is the cleanup that we want to
> do at that point.  You say it should be based on some configured size;
> Robert says any predefined size breaks [performance for] the case where
> the workload uses size+1, so let's use time instead (evict anything not
> used in more than X seconds?), but keeping in mind that a workload that
> requires X+1 would also break.

We mostly seem to have found that adding a *minimum* size before
starting to evict based on time solves both of our concerns?


> So it seems we've arrived at the
> conclusion that the only possible solution is to let the user tell us
> what time/size to use.  But that sucks, because the user doesn't know
> either (maybe they can measure, but how?), and they don't even know that
> this setting is there to be tweaked; and if there is a performance
> problem, how do they figure whether or not it can be fixed by fooling
> with this parameter?  I mean, maybe it's set to 10 and we suggest "maybe
> 11 works better" but it turns out not to, so "maybe 12 works better"?
> How do you know when to stop increasing it?

I don't think it's that complicated, for the size figure. Having a knob
that controls how much memory a backend uses isn't a new concept, and
can definitely depend on the use case.


> This seems a bit like max_fsm_pages, that is to say, a disaster that was
> only fixed by removing it.

I don't think that's a meaningful comparison. max_fsm_pages had a
persistent effect, couldn't be tuned without restarts, and the
performance dropoffs were much more "cliff"-like.

Greetings,

Andres Freund



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Alvaro Herrera
Hello,

Andres Freund wrote:

> On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote:
> > I wonder if this is just because we refuse to acknowledge the notion of
> > a connection pooler.  If we did, and the pooler told us "here, this
> > session is being given back to us by the application, we'll keep it
> > around until the next app comes along", could we clean the oldest
> > inactive cache entries at that point?  Currently they use DISCARD for
> > that.  Though this does nothing to fix hypothetical cache bloat for
> > pg_dump in bug #14936.
> 
> I'm not seeing how this solves anything?  You don't want to throw all
> caches away, therefore you need a target size.  Then there's also the
> case of the cache being too large in a single "session".

Oh, I wasn't suggesting to throw away the whole cache at that point;
only that that is a convenient point to do whatever cleanup we want to do.
What I'm not clear about is exactly what is the cleanup that we want to
do at that point.  You say it should be based on some configured size;
Robert says any predefined size breaks [performance for] the case where
the workload uses size+1, so let's use time instead (evict anything not
used in more than X seconds?), but keeping in mind that a workload that
requires X+1 would also break.  So it seems we've arrived at the
conclusion that the only possible solution is to let the user tell us
what time/size to use.  But that sucks, because the user doesn't know
either (maybe they can measure, but how?), and they don't even know that
this setting is there to be tweaked; and if there is a performance
problem, how do they figure whether or not it can be fixed by fooling
with this parameter?  I mean, maybe it's set to 10 and we suggest "maybe
11 works better" but it turns out not to, so "maybe 12 works better"?
How do you know when to stop increasing it?

This seems a bit like max_fsm_pages, that is to say, a disaster that was
only fixed by removing it.

Thanks,

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Protect syscache from bloating with negative cache entries

2018-03-07 Thread Andres Freund
Hi,

On 2018-03-07 08:01:38 -0300, Alvaro Herrera wrote:
> I wonder if this is just because we refuse to acknowledge the notion of
> a connection pooler.  If we did, and the pooler told us "here, this
> session is being given back to us by the application, we'll keep it
> around until the next app comes along", could we clean the oldest
> inactive cache entries at that point?  Currently they use DISCARD for
> that.  Though this does nothing to fix hypothetical cache bloat for
> pg_dump in bug #14936.

I'm not seeing how this solves anything?  You don't want to throw all
caches away, therefore you need a target size.  Then there's also the
case of the cache being too large in a single "session".

Greetings,

Andres Freund



Re: GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Andres Freund
Hi,

On 2018-03-07 11:58:51 -0300, Alvaro Herrera wrote:
> > This appears to be a duplicate of 
> > https://commitfest.postgresql.org/17/1466/ - as the other one is older, I'm 
> > closing this one.
> 
> This comment makes no sense from the POV of the mail archives.  I had to
> look at the User-Agent in your email to realize that you wrote it in the
> commitfest app.

Yea, I stopped doing so afterwards...


> 1. impersonating the "From:" header is a bad idea; needs to be fixed much as
>we did with the bugs and doc comments submission forms
> 2. it should have had a header indicating it comes from CF app
> 3. it would be great to include in said header a link to the CF entry
>to which the comment was attached.

Sounds reasonable.

Greetings,

Andres Freund



Re: [HACKERS] GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Alvaro Herrera
I suggest creating a new function GinPredicateLockPage() that checks
whether fast update is enabled for the index.  The current arrangement
looks too repetitive and it seems easy to make a mistake.

Stylistically, please keep #include lines ordered alphabetically, and
cut long lines to below 80 chars.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FOR EACH ROW triggers on partitioned tables

2018-03-07 Thread Alvaro Herrera
Here's another version of this patch.  It is virtually identical to the
previous one, except for a small doc update and whitespace changes.

To recap: when a row-level trigger is created on a partitioned table, it
is marked tginherits; partitions all have their pg_class row modified
with relhastriggers=true.  No clone of the pg_trigger row is created for
the partitions.  Instead, when the relcache entry for the partition is
created, pg_trigger is scanned to look for entries for its ancestors.
So the trigger list for a partition is created by repeatedly scanning
pg_trigger and pg_inherits, until only entries with relhastriggers=f are
found.
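
For the archives, the user-visible behaviour this aims at looks roughly like
the following (an illustrative sketch, not taken from the patch's regression
tests; table and function names are made up):

  CREATE TABLE measurements (city int, reading int) PARTITION BY LIST (city);
  CREATE TABLE measurements_1 PARTITION OF measurements FOR VALUES IN (1);

  CREATE FUNCTION note_insert() RETURNS trigger LANGUAGE plpgsql AS
  $$ BEGIN RAISE NOTICE 'row inserted via %', TG_TABLE_NAME; RETURN NEW; END $$;

  -- a single pg_trigger row, on the partitioned table only
  CREATE TRIGGER measurements_ins AFTER INSERT ON measurements
      FOR EACH ROW EXECUTE PROCEDURE note_insert();

  -- the row is routed to measurements_1 and the trigger fires for it
  INSERT INTO measurements VALUES (1, 42);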

I reserve the right to revise this further, as I'm going to spend a
couple of hours looking at it this afternoon, particularly to see how
concurrent DDL behaves, but I don't see anything obviously wrong with
it.

Robert Haas wrote:

> Elsewhere, we've put a lot of blood, sweat, and tears into making sure
> that we only traverse the inheritance hierarchy from top to bottom.
> Otherwise, we're adding deadlock hazards.  I think it's categorically
> unacceptable to do traversals in the opposite order -- if you do, then
> an UPDATE on a child could deadlock with a LOCK TABLE on the parent.
> That will not win us any awards.

We don't actually open relations or acquire locks in the traversal I was
talking about, though; the only thing we do is scan pg_trigger using
first the partition relid, then seek the ancestor(s) by scanning
pg_inherits and recurse.  We don't acquire locks on the involved
relations, so there should be no danger of deadlocks.  Changes in the
definitions ought to be handled by the cache invalidations that are
sent, although I admit to not having tested this specifically.  I'll do
that later today.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/doc/src/sgml/catalogs.sgml b/doc/src/sgml/catalogs.sgml
index a0e6d7062b..4887878eec 100644
--- a/doc/src/sgml/catalogs.sgml
+++ b/doc/src/sgml/catalogs.sgml
@@ -1873,7 +1873,9 @@ SCRAM-SHA-256$:&l
   
   
True if table has (or once had) triggers; see
-   pg_trigger catalog
+   pg_trigger catalog.
+   If this is a partition, triggers on its partitioned ancestors are also
+   considered
   
  
 
@@ -6988,6 +6990,13 @@ SCRAM-SHA-256$:&l
  
 
  
+  tginherits
+  bool
+  
+  True if trigger applies to children relations too
+ 
+
+ 
   tgnargs
   int2
   
diff --git a/doc/src/sgml/ref/create_trigger.sgml 
b/doc/src/sgml/ref/create_trigger.sgml
index 3d6b9f033c..901264c6d2 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -504,7 +504,9 @@ UPDATE OF column_name1 [, 
column_name2REFERENCING clause, then before and after
images of rows are visible from all affected partitions or child tables.
diff --git a/src/backend/bootstrap/bootparse.y 
b/src/backend/bootstrap/bootparse.y
index ed7a55596f..7ad0126df5 100644
--- a/src/backend/bootstrap/bootparse.y
+++ b/src/backend/bootstrap/bootparse.y
@@ -230,6 +230,7 @@ Boot_CreateStmt:

   RELPERSISTENCE_PERMANENT,

   shared_relation,

   mapped_relation,
+   
   false,

   true);
elog(DEBUG4, "bootstrap 
relation created");
}
@@ -252,6 +253,7 @@ Boot_CreateStmt:

  mapped_relation,

  true,

  0,
+   
  false,

  ONCOMMIT_NOOP,

  (Datum) 0,

  false,
diff --git a/src/backend/catalog/heap.c b/src/backend/catalog/heap.c
index cf36ce4add..815f371ac2 100644
--- a/src/backend/catalog/heap.c
+++ b/src/backend/catalog/heap.c
@@ -257,6 +257,7 @@ heap_create(const c

Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 17:55, Stephen Frost wrote:
> > Greetings Petr, all,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> On 07/03/18 13:14, Stephen Frost wrote:
> >>> * Noah Misch (n...@leadboat.com) wrote:
>  On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> > * Tom Lane (t...@sss.pgh.pa.us) wrote:
> >> I wonder whether it'd be sensible for CREATE USER --- or at least the
> >> createuser script --- to automatically make a matching schema.  Or we
> >> could just recommend that DBAs do so.  Either way, we'd be pushing 
> >> people
> >> towards the design where "$user" does exist for most/all users.  Our 
> >> docs
> >> comment (section 5.8.7) that "the concepts of schema and user are 
> >> nearly
> >> equivalent in a database system that implements only the basic schema
> >> support specified in the standard", so the idea of automatically making
> >> a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
> >> put my flameproof long johns ...)
> >
> > You are not the first to think of this in recent days, and I'm hopeful
> > to see others comment in support of this idea.  For my 2c, I'd suggest
> > that what we actually do is have a new role attribute which is "when
> > this user connects to a database, if they don't have a schema named
> > after their role, then create one."  Creating the role at CREATE ROLE
> > time would only work for the current database, after all (barring some
> > other magic that allows us to create schemas in all current and future
> > databases...).
> 
>  I like the idea of getting more SQL-compatible, if this presents a 
>  distinct
>  opportunity to do so.  I do think it would be too weird to create the 
>  schema
>  in one database only.  Creating it on demand might work.  What would be 
>  the
>  procedure, if any, for database owners who want to deny object creation 
>  in
>  their databases?
> >>>
> >>> My suggestion was that this would be a role attribute.  If an
> >>> administrator doesn't wish for that role to have a schema created
> >>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> >>> we name it) role attribute to false.
> >>>
> >> Yeah I think role attribute makes sense, it's why I suggested something
> >> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
> >> point the schema to public for example) and also the fact that $user
> >> schema which is first in search_path might or might not exist.
> > 
> > What I dislike about this proposal is that it seems to conflate two
> > things- if the schema will be created for the user automatically or not,
> > and what the search_path setting is.
> 
> Well, what $user in search_path resolves to rather than what search_path is.

Alright, that makes a bit more sense to me.  I had thought you were
suggesting we would just combine these two pieces to make up the "real"
search path, which I didn't like.

Having it replace what $user is in the search_path would be a bit
confusing, I think.  Perhaps having a new '$default' would be alright
though I'm still having a bit of trouble imagining the use-case and it
seems like we'd probably still keep the "will this schema be created
automatically or not" separate from this new search path variable.

> > Those are two different things and
> > I don't think we should mix them.
> 
> I guess I am missing the point of the schema creation for user then if
> it's not also automatically the default schema for that user.

With our default search path being $user, public, it would be...

Thanks!

Stephen



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 17:55, Stephen Frost wrote:
> Greetings Petr, all,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> On 07/03/18 13:14, Stephen Frost wrote:
>>> * Noah Misch (n...@leadboat.com) wrote:
 On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>> I wonder whether it'd be sensible for CREATE USER --- or at least the
>> createuser script --- to automatically make a matching schema.  Or we
>> could just recommend that DBAs do so.  Either way, we'd be pushing people
>> towards the design where "$user" does exist for most/all users.  Our docs
>> comment (section 5.8.7) that "the concepts of schema and user are nearly
>> equivalent in a database system that implements only the basic schema
>> support specified in the standard", so the idea of automatically making
>> a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
>> put my flameproof long johns ...)
>
> You are not the first to think of this in recent days, and I'm hopeful
> to see others comment in support of this idea.  For my 2c, I'd suggest
> that what we actually do is have a new role attribute which is "when
> this user connects to a database, if they don't have a schema named
> after their role, then create one."  Creating the role at CREATE ROLE
> time would only work for the current database, after all (barring some
> other magic that allows us to create schemas in all current and future
> databases...).

 I like the idea of getting more SQL-compatible, if this presents a distinct
 opportunity to do so.  I do think it would be too weird to create the 
 schema
 in one database only.  Creating it on demand might work.  What would be the
 procedure, if any, for database owners who want to deny object creation in
 their databases?
>>>
>>> My suggestion was that this would be a role attribute.  If an
>>> administrator doesn't wish for that role to have a schema created
>>> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
>>> we name it) role attribute to false.
>>>
>> Yeah I think role attribute makes sense, it's why I suggested something
>> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
>> point the schema to public for example) and also the fact that $user
>> schema which is first in search_path might or might not exist.
> 
> What I dislike about this proposal is that it seems to conflate two
> things- if the schema will be created for the user automatically or not,
> and what the search_path setting is.

Well, what $user in search_path resolves to rather than what search_path is.

> Those are two different things and
> I don't think we should mix them.

I guess I am missing the point of the schema creation for user then if
it's not also automatically the default schema for that user.

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 13:14, Stephen Frost wrote:
> > * Noah Misch (n...@leadboat.com) wrote:
> >> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
> >>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
>  I wonder whether it'd be sensible for CREATE USER --- or at least the
>  createuser script --- to automatically make a matching schema.  Or we
>  could just recommend that DBAs do so.  Either way, we'd be pushing people
>  towards the design where "$user" does exist for most/all users.  Our docs
>  comment (section 5.8.7) that "the concepts of schema and user are nearly
>  equivalent in a database system that implements only the basic schema
>  support specified in the standard", so the idea of automatically making
>  a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
>  put my flameproof long johns ...)
> >>>
> >>> You are not the first to think of this in recent days, and I'm hopeful
> >>> to see others comment in support of this idea.  For my 2c, I'd suggest
> >>> that what we actually do is have a new role attribute which is "when
> >>> this user connects to a database, if they don't have a schema named
> >>> after their role, then create one."  Creating the role at CREATE ROLE
> >>> time would only work for the current database, after all (barring some
> >>> other magic that allows us to create schemas in all current and future
> >>> databases...).
> >>
> >> I like the idea of getting more SQL-compatible, if this presents a distinct
> >> opportunity to do so.  I do think it would be too weird to create the 
> >> schema
> >> in one database only.  Creating it on demand might work.  What would be the
> >> procedure, if any, for database owners who want to deny object creation in
> >> their databases?
> > 
> > My suggestion was that this would be a role attribute.  If an
> > administrator doesn't wish for that role to have a schema created
> > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> > we name it) role attribute to false.
> > 
> Yeah I think role attribute makes sense, it's why I suggested something
> like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
> point the schema to public for example) and also the fact that $user
> schema which is first in search_path might or might not exist.

What I dislike about this proposal is that it seems to conflate two
things- if the schema will be created for the user automatically or not,
and what the search_path setting is.  Those are two different things and
I don't think we should mix them.

> Question would be what happens if the schema is then explicitly dropped (in
> either case).

I'm not sure that I see an issue with that- if it's dropped then it gets
recreated when that user logs back in.  The systems I'm aware of, as
best as I can recall, didn't have any particular check or explicit
additional behavior for such a case.

Thanks!

Stephen



Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 16:26, Stephen Frost wrote:
> > Greeting Petr, all,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> On 07/03/18 13:18, Stephen Frost wrote:
> >>> Greetings,
> >>>
> >>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>  Certain "market leader" database behaves this way as well. I just hope
>  we won't go as far as them and also create users for schemas (so that
>  the analogy of user=schema would be complete and working both ways).
>  Because that's one of the main reasons their users depend on packages so
>  much, there is no other way to create a namespace without having to deal
>  with another user which needs to be secured.
> >>>
> >>> I agree that we do *not* want to force role creation on schema creation.
> >>>
>  One thing we could do to limit impact of any of this is having
>  DEFAULT_SCHEMA option for roles which would then be the first one in the
>  search_path (it could default to the role name), that way making public
>  schema work again for everybody would be just about tweaking the roles a
>  bit which can be easily scripted.
> >>>
> >>> I don't entirely get what you're suggesting here considering we already
> >>> have $user, and it is the first in the search_path..?
> >>>
> >>
> >> What I am suggesting is that we add option to set user's default schema
> >> to something other than user name so that if people don't want the
> >> schema with the name of the user auto-created, it won't be.
> > 
> > We have ALTER USER joe SET search_path already though..?  And ALTER
> > DATABASE, and in postgresql.conf?  What are we missing?
> 
> That will not change the fact that we have created schema joe for that
> user though. While something like CREATE USER joe DEFAULT_SCHEMA foobar
> would.
> 
> My point is that I don't mind if we create schemas for users by default,
> but I want simple way to opt out.

Oh, yes, we would definitely need an opt-out mechanism.  It's unclear to
me, though, what adding a 'default schema' role option would do that's
different from setting the search_path for a user.  I certainly wouldn't
expect it to create a new schema.

> > opportunity to do so.  I do think it would be too weird to create the 
> > schema
> > in one database only.  Creating it on demand might work.  What would be 
> > the
> > procedure, if any, for database owners who want to deny object creation 
> > in
> > their databases?
> 
>  Well, REVOKE CREATE ON DATABASE already exists.
> >>>
> >>> That really isn't the same..  In this approach, regular roles are *not*
> >>> given the CREATE right on the database, the system would just create the
> >>> schema for them on login automatically if the role attribute says to do
> >>> so.
> >>
> >> What's the point of creating schema for them if they don't have CREATE
> >> privilege?
> > 
> > They would own the schema and therefore have CREATE and USAGE rights on
> > the schema itself.  Creating objects checks for schema rights, it
> > doesn't check for database rights- that's only if you're creating
> > schemas.
> > 
> 
> Yes, but should the schema for them be created at all if they don't have
> CREATE privilege on the database? If yes then I have the same question as
> Noah: how does a dba prevent object creation in their databases?

Yes, the schema would be created regardless of the rights of the user on
the database, because the admin set the flag on the role saying 'create
a schema for this user when they log in.'

If we think there is a use-case for saying "this user should only have
schemas in these databases, not all databases" then I could see having
the role attribute be a list of databases or "all", instead.  In the
end, I do think this is something which is controlled at the role level
and not something an individual database owner could override or
prevent, though perhaps there is some room for discussion there.

What I don't want is for this feature to *depend* on the users having
CREATE rights on the database, as that would allow them to create other
schemas (perhaps even one which is named the same as a likely new user
whose account hasn't been created yet or they haven't logged in yet...).
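
To make the distinction concrete, this is plain SQL that already behaves
this way today (nothing here depends on the proposed feature; the role and
schema names are made up):

  -- as a superuser / database owner
  CREATE ROLE joe LOGIN;
  CREATE SCHEMA joe AUTHORIZATION joe;  -- what the login-time logic would do
  -- note: joe has not been granted CREATE on the database

  -- as joe
  SET ROLE joe;
  CREATE TABLE joe.t (id int);  -- allowed: joe owns schema joe
  CREATE SCHEMA scratch;        -- fails: permission denied for database
  RESET ROLE;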

Thanks!

Stephen



Re: Change RangeVarGetRelidExtended() to take flags argument?

2018-03-07 Thread Bossart, Nathan
On 3/5/18, 7:08 PM, "Andres Freund"  wrote:
> On 2018-03-05 19:57:44 -0500, Tom Lane wrote:
>> Andres Freund  writes:
>>> One wrinkle in that plan is that it'd not be trivial to discern whether
>>> a lock couldn't be acquired or whether the object vanished.  I don't
>>> really have good idea how to tackle that yet.
>> Do we really care which case applies?
>
> I think there might be cases where we do. As expand_vacuum_rel()
> wouldn't use missing_ok = true, it'd not be an issue for this specific
> callsite though.

I think it might be enough to simply note the ambiguity of returning
InvalidOid when skip-locked and missing-ok are both specified.  Even
if RangeVarGetRelidExtended() did return whether skip-locked or
missing-ok applied, such a caller would likely not be able to trust
it anyway, as no lock would be held.

Nathan



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 13:14, Stephen Frost wrote:
> Greetings,
> 
> * Noah Misch (n...@leadboat.com) wrote:
>> On Tue, Mar 06, 2018 at 09:28:21PM -0500, Stephen Frost wrote:
>>> * Tom Lane (t...@sss.pgh.pa.us) wrote:
 I wonder whether it'd be sensible for CREATE USER --- or at least the
 createuser script --- to automatically make a matching schema.  Or we
 could just recommend that DBAs do so.  Either way, we'd be pushing people
 towards the design where "$user" does exist for most/all users.  Our docs
 comment (section 5.8.7) that "the concepts of schema and user are nearly
 equivalent in a database system that implements only the basic schema
 support specified in the standard", so the idea of automatically making
 a schema per user doesn't seem ridiculous on its face.  (Now, where'd I
 put my flameproof long johns ...)
>>>
>>> You are not the first to think of this in recent days, and I'm hopeful
>>> to see others comment in support of this idea.  For my 2c, I'd suggest
>>> that what we actually do is have a new role attribute which is "when
>>> this user connects to a database, if they don't have a schema named
>>> after their role, then create one."  Creating the role at CREATE ROLE
>>> time would only work for the current database, after all (barring some
>>> other magic that allows us to create schemas in all current and future
>>> databases...).
>>
>> I like the idea of getting more SQL-compatible, if this presents a distinct
>> opportunity to do so.  I do think it would be too weird to create the schema
>> in one database only.  Creating it on demand might work.  What would be the
>> procedure, if any, for database owners who want to deny object creation in
>> their databases?
> 
> My suggestion was that this would be a role attribute.  If an
> administrator doesn't wish for that role to have a schema created
> on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> we name it) role attribute to false.
> 
Yeah I think role attribute makes sense, it's why I suggested something
like DEFAULT_SCHEMA, that seems to address both schema creation (dba can
point the schema to public for example) and also the fact that $user
schema which is first in search_path might or might not exist.

Question would be what happens if the schema is then explicitly dropped (in
either case).

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 16:26, Stephen Frost wrote:
> Greeting Petr, all,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> On 07/03/18 13:18, Stephen Frost wrote:
>>> Greetings,
>>>
>>> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
 Certain "market leader" database behaves this way as well. I just hope
 we won't go as far as them and also create users for schemas (so that
 the analogy of user=schema would be complete and working both ways).
 Because that's one of the main reasons their users depend on packages so
 much, there is no other way to create a namespace without having to deal
 with another user which needs to be secured.
>>>
>>> I agree that we do *not* want to force role creation on schema creation.
>>>
 One thing we could do to limit impact of any of this is having
 DEFAULT_SCHEMA option for roles which would then be the first one in the
 search_path (it could default to the role name), that way making public
 schema work again for everybody would be just about tweaking the roles a
 bit which can be easily scripted.
>>>
>>> I don't entirely get what you're suggesting here considering we already
>>> have $user, and it is the first in the search_path..?
>>>
>>
>> What I am suggesting is that we add option to set user's default schema
>> to something other than user name so that if people don't want the
>> schema with the name of the user auto-created, it won't be.
> 
> We have ALTER USER joe SET search_path already though..?  And ALTER
> DATABASE, and in postgresql.conf?  What are we missing?

That will not change the fact that we have created schema joe for that
user though. While something like CREATE USER joe DEFAULT_SCHEMA foobar
would.

My point is that I don't mind if we create schemas for users by default,
but I want a simple way to opt out.

> 
> opportunity to do so.  I do think it would be too weird to create the 
> schema
> in one database only.  Creating it on demand might work.  What would be 
> the
> procedure, if any, for database owners who want to deny object creation in
> their databases?

 Well, REVOKE CREATE ON DATABASE already exists.
>>>
>>> That really isn't the same..  In this approach, regular roles are *not*
>>> given the CREATE right on the database, the system would just create the
>>> schema for them on login automatically if the role attribute says to do
>>> so.
>>
>> What's the point of creating schema for them if they don't have CREATE
>> privilege?
> 
> They would own the schema and therefore have CREATE and USAGE rights on
> the schema itself.  Creating objects checks for schema rights, it
> doesn't check for database rights- that's only if you're creating
> schemas.
> 

Yes, but should the schema for them be created at all if they don't have
CREATE privilege on the database? If yes then I have the same question as
Noah: how does a dba prevent object creation in their databases?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [GSOC 18] Discussion on the datatable

2018-03-07 Thread Mark Wong
Hi Hongyuan,

On Tue, Mar 06, 2018 at 01:36:23AM +0800, Hongyuan Ma wrote:
> Hi Mark,
> In the past few days I have read some code in pgperffarm.git repository.I 
> look forward to discussing the project in detail with you and gradually 
> defining the datasheet structure and refining the requirements. Here are some 
> of my ideas, if there are any errors or deficiencies, please feel free to 
> correct me.
> 
> 
> To create a table: pg_perf_cate
> Overview:
> pg_perf_cate table is used to store performance test project categories that 
> support multi-level categories.
> 
> 
> Description:
> In line 12 of the "pgperffarm/client/benchmarks/runner.py" file there 
> is a line like this:
> 
> 
> ''
> 'manages runs of all the benchmarks, including cluster restarts etc.'
> ''
> 
> 
> In my imagination, there may be more performance test items than build 
> tests. Based on the above comments, I guess there might be, for example, 
> three major categories: "single instance performance tests", "cluster 
> performance tests", and "other performance tests". Each major category also 
> contains its own test sub-categories, such as addition tests and deletion 
> tests and so on. In the pg_perf_cate table, the cate_pid field indicates the 
> parent category of the current test category. If the cate_pid field of a row 
> has a value of 0, the row represents the top-level category.
> 
> 
> Related Functions:
>  - Maybe in the navigation bar we can create a category menu to help users 
> quickly find the test items they are interested in (similar to the Amazon Mall 
> product categories). The cate_order field is used to manage the order of the 
> categories in the current level for easy front-end menu display.
>  - The admin back-end needs a page which can add or modify categories.
> -
> To create a table: pg_perf_test
> Overview:
> The pg_perf_test table is used to store specific test items, including the 
> test item number (test_id), the name of the test item (test_name), the ID of 
> the sub-category (cate_id), the description of the test item (test_desc, to be 
> discussed), and the person ID (user_id).
> 
> 
> Description:
> In line 15 of the "pgperffarm/client/benchmarks/pgbench.py" file, I see 
> a note like this:
> ''
> # TODO allow running custom scripts, not just the default
> ''
> Since we want to allow users to run custom test scripts and upload them, I 
> think it is necessary to tell others how to run the test again. So I want to 
> add a test_desc field that will store the details of this test and tell 
> the user how to run it. But I am not too sure about the storage format 
> for the test details; perhaps rich text or Markdown 
> is a suitable choice.
> When this test item is added by the administrator, the user_id field has a 
> value of 0. Otherwise, this field corresponds to the user_id field in the 
> user table. For this field, I prefer not to use foreign keys.
> 
> 
> Related Functions:
>  - At the front end, each test has its own detail page, on which the 
> test-related content is presented and the test results are listed.
>  - The admin back-end needs a page to manage test items.
> -
> To create a table: pg_perf_test_result
> 
> 
> Overview:
> The pg_perf_test_result table is used to store test results, including at 
> least the result ID (result_id), user ID (user_id, I prefer not to create a 
> user-test result association table), test item ID (test_id), test branch 
> number (branch_id), system configuration (os_config), pg 
> configuration (pg_config), test result details (result_desc), test 
> time (add_time) and other fields.
> Confusion:
> I think compared to other tables, pg_perf_test_result table may be a 
> relatively complex one.
> This piece of code can be seen around line 110 of the 
> "pgperffarm/client/benchmarks/runner.py" file:
> ''
> r['meta'] = {
>     'benchmark': config['benchmark'],
>     'date': strftime("%Y-%m-%d %H:%M:%S.00+00", gmtime()),
>     'name': config_name,
>     'uname': uname,
> }
> 
> with open('%s/results.json' % self._output, 'w') as f:
>     f.write(json.dumps(r, indent=4))
> ''
> Could you please provide a sample results.json file so that I can better 
> understand what information is contained in the uploaded data and design the 
> tables based on it.

Don't let this distract you too much from finishing your current term.
There will be plenty of time to hammer out the schema.

Here's a brief description of the data that is summarized in a json
object:

The idea is that the json do

Re: PATCH: Configurable file mode mask

2018-03-07 Thread David Steele
On 3/6/18 10:04 PM, Michael Paquier wrote:
> On Tue, Mar 06, 2018 at 01:32:49PM -0500, David Steele wrote:
>> On 3/5/18 10:46 PM, Michael Paquier wrote:
> 
>>> Those two are separate issues.  Could you begin a new thread on the
>>> matter?  This will attract more attention.
>>
>> OK, I'll move it back for now, and make a note to raise the position as
>> a separate issue.  I'd rather not do it in this fest, though.
> 
> Seems like you forgot to re-add the chmod calls in initdb.c.

Hmmm, I thought we were talking about moving the position of umask().

I don't think the chmod() calls are needed because umask() is being set.
 The tests show that the config files have the proper permissions after
initdb, so this just looks like redundant code to me.

But, I'm not going to argue if you think they should go back.  It would
make the patch less noisy.

>>> -   if (chmod(location, S_IRWXU) != 0)
>>> +   current_umask = umask(0);
>>> +   umask(current_umask);
>>> [...]
>>> -   if (chmod(pg_data, S_IRWXU) != 0)
>>> +   if (chmod(pg_data, PG_DIR_MODE_DEFAULT & ~file_mode_mask) != 0)
>>> Such modifications should be part of patch 3, not 2, where the group
>>> features really apply.
>>
>> The goal of patch 2 is to refactor the way permissions are set by
>> replacing locally hard-coded values with centralized values, so I think
>> this makes sense here.
> 
> Hm.  I would have left that out in this first version, now I am fine to
> defer that to a committer's opinion as well.

OK - I really do think it belongs here.  A committer may not agree, of
course.

>>> my $test_mode = shift;
>>>
>>> +   umask(0077);
>>
>> Added. (Ensure that all files are created with default data dir
>> permissions).
> 
> +   # Ensure that all files are created with default data dir permissions
> +   umask(0077);
> I have stared at this comment two minutes to finally understand that
> this refers to the extra files which are part of this test.  It may be
> better to be a bit more precise here.  I thought first that this
> referred as well to setup_cluster calls...

Updated to, "Ensure that all files created in this module for testing
are set with default data dir permissions."

> +# Ensure all permissions in the pg_data directory are
> correct. Permissions
> +# should be dir = 0700, file = 0600.
> +sub check_pg_data_perm
> +{
> check_permission_recursive() would be a more adapted name.  Stuff in
> TestLib.pm is not necessarily linked to data folders.

When we add group permissions we need to have special logic around
postmaster.pid.  This should be 0600 even if the rest of the files are 0640.

I can certainly remove that special logic in 02 and make this function
more generic, but then I think we would still have to add
check_pg_data_perm() in patch 03.

Another way to do this would be to make the function generic and
stipulate that the postmaster must be shut down before running the
function.  We could check postmaster.pid permissions as a separate test.

What do you think?

> sub clean_rewind_test
> {
> +   ok (check_pg_data_perm($node_master->data_dir(), 0));
> +
> $node_master->teardown_node  if defined $node_master;
> I have also a pending patch for pg_rewind which adds read-only files in
> the data folders of a new test, so this would cause this part to blow
> up.  Testing that for all the test sets does not bring additional value
> as well, and doing it at cleanup phase is also confusing.  So could you
> move that check into only one test's script?  You could remove the umask
> call in 003_extrafiles.pl as well this way, and reduce the patch diffs.

I think I would rather have a way to skip the permission test than
disable it for most of the tests.  pg_rewind does more writes into
PGDATA than anything other than a backend.  Good coverage seems like a plus.

> + if ($file_mode != 0600)
> + {
> + print(*STDERR, "$File::Find::name mode must be 0600\n");
> +
> + $result = 0;
> + return
> + }
> 0600 should be replaced by $expected_file_perm, and isn't a ';' missing
> for each return "clause" (how does this even work..)?

Well, 0600 is a special case -- see above.  As for the missing
semi-colon, well that's Perl for you.  Fixed.

> Patch 2 is getting close for a committer lookup I think, still need to
> look at patch 3 in details..  And from patch 3:
>   # Expected permission
> - my $expected_file_perm = 0600;
> - my $expected_dir_perm = 0700;
> + my $expected_file_perm = $allow_group ? 0640 : 0600;
> + my $expected_dir_perm = $allow_group ? 0750 : 0700;
> Or $expected_file_perm and $expected_dir_perm could just be used as
> arguments.

This gets back to the check_pg_data_perm() discussion above.

I'll hold off on another set of patches until I hear back from you.
There were only trivial changes as noted above.

Thanks,
-- 
-David
da...@pgmasters.net





Re: [PATCH][PROPOSAL] Add enum relation option type

2018-03-07 Thread Alvaro Herrera
Nikolay Shaplov wrote:

> Actually it was me who lost it.

Yeah, I realized today when I saw your reply to Nikita.  I didn't
realize it was him submitting a new version of the patch.

> The code with the Oxford comma would be a bit more complicated: we should
> put such a comma when we have 3+ items and not put it when we have 2.
> 
> Is it worth it?
> 
> As I've read, use of the Oxford comma is not mandatory; it is used to avoid
> ambiguity.
> "XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)".
> The Oxford comma is used to make sure that YYY and ZZZ are separate items of
> the list, not an expression inside one item.
> 
> But here we hardly have such ambiguity.

Gracious goodness -- the stuff these Brits come up with!

> So I'll ask again: do you really think it is worth it?

I'm not qualified to answer this question.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PATCH] get rid of StdRdOptions, use individual binary reloptions representation for each relation kind instead

2018-03-07 Thread Nikolay Shaplov
In a message dated 2 March 2018 11:27:49, Andres Freund wrote:

> > Since I get a really big patch as a result, it was decided to commit it in
> > parts.
> 
> I get that, but I strongly suggest not creating 10 loosely related
> threads, but keeping it as a patch series in one thread. It's really
> hard to follow for people not continually paying attention otherwise.

Oops. Sorry, I thought I should do it the other way round and create a new
thread for each new patch, and I tried to post links to the other threads for
connectivity wherever I could.

Will do it as you say from now on. 

But I am still confused about what I should do if I am submitting two patches
from the patch series simultaneously...

-- 
Do code for fun.



Re: Limit global default function execution privileges

2018-03-07 Thread Stephen Frost
Greetings,

* David G. Johnston (david.g.johns...@gmail.com) wrote:
> Since we are discussing locking down our defaults, is revoking the global
> function execution privilege granted to PUBLIC - instead limiting it to
> just the pg_catalog schema - on the table?
> 
> I'm not sure how strongly I feel toward the proposal but it does come up on
> these lists; and the fact that it doesn't distinguish between security
> definer and security invoker is a trap for the unaware.

I wouldn't limit it to the pg_catalog schema, I'd just explicitly mark
the functions in pg_catalog which should have EXECUTE rights available
to PUBLIC.

I'm afraid this would cause a lot of work for people who use a lot of
pl/pgsql, but it might be a good thing in the end.  Environments could
configure ALTER DEFAULT PRIVILEGES to automatically install the GRANT
back if they wanted it, and pg_dump would just pull through whatever the
privileges actually were on old systems into the new systems.

This definitely comes up regularly when introducing new people to
PostgreSQL.


Thanks!

Stephen



Re: BUG #14941: Vacuum crashes

2018-03-07 Thread Bossart, Nathan
On 3/6/18, 11:04 PM, "Michael Paquier"  wrote:
> +   if (!(options & VACOPT_SKIP_LOCKED))
> +   relid = RangeVarGetRelid(vrel->relation, AccessShareLock,
> false);
> +   else
> +   {
> +   relid = RangeVarGetRelid(vrel->relation, NoLock, false);
> Yeah, I agree with Andres that getting all this logic done in
> RangeVarGetRelidExtended would be cleaner.  Let's see where the other
> thread leads us to:
> https://www.postgresql.org/message-id/20180306005349.b65whmvj7z6hbe2y%40alap3.anarazel.de

I had missed that thread.  Thanks for pointing it out.

> +   /*
> +* We must downcase the statement type for log message
> consistency
> +* between expand_vacuum_rel(), vacuum_rel(), and analyze_rel().
> +*/
> +   stmttype_lower = asc_tolower(stmttype, strlen(stmttype));
> This blows up on multi-byte characters and may report incorrect relation
> name if double quotes are used, no?

At the moment, stmttype is either "VACUUM" or "ANALYZE", but I suppose
there still might be multi-byte risk in translations.

> +   /*
> +* Since autovacuum workers supply OIDs when calling vacuum(), no
> +* autovacuum worker should reach this code, and we can make
> +* assumptions about the logging levels we should use in the
> checks
> +* below.
> +*/
> +   Assert(!IsAutoVacuumWorkerProcess());
> This is a good idea, should be a separate patch as other folks tend to
> be confused with relid handling in expand_vacuum_rel().

Will do.

> +  Specifies that VACUUM should not wait for any
> +  conflicting locks to be released: if a relation cannot be locked
> +  immediately without waiting, the relation is skipped
> Should this mention as well that no errors are raised, allowing the
> process to move on with the next relations?

IMO that is already covered by saying the relation is skipped,
although I'm not against explicitly stating it, too.  Perhaps we could
outline the logging behavior as well.

Nathan



Limit global default function execution privileges

2018-03-07 Thread David G. Johnston
Since we are discussing locking down our defaults, is revoking the global
function execution privilege granted to PUBLIC - instead limiting it to
just the pg_catalog schema - on the table?

I'm not sure how strongly I feel toward the proposal but it does come up on
these lists; and the fact that it doesn't distinguish between security
definer and security invoker is a trap for the unaware.

David J.


Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings Petr, all,

* Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> On 07/03/18 13:18, Stephen Frost wrote:
> > Greetings,
> > 
> > * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
> >> Certain "market leader" database behaves this way as well. I just hope
> >> we won't go as far as them and also create users for schemas (so that
> >> the analogy of user=schema would be complete and working both ways).
> >> Because that's one of the main reasons their users depend on packages so
> >> much: there is no other way to create a namespace without having to deal
> >> with another user which needs to be secured.
> > 
> > I agree that we do *not* want to force role creation on schema creation.
> > 
> >> One thing we could do to limit impact of any of this is having
> >> DEFAULT_SCHEMA option for roles which would then be the first one in the
> >> search_path (it could default to the role name), that way making public
> >> schema work again for everybody would be just about tweaking the roles a
> >> bit which can be easily scripted.
> > 
> > I don't entirely get what you're suggesting here considering we already
> > have $user, and it is the first in the search_path..?
> > 
> 
> What I am suggesting is that we add an option to set a user's default schema
> to something other than the user name, so that if people don't want the
> schema with the name of the user auto-created, it won't be.

We have ALTER USER joe SET search_path already though..?  And ALTER
DATABASE, and in postgresql.conf?  What are we missing?

> >>> opportunity to do so.  I do think it would be too weird to create the 
> >>> schema
> >>> in one database only.  Creating it on demand might work.  What would be 
> >>> the
> >>> procedure, if any, for database owners who want to deny object creation in
> >>> their databases?
> >>
> >> Well, REVOKE CREATE ON DATABASE already exists.
> > 
> > That really isn't the same..  In this approach, regular roles are *not*
> > given the CREATE right on the database, the system would just create the
> > schema for them on login automatically if the role attribute says to do
> > so.
> 
> What's the point of creating schema for them if they don't have CREATE
> privilege?

They would own the schema and therefore have CREATE and USAGE rights on
the schema itself.  Creating objects checks for schema rights; it
doesn't check for database rights - that's only checked when you're
creating schemas.

Thanks!

Stephen



Re: public schema default ACL

2018-03-07 Thread Petr Jelinek
On 07/03/18 13:18, Stephen Frost wrote:
> Greetings,
> 
> * Petr Jelinek (petr.jeli...@2ndquadrant.com) wrote:
>> Certain "market leader" database behaves this way as well. I just hope
>> we won't go as far as them and also create users for schemas (so that
>> the analogy of user=schema would be complete and working both ways).
>> Because that's one of the main reasons their users depend on packages so
>> much: there is no other way to create a namespace without having to deal
>> with another user which needs to be secured.
> 
> I agree that we do *not* want to force role creation on schema creation.
> 
>> One thing we could do to limit impact of any of this is having
>> DEFAULT_SCHEMA option for roles which would then be the first one in the
>> search_path (it could default to the role name), that way making public
>> schema work again for everybody would be just about tweaking the roles a
>> bit which can be easily scripted.
> 
> I don't entirely get what you're suggesting here considering we already
> have $user, and it is the first in the search_path..?
> 

What I am suggesting is that we add an option to set a user's default schema
to something other than the user name, so that if people don't want the
schema with the name of the user auto-created, it won't be.

> 
>>> opportunity to do so.  I do think it would be too weird to create the schema
>>> in one database only.  Creating it on demand might work.  What would be the
>>> procedure, if any, for database owners who want to deny object creation in
>>> their databases?
>>
>> Well, REVOKE CREATE ON DATABASE already exists.
> 
> That really isn't the same..  In this approach, regular roles are *not*
> given the CREATE right on the database, the system would just create the
> schema for them on login automatically if the role attribute says to do
> so.

What's the point of creating schema for them if they don't have CREATE
privilege?

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: [PATCH][PROPOSAL] Add enum relation option type

2018-03-07 Thread Nikolay Shaplov
In a message dated 1 March 2018 14:47:35, Alvaro Herrera wrote:

> I see you lost the Oxford comma:
> 
> -DETAIL:  Valid values are "on", "off", and "auto".
> +DETAIL:  Valid values are "auto", "on" and "off".
> 
> Please put these back.
Actually it was me who lost it. The code with the Oxford comma would be a
bit more complicated: we should put such a comma when we have 3+ items and
not put it when we have 2.

Is it worth it?

As I've read, use of the Oxford comma is not mandatory; it is used to avoid
ambiguity.
"XXX, YYY and ZZZ" can be read as "XXX, YYY, ZZZ" or as "XXX, (YYY and ZZZ)".
The Oxford comma is used to make sure that YYY and ZZZ are separate items of
the list, not an expression inside one item.

But here we hardly have such ambiguity.

So I'll ask again: do you really think it is worth it?


-- 
Do code for fun.



Re: Implementing SQL ASSERTION

2018-03-07 Thread David Fetter
On Mon, Jan 15, 2018 at 09:14:02PM +, Joe Wildish wrote:
> Hi David,
> 
> > On 15 Jan 2018, at 16:35, David Fetter  wrote:
> > 
> > It sounds reasonable enough that I'd like to make a couple of Modest
> > Proposals™, to wit:
> > 
> > - We follow the SQL standard and make SERIALIZABLE the default
> >  transaction isolation level, and
> > 
> > - We disallow writes at isolation levels other than SERIALIZABLE when
> >  any ASSERTION could be in play.
> 
> Certainly it would be easy to put a test into the assertion check
> function to require the isolation level be serialisable. I didn’t
> realise that that was also the default level as per the standard.
> That need not necessarily be changed, of course; it would be obvious
> to the user that it was a requirement as the creation of an
> assertion would fail without it, as would any subsequent attempts to
> modify the involved tables.

This patch no longer applies.  Any chance of a rebase?

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: [PATCH][PROPOSAL] Add enum relation option type

2018-03-07 Thread Nikolay Shaplov
In a message dated 1 March 2018 19:11:05, Nikita Glukhov wrote:
> Hi.
>
> I have refactored the patch by introducing a new struct relop_enum_element to
> make it possible to use existing C-enum values in an option's definition.  So,
> the additional enum GIST_OPTION_BUFFERING_XXX was removed.

Hi! I've imported your relopt_enum_element solution, since Alvaro liked it,
and I like it too. But I called it relopt_enum_element_definition, as it is not
an element itself, but a description of the possibilities.

But I do not want to import the rest of it.

> #define RELOPT_ENUM_DEFAULT ((const char *) -1)   /* pseudo-name for default value */

I've discussed this solution with my C-experienced friends, and each of them
said that it will work, but that it is better not to use ((const char *) -1):
it could stop working without any warning, because it is not a standard C
solution and a newer C compiler could stop accepting such a thing without
further notice.

I would avoid using such a thing if possible.

> Also default option value should be placed now in the first element of
> allowed_values[].  This helps not to expose default values definitions (like
> GIST_BUFFERING_AUTO defined in gistbuild.c).

For all other reloption types, the default value is part of the relopt_*
structure.  I see no reason to do otherwise for enum.

As for exposing GIST_BUFFERING_AUTO: we do the same for the default fillfactor
value.  I see no reason why we should do otherwise here...

And if we keep the default value in relopt_enum, we will not need
RELOPT_ENUM_DEFAULT, which, as I said above, I find dubious.
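
To make the alternative concrete, here is a small self-contained sketch
(simplified names, not the actual reloptions structures or values from the
patch): the default lives in the option struct itself, the element array is
terminated by a NULL name, and an unparseable stored value simply falls back
to the default, so no pseudo-name like ((const char *) -1) is needed.

#include <stdio.h>
#include <string.h>

/* Simplified model of the structures under discussion; not the real ones. */
typedef struct
{
	const char *string_val;		/* name accepted in the reloption string */
	int			symbol_val;		/* C enum value it maps to */
} enum_element_definition;

typedef struct
{
	const char *optname;
	const enum_element_definition *elems;	/* terminated by name == NULL */
	int			default_val;	/* default kept here, no pseudo-name needed */
} enum_reloption;

enum
{
	BUFFERING_DISABLED,
	BUFFERING_STATS,
	BUFFERING_AUTO
};

static const enum_element_definition buffering_elems[] = {
	{"on", BUFFERING_STATS},
	{"off", BUFFERING_DISABLED},
	{"auto", BUFFERING_AUTO},
	{NULL, 0}					/* list terminator */
};

static const enum_reloption buffering_option = {
	"buffering", buffering_elems, BUFFERING_AUTO
};

/*
 * Parse a stored value; if it cannot be parsed and we are not validating,
 * fall back to the default rather than erroring out.
 */
static int
parse_enum_reloption(const enum_reloption *opt, const char *value, int validate)
{
	const enum_element_definition *elem;

	for (elem = opt->elems; elem->string_val; elem++)
	{
		if (strcmp(elem->string_val, value) == 0)
			return elem->symbol_val;
	}
	if (validate)
		fprintf(stderr, "invalid value \"%s\" for \"%s\"\n", value, opt->optname);
	return opt->default_val;
}

int
main(void)
{
	printf("%d\n", parse_enum_reloption(&buffering_option, "on", 1));
	printf("%d\n", parse_enum_reloption(&buffering_option, "bogus", 0));
	return 0;
}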


> for (elem = opt_enum->allowed_values; elem->name; elem++)
It is better than what I did. I imported that too.

> if (validate && !parsed)
Oh, yes... That was my mistake. I did not consider the validate == false case.
I should handle it. Thank you for pointing this out.

But I think that we should return the default value if the data in pg_class is
broken.

Maybe I should later write an additional patch that throws a WARNING if
reloptions from pg_class can't be parsed.  The DB admin should know about it, I
think...


The rest I've kept as before. If you think that something else should be
changed, I'd like to see not only the changes, but also some explanation. I
am not sure, for example, that we should use the same enum for the reloption
and for the metapage buffering mode representation. This seems logical, but
it may also confuse. I want to hear all opinions before doing it.

Maybe

typedef enum gist_option_buffering_numeric_values
{
GIST_OPTION_BUFFERING_ON = GIST_BUFFERING_STATS,
GIST_OPTION_BUFFERING_OFF = GIST_BUFFERING_DISABLED,
GIST_OPTION_BUFFERING_AUTO = GIST_BUFFERING_AUTO,
} gist_option_buffering_value_numbers;

will do, and then we can assign one enum to another, keeping the traditional
variable naming?

I would also prefer to keep the whole enum definition in a .h file, not the
enum part in a .h file and the array part in a .c file.

--
Do code for fun.

diff --git a/src/backend/access/common/reloptions.c b/src/backend/access/common/reloptions.c
index 46276ce..4b775ab 100644
--- a/src/backend/access/common/reloptions.c
+++ b/src/backend/access/common/reloptions.c
@@ -404,7 +404,11 @@ static relopt_real realRelOpts[] =
 	{{NULL}}
 };

-static relopt_string stringRelOpts[] =
+static relopt_enum_element_definition gist_buffering_enum_def[] =
+		GIST_OPTION_BUFFERING_ENUM_DEF;
+static relopt_enum_element_definition view_check_option_enum_def[] =
+		VIEW_OPTION_CHECK_OPTION_ENUM_DEF;
+static relopt_enum enumRelOpts[] =
 {
 	{
 		{
@@ -413,10 +417,8 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_GIST,
 			AccessExclusiveLock
 		},
-		4,
-		false,
-		gistValidateBufferingOption,
-		"auto"
+		gist_buffering_enum_def,
+		GIST_OPTION_BUFFERING_AUTO
 	},
 	{
 		{
@@ -425,11 +427,14 @@ static relopt_string stringRelOpts[] =
 			RELOPT_KIND_VIEW,
 			AccessExclusiveLock
 		},
-		0,
-		true,
-		validateWithCheckOption,
-		NULL
+		view_check_option_enum_def,
+		VIEW_OPTION_CHECK_OPTION_NOT_SET
 	},
+	{{NULL}}
+};
+
+static relopt_string stringRelOpts[] =
+{
 	/* list terminator */
 	{{NULL}}
 };
@@ -476,6 +481,12 @@ initialize_reloptions(void)
    realRelOpts[i].gen.lockmode));
 		j++;
 	}
+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		Assert(DoLockModesConflict(enumRelOpts[i].gen.lockmode,
+   enumRelOpts[i].gen.lockmode));
+		j++;
+	}
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		Assert(DoLockModesConflict(stringRelOpts[i].gen.lockmode,
@@ -514,6 +525,14 @@ initialize_reloptions(void)
 		j++;
 	}

+	for (i = 0; enumRelOpts[i].gen.name; i++)
+	{
+		relOpts[j] = &enumRelOpts[i].gen;
+		relOpts[j]->type = RELOPT_TYPE_ENUM;
+		relOpts[j]->namelen = strlen(relOpts[j]->name);
+		j++;
+	}
+
 	for (i = 0; stringRelOpts[i].gen.name; i++)
 	{
 		relOpts[j] = &stringRelOpts[i].gen;
@@ -611,6 +630,9 @@ allocate_reloption(bits32 kinds, int type, const char *name, const char *desc)
 		case RELOPT_TYPE_REAL:
 			size = sizeof(relo

Re: public schema default ACL

2018-03-07 Thread Stephen Frost
Greetings,

* Alvaro Herrera (alvhe...@alvh.no-ip.org) wrote:
> Stephen Frost wrote:
> 
> > * Noah Misch (n...@leadboat.com) wrote:
> 
> > > I like the idea of getting more SQL-compatible, if this presents a 
> > > distinct
> > > opportunity to do so.  I do think it would be too weird to create the 
> > > schema
> > > in one database only.  Creating it on demand might work.  What would be 
> > > the
> > > procedure, if any, for database owners who want to deny object creation in
> > > their databases?
> > 
> > My suggestion was that this would be a role attribute.  If an
> > administrator doesn't wish for that role to have a schema created
> > on-demand at login time, they would set the 'SCHEMA_CREATE' (or whatever
> > we name it) role attribute to false.
> 
> Is a single attribute enough?  I think we need two: one would authorize
> to create the schema $user to the user themselves (maybe
> SELF_SCHEMA_CREATE); another would automatically do so when connecting
> to a database that does not have it (perhaps AUTO_CREATE_SCHEMA).

I don't see a use-case for this SELF_SCHEMA_CREATE attribute and it
seems more likely to cause confusion than to be helpful.  If the admin
sets AUTO_CREATE_SCHEMA for a user then that's what we should do.

> Now, maybe the idea of creating it as soon as a connection is
> established is not great.  What about creating it only when the first
> object creation is attempted and there is no other schema to create in?
> This avoids pointless proliferation of empty user schemas, as well as
> avoiding the overhead of checking the existence of schema $user on each
> connection.

I don't see how creating schemas for roles which the admin has created
with the AUTO_CREATE_SCHEMA option would be pointless.  To not do so
would be confusing, imo.  Consider the user who logs in and doesn't
realize that they're allowed to create a schema and doesn't see a schema
of their own in the list - they aren't going to think "I should just try
to create an object and see if a schema appears"; they're going to ask
the admin why they don't have a schema.

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Hmm.  On first glance that sounds bizarre, but we do something pretty
> similar for the pg_temp schemas, so it could likely be made to work.

While I agree that it might not be that hard to make the code do it,
since we do this for temp schemas, I still don't see real value in it
and instead just a confusing system where schemas "appear" at some
arbitrary point when the user happens to try to create an object without
qualification.

I liken this to a well-known and well-trodden feature for auto creating
user home directories on Unix.  Being different from that for, at best,
rare use-cases which could be handled in other ways is going against
POLA.  If an admin is concerned about too many empty schemas or about
having $user in a search_path and needing to search it, then those are
entirely fixable rather easily, but those are the uncommon cases in my
experience.

> One issue to think about is exactly which $user we intend to make the
> schema for, if we've executed SET SESSION AUTHORIZATION, or are inside
> a SECURITY DEFINER function, etc etc.  I'd argue that only the original
> connection username should get this treatment, which may mean that object
> creation can fail in those contexts.

This just strengthens the "this will be confusing to our users" argument,
imv.

Thanks!

Stephen



Re: GSoC 2017: weekly progress reports (week 6)

2018-03-07 Thread Alvaro Herrera
Andres Freund wrote:

> This appears to be a duplicate of https://commitfest.postgresql.org/17/1466/ 
> - as the other one is older, I'm closing this one.

This comment makes no sense from the POV of the mail archives.  I had to
look at the User-Agent in your email to realize that you wrote it in the
commitfest app.  I see three problems here:

1. impersonating the "From:" header is a bad idea; this needs to be fixed,
   much as we did with the bugs and doc comments submission forms
2. it should have had a header indicating it comes from the CF app
3. it would be great to include in said header a link to the CF entry
   to which the comment was attached.

Thanks

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


