Re: [HACKERS] proposal: schema variables

2018-03-22 Thread Pavel Stehule
2018-03-21 6:24 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-20 18:38 GMT+01:00 Pavel Stehule :
>
>> Hi
>>
>> I am sending a new update. The code is less ugly, and the current
>> functionality is +/- final for the first stage. It should be good enough
>> for playing with and testing this concept.
>>
>> What is supported:
>>
>> 1. scalar, composite and array variables
>> 2. a composite can be defined in place, or an existing composite type can be used
>> 3. a variable, or any field of a variable, can have a defined default value
>> 4. a variable is a database object - access rights are required
>> 5. the values are stored in binary form with the defined typmod
>>
>> Usage is very simple:
>>
>> postgres=# create variable foo as numeric default 0;
>> CREATE VARIABLE
>> postgres=# select foo;
>> ┌─────┐
>> │ foo │
>> ╞═════╡
>> │   0 │
>> └─────┘
>> (1 row)
>>
>> postgres=# let foo = pi();
>> LET
>> postgres=# select foo;
>> ┌──────────────────┐
>> │       foo        │
>> ╞══════════════════╡
>> │ 3.14159265358979 │
>> └──────────────────┘
>> (1 row)
>>
>> postgres=# create variable boo as (x numeric default 0, y numeric default
>> 0);
>> CREATE VARIABLE
>> postgres=# let boo.x = 100;
>> LET
>> postgres=# select boo;
>> ┌─────────┐
>> │   boo   │
>> ╞═════════╡
>> │ (100,0) │
>> └─────────┘
>> (1 row)
>>
>> postgres=# select boo.x;
>> ┌─────┐
>> │  x  │
>> ╞═════╡
>> │ 100 │
>> └─────┘
>> (1 row)
>>
>> Please try it.
>>
>
> small fix - support for SQL functions
>
>

The patch is in the commitfest list: https://commitfest.postgresql.org/18/1608/

Regards

Pavel


>
>> Regards
>>
>> Pavel
>>
>
>


Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Pavan Deolasee
On Fri, Mar 23, 2018 at 10:00 AM, Amit Langote <langote_amit...@lab.ntt.co.jp> wrote:

> On 2018/03/23 3:42, Pavan Deolasee wrote:
> > A slightly improved version attached. Apart from doc cleanup based on
> > earlier feedback, fixed one assertion failure based on Rahila's report.
> > This was happening when target relation is referenced in the source
> > subquery. Fixed that and added a test case to test that situation.
> >
> > Rebased on current master.
>
> I tried these patches (applied 0002 on top of 0001).  When applying 0002,
> I got some apply errors:
>
> The next patch would create the file
> src/test/isolation/expected/merge-delete.out,
> which already exists!  Assume -R? [n]
>
> I managed to apply it by ignoring the errors, but couldn't get make check
> to pass; attached regressions.diffs if you want to take a look.
>


Thanks. Are you sure you're using a clean repo? I suspect you had a previous
version of the patch applied, hence the apply errors now. I also suspect
that you may have made a mistake while resolving the conflicts while
applying the patch (since a file at the same path existed). The failures
also seem related to a past version of the patch.

I just checked with a freshly checked out repo and the patches apply
correctly on the current master and regression passes too.
http://commitfest.cputube.org/ also reported success overnight.



>
> Btw, is 0001 redundant with the latest patch on ON CONFLICT DO UPDATE
> thread?  Can I apply just 0002 on top of that patch?  So, I tried that --
> that is, skipped your 0001 and instead applied ON CONFLICT DO UPDATE
> patch, and then applied your 0002.


Yes.  I should probably rebase my patch on your v9 or just include the
relevant changes in the MERGE patch itself to avoid any dependency right
now. Will check.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 20:10 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-22 19:28 GMT+01:00 Pavel Stehule :
>
>>
>>
>> 2018-03-22 18:38 GMT+01:00 Fabien COELHO :
>>
>>>
>>> Hello Pavel,
>>>
>>> Using \pset format csv means overwriting field sep every time - nobody
>>> uses |

>>>
>>> Yep. The alternative is to have a csv-specific separator variable, which
>>> does not seem very useful, must be remembered, but this is indeed debatable.
>>>
>>> I think the dependency on the order of psql arguments is a significant problem

>>>
>>> This is intentional, and this issue/feature already exists, the last
>>> argument overwrite previous settings thus will win, eg:
>>>
>>>   psql --pset=format=troff --html -c 'SELECT 1'
>>>
>>> Will output in html, not in troff.
>>>
>>
>> Can we introduce some format-specific default separators, if we do not want
>> to introduce a csv_field_sep option?
>>
>> It should not be hard. All formats can have '|' as now, and csv can have
>> ','; then if the field separator is not given explicitly, the default field
>> separator for the format is used, otherwise the specified field separator is used.
>>
>> You can see my idea in attached patch
>>
>> Regards
>>
>> Pavel
>>
>> postgres=# \pset format csv
>> Output format is csv.
>> postgres=# select * from foo;
>> a,b,c
>> 1,2,Hello
>> 3,4,Nazdar
>> postgres=# \pset fieldsep ;
>> Field separator is ";".
>> postgres=# select * from foo;
>> a;b;c
>> 1;2;Hello
>> 3;4;Nazdar
>>
>>
>>
> The default fieldsep should be "off", which means the format defaults are
> used: ',' for CSV, '|' for everything else.
>
> So everything will work as it does now, but there will be more freedom when
> designing new formats. Currently, all formats that allow setting fieldsep
> have to share the default '|', which I think is not practical.
>

I can make a patch, if there is agreement.
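
To make the proposal concrete, the intended behaviour would look roughly like
this (sketch only; nothing here is implemented and the \pset feedback messages
are invented for illustration):

postgres=# \pset fieldsep off
Field separator is unset; the format's default applies.
postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
postgres=# \pset format unaligned
Output format is unaligned.
postgres=# select * from foo;
a|b|c
1|2|Hello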

comments?

Regards

Pavel

>
>
>
>
>
>>
>>
>>
>>
>>>
>>> --
>>> Fabien.
>>>
>>
>>
>


Re: Faster inserts with mostly-monotonically increasing values

2018-03-22 Thread Andrew Dunstan
On Fri, Mar 23, 2018 at 10:20 AM, Claudio Freire  wrote:
> On Thu, Mar 22, 2018 at 3:29 AM, Pavan Deolasee
>  wrote:
>>
>> On Thu, Mar 22, 2018 at 7:22 AM, Claudio Freire 
>> wrote:
>>>
>>>
>>> If you're not planning on making any further changes, would you mind
>>> posting a coalesced patch?
>>>
>>> Notice that I removed the last offset thing only halfway, so it would
>>> need some cleanup.
>>
>>
>> Here is an updated patch. I removed the last offset caching thing completely
>> and integrated your changes for conditional lock access. Some other
>> improvements to test cases  too. I realised that we must truncate and
>> re-insert to test index fastpath correctly.


This patch looks to be in pretty good shape. I have been trying hard to
think of some failure mode but haven't been able to come up with one.

We can put off for another day the question of if/when we can skip the
check for serialization conflicts.

> Some comments
>
> +/*
> + * It's very common to have an index on an auto-incremented or
> + * monotonically increasing value. In such cases, every insertion happens
> + * towards the end of the index. We try to optimise that case by caching
> + * the right-most block of the index. If our cached block is still the
> + * rightmost block, has enough free space to accommodate a new entry and
> + * the insertion key is greater or equal to the first key in this page,
> + * then we can safely conclude that the new key will be inserted in the
> + * cached block. So we simply search within the cached page and insert the
> + * key at the appropriate location. We call it a fastpath.
>
> It should say "the insertion key is strictly greater than the first key"


right.

>
> Equal cases cannot be accepted since it would break uniqueness checks,
> so the comment should say that. The code already is correctly checking
> for strict inequality, it's just the comment that is out of sync.
>
> You might accept equal keys for non-unique indexes, but I don't
> believe making that distinction in the code is worth the hassle.
>
> Also, "rightmost block" != "rightmost leaf page" ("leaf" being the key
> difference). So it should say "rightmost leaf page".


right.

[...]
>
> Setting "offset = InvalidOffsetNumber" in that contorted way is
> unnecessary. You can remove the first assignment and instead
> initialize unconditionally right after the fastpath block (that
> doesn't make use of offset anyway):


Yes, I noticed that and it's confusing. Just set it at the top.



>
> In indexing.out:
>
> +explain select sum(a) from fastpath where a = 6456;
> + QUERY PLAN
> +
> + Aggregate  (cost=4.31..4.32 rows=1 width=8)
> +   ->  Index Only Scan using fpindex1 on fastpath  (cost=0.29..4.30
> rows=1 width=4)
> + Index Cond: (a = 6456)
> +(3 rows)
>
> Having costs in explain tests can be fragile. Better use "explain
> (costs off)". If you run "make check" continuously in a loop, you
> should get some failures related to that pretty quickly.
>



I agree about costs off, but I'm fairly dubious of the value of using
EXPLAIN at all here. Nothing in the patch should result in any change
in EXPLAIN output.

I would probably just have a few regression lines that should be sure
to exercise the code path and leave it at that.


cheers

andrew

-- 
Andrew Dunstan                https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PoC PATCH] Parallel dump to /dev/null

2018-03-22 Thread Michael Paquier
On Thu, Mar 22, 2018 at 08:42:05AM +0100, Michael Banck wrote:
> Am Donnerstag, den 22.03.2018, 10:33 +0900 schrieb Michael Paquier:
>> Hence, is there any objection to mark this patch as returned with
>> feedback?  
> 
> I've done so now.

Thanks Michael.
--
Michael




Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Amit Langote
On 2018/03/23 3:42, Pavan Deolasee wrote:
> A slightly improved version attached. Apart from doc cleanup based on
> earlier feedback, fixed one assertion failure based on Rahila's report.
> This was happening when target relation is referenced in the source
> subquery. Fixed that and added a test case to test that situation.
> 
> Rebased on current master.

I tried these patches (applied 0002 on top of 0001).  When applying 0002,
I got some apply errors:

The next patch would create the file
src/test/isolation/expected/merge-delete.out,
which already exists!  Assume -R? [n]

I managed to apply it by ignoring the errors, but couldn't get make check
to pass; attached regressions.diffs if you want to take a look.

Btw, is 0001 redundant with the latest patch on ON CONFLICT DO UPDATE
thread?  Can I apply just 0002 on top of that patch?  So, I tried that --
that is, skipped your 0001 and instead applied ON CONFLICT DO UPDATE
patch, and then applied your 0002.  I had to fix a couple of places to get
MERGE working correctly for partitioned tables; attached find a delta
patch for the fixes I made, which were needed because I skipped 0001 in
favor of the ON CONFLICT DO UPDATE patch.  But the regression test failure
I mentioned above didn't go away, so it seems to have nothing to do with
partitioning.

Thanks,
Amit
diff --git a/src/backend/executor/nodeModifyTable.c b/src/backend/executor/nodeModifyTable.c
index 00d241f232..4927bfebfa 100644
--- a/src/backend/executor/nodeModifyTable.c
+++ b/src/backend/executor/nodeModifyTable.c
@@ -1552,7 +1552,6 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
 	ExecCheckHeapTupleVisible(estate, &tuple, buffer);
 
 	/* Store target's existing tuple in the state's dedicated slot */
-	ExecSetSlotDescriptor(mtstate->mt_existing, relation->rd_att);
 	ExecStoreTuple(&tuple, mtstate->mt_existing, buffer, false);
 
/*
diff --git a/src/backend/optimizer/prep/preptlist.c b/src/backend/optimizer/prep/preptlist.c
index 3ff8d86853..4a864b2340 100644
--- a/src/backend/optimizer/prep/preptlist.c
+++ b/src/backend/optimizer/prep/preptlist.c
@@ -142,7 +142,7 @@ preprocess_targetlist(PlannerInfo *root)
 			action->targetList =
 				expand_targetlist(action->targetList,
 								  action->commandType,
 								  result_relation,
-								  RelationGetDescr(target_relation));
+								  target_relation);
 			break;
 		case CMD_DELETE:
 			break;
*** /home/amit/pg/mygit/src/test/regress/expected/merge.out	2018-01-30 11:50:31.297108552 +0900
--- /home/amit/pg/mygit/src/test/regress/results/merge.out	2018-03-23 13:18:25.034107527 +0900
***
*** 39,52 
  WHEN MATCHED THEN
DELETE
  ;
!QUERY PLAN   
! 
   Merge on target t
 ->  Merge Join
!  Merge Cond: (t.tid = s.sid)
   ->  Sort
!Sort Key: t.tid
!->  Seq Scan on target t
   ->  Sort
 Sort Key: s.sid
 ->  Seq Scan on source s
--- 39,52 
  WHEN MATCHED THEN
DELETE
  ;
! QUERY PLAN
! --
   Merge on target t
 ->  Merge Join
!  Merge Cond: (t_1.tid = s.sid)
   ->  Sort
!Sort Key: t_1.tid
!->  Seq Scan on target t_1
   ->  Sort
 Sort Key: s.sid
 ->  Seq Scan on source s
***
*** 137,142 
--- 137,154 
INSERT DEFAULT VALUES
  ;
  ERROR:  permission denied for table target2
+ -- check if the target can be accessed from source relation subquery; we should
+ -- not be able to do so
+ MERGE INTO target t
+ USING (SELECT * FROM source WHERE t.tid > sid) s
+ ON t.tid = s.sid
+ WHEN NOT MATCHED THEN
+   INSERT DEFAULT VALUES
+ ;
+ ERROR:  invalid reference to FROM-clause entry for table "t"
+ LINE 2: USING (SELECT * FROM source WHERE t.tid > sid) s
+   ^
+ HINT:  There is an entry for table "t", but it cannot be referenced from this part of the query.
  --
  -- initial tests
  --
***
*** 229,242 
  WHEN MATCHED THEN
UPDATE SET balance = 0
  ;
!QUERY PLAN   
! 
   Merge on target t
 ->  Hash Join
!  Hash Cond: (s.sid = t.tid)
   ->  Seq Scan on source s

Re: [HACKERS] why not parallel seq scan for slow functions

2018-03-22 Thread Amit Kapila
On Tue, Mar 20, 2018 at 1:23 AM, Robert Haas  wrote:
> On Sat, Mar 17, 2018 at 1:16 AM, Amit Kapila  wrote:
>> Test-1
>> --
>> DO $$
>> DECLARE count integer;
>> BEGIN
>> For count In 1..100 Loop
>> Execute 'explain Select ten from tenk1';
>> END LOOP;
>> END;
>> $$;
>>
>> In the above block, I am explaining the simple statement which will
>> have just one path, so there will be one additional path projection
>> and removal cycle for this statement.  I have just executed the above
>> block in psql by having \timing option 'on' and the average timing for
>> ten runs on HEAD is  21292.388 ms, with patches (0001.* ~ 0003) is
>> 22405.2466 ms and with patches (0001.* ~ 0005.*) is 22537.1362.  These
>> results indicate that there is approximately 5~6% of the increase in
>> planning time.
>
> Ugh.  I'm able to reproduce this, more or less -- with master, this
> test took 42089.484 ms, 41935.849 ms, 42519.336 ms on my laptop, but
> with 0001-0003 applied, 43925.959 ms, 43619.004 ms, 43648.426 ms.
> However I have a feeling there's more going on here, because the
> following patch on top of 0001-0003 made the time go back down to
> 42353.548, 41797.757 ms, 41891.194 ms.
>
..
>
> It seems pretty obvious that creating an extra projection path that is
> just thrown away can't "really" be making this faster, so there's
> evidently some other effect here involving how the code is laid out,
> or CPU cache effects, or, uh, something.
>

Yeah, sometimes that kind of stuff changes performance characteristics,
but I think what is going on here is that create_projection_plan is
causing the lower node to build a physical tlist, which takes some
additional time.  I have tried the change below on top of the patch series
and it brings back the performance for me.

@@ -1580,7 +1580,7 @@ create_projection_plan(PlannerInfo *root,
ProjectionPath *best_path, int flags)
List   *tlist;

/* Since we intend to project, we don't need to constrain child tlist */
-   subplan = create_plan_recurse(root, best_path->subpath, 0);
+   subplan = create_plan_recurse(root, best_path->subpath, flags);

Another point I have noticed in
0001-Teach-create_projection_plan-to-omit-projection-wher  patch:

-create_projection_plan(PlannerInfo *root, ProjectionPath *best_path)
+create_projection_plan(PlannerInfo *root, ProjectionPath *best_path, int flags)
{
..
+ else if ((flags & CP_LABEL_TLIST) != 0)
+ {
+ tlist = copyObject(subplan->targetlist);
+ apply_pathtarget_labeling_to_tlist(tlist, best_path->path.pathtarget);
+ }
+ else
+ return subplan;
..
}

Before returning subplan, don't we need to copy the cost estimates from
best_path, as is done in the same function a few lines later?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: [HACKERS] taking stdbool.h into use

2018-03-22 Thread Peter Eisentraut
On 3/22/18 22:17, Peter Eisentraut wrote:
> So after a day, only the old macOS PowerPC systems have sizeof(bool) == 4.

And this is causing some problems in PL/Perl.  I'll look into it.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-22 Thread Amit Langote
On 2018/03/22 20:48, Pavan Deolasee wrote:
> Thanks. It's looking much better now.

Thanks.

> I think we can possibly move all ON
> CONFLICT related members to a separate structure and just copy the pointer
> to the structure if (map == NULL). That might make the code a bit more tidy.

OK, I tried that in the attached updated patch.

> Is there anything that needs to be done for transition tables? I checked
> and didn't see anything, but please check.

There doesn't seem to be anything that this patch has to do for transition
tables.  If you look at the tests I added in triggers.sql which exercise
INSERT ON CONFLICT's interaction with transition tables, you can see that
we get the same output for a partitioned table as we get for a normal table.
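
For context, here is a minimal sketch (not taken from the patch's regression
tests; table and column names are invented) of the statement shape the patch
is meant to enable on partitioned tables:

create table events (id int, payload text) partition by range (id);
create table events_1 partition of events for values from (1) to (1000);
create unique index on events (id);

-- with the patch, the conflict is detected and resolved in the partition
-- the row routes to; previously the DO UPDATE form was rejected outright
insert into events values (1, 'new')
  on conflict (id) do update set payload = excluded.payload;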

Thanks,
Amit
From d385c307fbe98935661d7b983229eb5b2e2e6436 Mon Sep 17 00:00:00 2001
From: amit 
Date: Tue, 20 Mar 2018 10:09:38 +0900
Subject: [PATCH v9] Fix ON CONFLICT to work with partitioned tables

Author: Amit Langote, Alvaro Herrera, Etsuro Fujita
---
 doc/src/sgml/ddl.sgml |  15 --
 doc/src/sgml/ref/insert.sgml  |   8 +
 src/backend/catalog/partition.c   |  83 +++--
 src/backend/executor/execMain.c   |   4 +
 src/backend/executor/execPartition.c  | 252 --
 src/backend/executor/nodeModifyTable.c|  82 +++--
 src/backend/parser/analyze.c  |   7 -
 src/include/catalog/partition.h   |   1 +
 src/include/nodes/execnodes.h |  25 ++-
 src/test/regress/expected/insert_conflict.out |  86 +++--
 src/test/regress/expected/triggers.out|  33 
 src/test/regress/sql/insert_conflict.sql  |  72 +++-
 src/test/regress/sql/triggers.sql |  33 
 13 files changed, 616 insertions(+), 85 deletions(-)

diff --git a/doc/src/sgml/ddl.sgml b/doc/src/sgml/ddl.sgml
index 3a54ba9d5a..8805b88d82 100644
--- a/doc/src/sgml/ddl.sgml
+++ b/doc/src/sgml/ddl.sgml
@@ -3324,21 +3324,6 @@ ALTER TABLE measurement ATTACH PARTITION 
measurement_y2008m02
 
  
   
-   Using the ON CONFLICT clause with partitioned tables
-   will cause an error if the conflict target is specified (see
-for more details on how the clause
-   works).  Therefore, it is not possible to specify
-   DO UPDATE as the alternative action, because
-   specifying the conflict target is mandatory in that case.  On the other
-   hand, specifying DO NOTHING as the alternative action
-   works fine provided the conflict target is not specified.  In that case,
-   unique constraints (or exclusion constraints) of the individual leaf
-   partitions are considered.
-  
- 
-
- 
-  
When an UPDATE causes a row to move from one
partition to another, there is a chance that another concurrent
UPDATE or DELETE misses this row.
diff --git a/doc/src/sgml/ref/insert.sgml b/doc/src/sgml/ref/insert.sgml
index 134092fa9c..62e142fd8e 100644
--- a/doc/src/sgml/ref/insert.sgml
+++ b/doc/src/sgml/ref/insert.sgml
@@ -518,6 +518,14 @@ INSERT INTO table_name [ AS 
+
+   
+Note that it is currently not supported for the
+ON CONFLICT DO UPDATE clause of an
+INSERT applied to a partitioned table to update the
+partition key of a conflicting row such that it requires the row be moved
+to a new partition.
+   

 
  It is often preferable to use unique index inference rather than
diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 53855f5088..bfe559490e 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -192,6 +192,7 @@ static int	get_partition_bound_num_indexes(PartitionBoundInfo b);
 static int	get_greatest_modulus(PartitionBoundInfo b);
 static uint64 compute_hash_value(int partnatts, FmgrInfo *partsupfunc,
 								 Datum *values, bool *isnull);
+static Oid	get_partition_parent_recurse(Relation inhRel, Oid relid, bool getroot);
 
 /*
  * RelationBuildPartitionDesc
@@ -1377,6 +1378,7 @@ check_default_allows_bound(Relation parent, Relation 
default_rel,
 
 /*
  * get_partition_parent
+ * Obtain direct parent of given relation
  *
  * Returns inheritance parent of a partition by scanning pg_inherits
  *
@@ -1387,14 +1389,59 @@ check_default_allows_bound(Relation parent, Relation 
default_rel,
 Oid
 get_partition_parent(Oid relid)
 {
-   Form_pg_inherits form;
-   RelationcatalogRelation;
+   RelationinhRel;
+   Oid parentOid;
+
+   inhRel = heap_open(InheritsRelationId, AccessShareLock);
+
+   parentOid = get_partition_parent_recurse(inhRel, relid, false);
+   if (parentOid == InvalidOid)
+   elog(ERROR, "could not find parent of relation %u", relid);
+
+   heap_close(inhRel, AccessShareLock);
+
+   return 

Re: [HACKERS] taking stdbool.h into use

2018-03-22 Thread Peter Eisentraut
On 3/21/18 07:48, Peter Eisentraut wrote:
> On 3/21/18 01:51, Tom Lane wrote:
>> Andres Freund  writes:
>>> On March 20, 2018 8:24:41 PM PDT, Michael Paquier  
>>> wrote:
 Yeah, I agree with that.  Just not using stdbool.h in those cases ought
 to be fine.  Any platforms where sizeof(bool) is 4 involves macos older
 than 10.5 and Windows platforms using MSVC versions older than 2003
 (didn't look further down either).
>>
>>> Aren't there some somewhat modern architectures where that's still the 
>>> case, for performance reasons? PPC or such?
>>
>> Well, hydra (F16 on ppc64) has sizeof(bool) = 1.  Don't have any other
>> good datapoints handy.  Presumably we'd set up configure to report
>> what it found out, so it wouldn't take long to survey the buildfarm.
> 
> I've pushed the configure tests.  Let's see what they say.

So after a day, only the old macOS PowerPC systems have sizeof(bool) == 4.

I have committed the rest of this patch now.

Windows could use some manual adjustments in pg_config.h.win32 if anyone
cares.  (That file also needs some more updates.)

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Peter Geoghegan
On Thu, Mar 22, 2018 at 6:02 PM, Alvaro Herrera  wrote:
> Incremental development is a good thing.  Trying to do everything in a
> single commit is great when time is infinite or even merely very long,
> but if you run out of it, which I'm sure is common, leaving some things
> out that can be reasonably implemented in a separate patch is perfectly
> acceptable.

We're talking about something that took me less than an hour to get
working. AFAICT, it's just a matter of tweaking the grammar, and
adding a bit of transformWithClause() boilerplate to the start of
transformMergeStmt().
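
For the record, the kind of statement that WITH support would allow looks
roughly like this (illustrative only; the column names are made up and the
patch as posted rejects the statement):

WITH recent AS (
    SELECT sid, balance FROM source
    WHERE updated_at > now() - interval '1 day'
)
MERGE INTO target t
USING recent s ON t.tid = s.sid
WHEN MATCHED THEN
    UPDATE SET balance = s.balance
WHEN NOT MATCHED THEN
    INSERT (tid, balance) VALUES (s.sid, s.balance);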

As I've pointed out on this thread already, I'm often concerned about
supporting functionality like this because it increases my overall
confidence in the design. If it was genuinely hard to add WITH clause
support, then that would probably tell us something about the overall
design that likely creates problems elsewhere. It's easy to say that
it isn't worth holding the patch up for WITH clause support, because
that's true, but it's also beside the point.

-- 
Peter Geoghegan



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Alvaro Herrera
Peter Geoghegan wrote:

> Pavan hasn't added support for referencing CTEs, which other database
> systems with MERGE have. I think that it ought to be quite doable. It
> didn't take me long to get it working myself, but there wasn't follow
> through on that (I could have posted the patch, which looked exactly
> as you'd expect it to look). I think that we should add support for
> CTEs now, as I see no reason for the omission.

Incremental development is a good thing.  Trying to do everything in a
single commit is great when time is infinite or even merely very long,
but if you run out of it, which I'm sure is common, leaving some things
out that can be reasonably implemented in a separate patch is perfectly
acceptable.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



bugfix: a minor mistake in brin_inclusion.c comment

2018-03-22 Thread Tomas Vondra
Hi,

while looking at brin_inclusion.c, I've noticed that the comment about
INCLUSION_UNMERGEABLE and INCLUSION_CONTAINS_EMPTY uses incorrect values
(1 instead of 2). Attached is a simple fix.

But perhaps it would be better to use the constants in the comment.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
diff --git a/src/backend/access/brin/brin_inclusion.c b/src/backend/access/brin/brin_inclusion.c
index 2542877..d8042b0 100644
--- a/src/backend/access/brin/brin_inclusion.c
+++ b/src/backend/access/brin/brin_inclusion.c
@@ -60,9 +60,9 @@
  * The values stored in the bv_values arrays correspond to:
  *
  * 0 - the union of the values in the block range
- * 1 - whether an empty value is present in any tuple in the block range
- * 2 - whether the values in the block range cannot be merged (e.g. an IPv6
+ * 1 - whether the values in the block range cannot be merged (e.g. an IPv6
  *	   address amidst IPv4 addresses).
+ * 2 - whether an empty value is present in any tuple in the block range
  */
 #define INCLUSION_UNION			0
 #define INCLUSION_UNMERGEABLE		1


Re: Using base backup exclusion filters to reduce data transferred with pg_rewind

2018-03-22 Thread Michael Paquier
On Fri, Mar 23, 2018 at 01:38:38AM +0900, Fujii Masao wrote:
> Personally it looks very intrusive, so I'm feeling inclined to push
> the changes without that refactoring.

Okay.  Just moving the list of items from basebackup.c to a dedicated
header is not sufficient though, as things like RELCACHE_INIT_FILENAME are
declared in headers which are backend-only.  The same applies to
LOG_METAINFO_DATAFILE_TMP, PG_AUTOCONF_FILENAME, PG_STAT_TMP_DIR and
PG_DYNSHMEM_DIR.

BACKUP_LABEL_FILE and TABLESPACE_MAP can be included though via xlog.h.
So what are you looking for?  I see a couple of options:
1) The inclusive refactoring, which you are discarding.
2) A dedicated header, but some of the now-not-hardcoded values will
need to be so.  That's the list I am giving above.
3) A copy of the list from basebackup.c to src/bin/pg_rewind/.

I would guess that you are looking for 2), but I am not sure if you
imply that 3) would be acceptable or not.
--
Michael




Re: TOAST table created for partitioned tables

2018-03-22 Thread Amit Langote
On 2018/03/23 2:51, Robert Haas wrote:
> On Wed, Jan 17, 2018 at 1:03 AM, Amit Langote  wrote:
>> On Wed, Jan 17, 2018 at 1:51 PM, Michael Paquier
>>  wrote:
>>> On Tue, Jan 16, 2018 at 11:38:58PM -0500, Tom Lane wrote:
 Yeah, pg_upgrade already has to cope with cases where the newer version
 thinks a table needs a toast table when the older version didn't, or
 vice versa.  This looks like it ought to fall into that category.
 Not that testing it wouldn't be a good idea.
>>>
>>> As far as I can see this statement is true. If you create a parent
>>> partition table in a v10 cluster, and then upgrade to HEAD with this
>>> patch applied, you'll be able to notice that the relation still has its
>>> toast table present, while newly-created parent partitions would have
>>> nothing. (Just tested, I didn't review the patch in details).
>>
>> Thanks for checking.  I too checked that pg_upgrading v10 cluster
>> containing partitioned tables that have a TOAST table attached to it
>> works normally and like Michael says, the TOAST table remains.
> 
> I have committed your patch.

Thank you!

Regards,
Amit




Re: Typo with pg_multixact/offset in multixact.c

2018-03-22 Thread Michael Paquier
On Thu, Mar 22, 2018 at 01:36:59PM -0400, Robert Haas wrote:
> Committed.

Thanks.
--
Michael




Re: Removing useless DISTINCT clauses

2018-03-22 Thread David Rowley
On 22 March 2018 at 21:16, Laurenz Albe  wrote:
> Would it be very difficult to extend that to "if any unique constraints are
> contained in the DISTINCT clause"?

Unfortunately, it's quite a bit more work to make that happen. It's
not just unique constraints that are required to make this work. We
also need to pay attention to NOT NULL constraints, as we're
unable to prove functional dependency when the keys have NULLs, since
there may be any number of rows containing NULL values.
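
To make the case concrete, a minimal sketch (table and column names invented
for the example): with the primary key in the DISTINCT clause every output
row is already unique, so the DISTINCT is redundant; with only a plain UNIQUE
constraint, several rows may carry NULL keys, and the same conclusion cannot
be drawn without a NOT NULL proof.

create table orders (
    id     int primary key,   -- implies unique and NOT NULL
    status text
);

-- id alone already identifies each row, so the de-duplication step
-- could in principle be skipped by the planner
select distinct id, status from orders;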

The problem is that in order to properly invalidate cached plans we
must record the constraint OIDs which the plan depends on.  We can do
that for PK and UNIQUE constraints; unfortunately, we can't do it for
NOT NULL constraints as, at the moment, these are just properties of
pg_attribute.  For this to be made to work we'd need to store those in
pg_constraint too, which is more work than I'm going to do for this
patch.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Faster inserts with mostly-monotonically increasing values

2018-03-22 Thread Claudio Freire
On Thu, Mar 22, 2018 at 3:29 AM, Pavan Deolasee
 wrote:
>
> On Thu, Mar 22, 2018 at 7:22 AM, Claudio Freire 
> wrote:
>>
>>
>> If you're not planning on making any further changes, would you mind
>> posting a coalesced patch?
>>
>> Notice that I removed the last offset thing only halfway, so it would
>> need some cleanup.
>
>
> Here is an updated patch. I removed the last offset caching thing completely
> and integrated your changes for conditional lock access. Some other
> improvements to test cases  too. I realised that we must truncate and
> re-insert to test index fastpath correctly.

Thanks.

Some comments

+/*
+ * It's very common to have an index on an auto-incremented or
+ * monotonically increasing value. In such cases, every insertion happens
+ * towards the end of the index. We try to optimise that case by caching
+ * the right-most block of the index. If our cached block is still the
+ * rightmost block, has enough free space to accommodate a new entry and
+ * the insertion key is greater or equal to the first key in this page,
+ * then we can safely conclude that the new key will be inserted in the
+ * cached block. So we simply search within the cached page and insert the
+ * key at the appropriate location. We call it a fastpath.

It should say "the insertion key is strictly greater than the first key"

Equal cases cannot be accepted since it would break uniqueness checks,
so the comment should say that. The code already is correctly checking
for strict inequality, it's just the comment that is out of sync.

You might accept equal keys for non-unique indexes, but I don't
believe making that distinction in the code is worth the hassle.

Also, "rightmost block" != "rightmost leaf page" ("leaf" being the key
difference). So it should say "rightmost leaf page".


+if (P_ISLEAF(lpageop) && P_RIGHTMOST(lpageop) &&
+!P_INCOMPLETE_SPLIT(lpageop) &&
+!P_IGNORE(lpageop) &&
+(PageGetFreeSpace(page) > itemsz) &&
+PageGetMaxOffsetNumber(page) >= P_FIRSTDATAKEY(lpageop) &&
+_bt_compare(rel, natts, itup_scankey, page,
+P_FIRSTDATAKEY(lpageop)) > 0)
+{
+offset = InvalidOffsetNumber;
+fastpath = true;
+}
...later...
+if (!fastpath)
+{
+/* find the first page containing this key */
+stack = _bt_search(rel, natts, itup_scankey, false, , BT_WRITE,
+   NULL);
+
+offset = InvalidOffsetNumber;

Setting "offset = InvalidOffsetNumber" in that contorted way is
unnecessary. You can remove the first assignment and instead
initialize unconditionally right after the fastpath block (that
doesn't make use of offset anyway):

In indexing.out:

+explain select sum(a) from fastpath where a = 6456;
+ QUERY PLAN
+
+ Aggregate  (cost=4.31..4.32 rows=1 width=8)
+   ->  Index Only Scan using fpindex1 on fastpath  (cost=0.29..4.30
rows=1 width=4)
+ Index Cond: (a = 6456)
+(3 rows)

Having costs in explain tests can be fragile. Better use "explain
(costs off)". If you run "make check" continuously in a loop, you
should get some failures related to that pretty quickly.
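
For illustration, a costs-off version of that test would produce output shaped
roughly like this (hypothetical output, not the patch's actual expected file):

explain (costs off) select sum(a) from fastpath where a = 6456;
                    QUERY PLAN
--------------------------------------------------
 Aggregate
   ->  Index Only Scan using fpindex1 on fastpath
         Index Cond: (a = 6456)
(3 rows)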

+truncate fastpath;
+insert into fastpath select y.x, 'b' || (y.x/10)::text, 100 from
(select generate_series(1,1) as x) y;
+vacuum fastpath;

Vacuum can also be a pain. In parallel tests, vacuum won't do what you
expect it to do, as other concurrent transactions will prevent it
from, say, marking all-visible pages.

In fact, you can get those spurious failures by running "make -j8
check-world" repeatedly (with tap tests enabled). That causes enough
concurrency to cause trouble and sporadic failures.

Question whether you need it. I'm not sure why you need the explain
tests, which I imagine are the reason why you need that vacuum there.

If you do need it, I successfully used the following helper in another patch:

CREATE FUNCTION wait_barrier() RETURNS void LANGUAGE plpgsql AS $$
DECLARE vxids text[];
BEGIN
  -- wait for all transactions to commit to make deleted tuples vacuumable
  SELECT array_agg(DISTINCT virtualtransaction) INTO vxids
    FROM pg_locks WHERE pid <> pg_backend_pid();
  WHILE (SELECT count(virtualtransaction) FROM pg_locks
         WHERE pid <> pg_backend_pid()
           AND virtualtransaction = ANY(vxids)) > 0 LOOP
    PERFORM pg_sleep(0.1);
  END LOOP;
END
$$;

You call it right before vacuum to make sure all potentially
troublesome transactions finished by the time vacuum runs:

SELECT wait_barrier();
VACUUM blah;

Name the function something else, though, to avoid name clashing with
that patch. Also... that function up there is also still up for review
itself, so take it with a grain of salt. It worked for 

Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Peter Geoghegan
On Thu, Mar 22, 2018 at 1:06 AM, Simon Riggs  wrote:
> I'm now proposing that I move to commit this, following my own final
> review, on Tues 27 Mar in 5 days time, giving time for cleanup of
> related issues.
>
> If there are any items you believe are still open, please say so now
> or mention any other objections you have.

I would like to see clarity on the use of multiple RTEs for the target
table by MERGE. See my remarks to Pavan just now. Also, I think that
the related issue of how partitioning was implemented needs to get a
lot more review (partitioning is the whole reason for using multiple
RTEs, last I checked). It would be easy enough to fix the multiple
RTEs issue if partitioning wasn't a factor. I didn't manage to do much
review of partitioning at all. I don't really understand how the patch
implements partitioning. Input from a subject matter expert might help
matters quite a lot.

Pavan hasn't added support for referencing CTEs, which other database
systems with MERGE have. I think that it ought to be quite doable. It
didn't take me long to get it working myself, but there wasn't follow
through on that (I could have posted the patch, which looked exactly
as you'd expect it to look). I think that we should add support for
CTEs now, as I see no reason for the omission.

In general, I still think that this patch could do with more review,
but I'm running out of time. If you want to commit it, I will not
explicitly try to block it, but I do have misgivings about your
timeframe.

-- 
Peter Geoghegan



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Peter Geoghegan
On Thu, Mar 22, 2018 at 12:59 AM, Simon Riggs  wrote:
>> The point here is that we're primarily talking about two whole tables.
>> That deserves such prominent placement, as that suggests where users
>> might really find MERGE useful, but without being too prescriptive.
>
> The information I have is that many people are expecting MERGE to work
> for OLTP since that is how it is used in other databases, not solely
> as an ETL command.

I'm sure that that's true, which is why I said "...without being too
prescriptive".

> So we're not primarily talking about two whole tables.

Sure, but that's where MERGE is going to be compelling. Especially for
Postgres, which already has ON CONFLICT DO UPDATE.
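
As a rough sketch of the contrast (reusing the target/source names from the
regression tests, with assumed columns and a unique constraint on target.tid;
the MERGE syntax is the patch's, so treat it as illustrative only):

-- single-row upsert: atomic INSERT-or-UPDATE in READ COMMITTED
INSERT INTO target VALUES (1, 100)
ON CONFLICT (tid) DO UPDATE SET balance = EXCLUDED.balance;

-- synchronizing target from source, with several possible outcomes per row
MERGE INTO target t
USING source s ON t.tid = s.sid
WHEN MATCHED AND s.balance = 0 THEN
    DELETE
WHEN MATCHED THEN
    UPDATE SET balance = s.balance
WHEN NOT MATCHED THEN
    INSERT (tid, balance) VALUES (s.sid, s.balance);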

>> Also, instead of saying "There are a variety of differences and
>> restrictions between the two statement types [MERGE and INSERT ... ON
>> CONFLICT DO UPDATE] and they are not interchangeable", you could
>> instead be specific, and say:
>>
>> MERGE is well suited to synchronizing two tables using multiple
>> complex conditions. Using INSERT with ON CONFLICT DO UPDATE works well
>> when requirements are simpler. Only ON CONFLICT provides an atomic
>> INSERT or UPDATE outcome in READ COMMITTED mode.
>>
>> BTW, the docs should be clear on the fact that "INSERT ... ON
>> CONFLICT" isn't a statement. INSERT is. ON CONFLICT is a clause.
>
> I think it would be better if you wrote a separate additional doc
> patch to explain all of this, perhaps in Performance Tips section or
> otherwise.

I don't think that it has much to do with performance.

-- 
Peter Geoghegan



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Peter Geoghegan
On Thu, Mar 22, 2018 at 11:42 AM, Pavan Deolasee
 wrote:
> A slightly improved version attached.

You still need to remove this change:

> diff --git a/src/include/miscadmin.h b/src/include/miscadmin.h
> index a4574cd533..dbfb5d2a1a 100644
> --- a/src/include/miscadmin.h
> +++ b/src/include/miscadmin.h
> @@ -444,5 +444,6 @@ extern bool has_rolreplication(Oid roleid);
>  /* in access/transam/xlog.c */
>  extern bool BackupInProgress(void);
>  extern void CancelBackup(void);
> +extern int64 GetXactWALBytes(void);

I see that we're still using two target RTEs in this latest revision,
v23e -- the new approach to partitioning, which I haven't had time to
study in more detail, has not produced a change there. This causes
weird effects, such as the following:

"""
pg@~[20658]=# create table foo(bar int4);
CREATE TABLE
pg@~[20658]=# merge into foo f using (select 1 col) dd on f.bar=dd.col
when matched then update set bar = f.barr + 1;
ERROR:  column f.barr does not exist
LINE 1: ...n f.bar=dd.col when matched then update set bar = f.barr + 1...
 ^
HINT:  Perhaps you meant to reference the column "f.bar" or the column "f.bar".

"""

While I think this particular HINT buglet is pretty harmless, I
continue to be concerned about the unintended consequences of having
multiple RTEs for MERGE's target table. Each RTE comes from a
different lookup path -- the first one goes through setTargetTable()'s
parserOpenTable() + addRangeTableEntryForRelation() calls. The second
one goes through transformFromClauseItem(), for the join, which
ultimately ends up calling transformTableEntry()/addRangeTableEntry().
Consider commit 5f173040, which fixed a privilege escalation security
bug around multiple name lookup. Could the approach taken by MERGE
here introduce a similar security issue?

Using GDB, I see two calls to RangeVarGetRelidExtended() when a simple
MERGE is executed. They both have identical relation arguments, that
look like this:

(gdb) p *relation
$4 = {
  type = T_RangeVar,
  catalogname = 0x0,
  schemaname = 0x0,
  relname = 0x5600ebdcafb0 "foo",
  inh = 1 '\001',
  relpersistence = 112 'p',
  alias = 0x5600ebdcb048,
  location = 11
}

This seems like something that needs to be explained, at a minimum.
Even if I'm completely wrong about there being a security hazard,
maybe the suggestion that there might be still gives you some idea of
what I mean about unintended consequences.

-- 
Peter Geoghegan



Re: [PATCH] Verify Checksums during Basebackups

2018-03-22 Thread Michael Banck
Hi Magnus,

thanks a lot for looking at my patch!

Am Donnerstag, den 22.03.2018, 15:07 +0100 schrieb Magnus Hagander:
> On Sat, Mar 17, 2018 at 10:34 PM, Michael Banck  
> wrote:
> > On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote:
> > > Possibly open questions:
> > >
> > > 1. I have not so far changed the replication protocol to make verifying
> > > checksums optional. I can go about that next if the consensus is that we
> > > need such an option (and cannot just check it everytime)?
> > 
> > I think most people (including those I had off-list discussions about
> > this with) were of the opinion that such an option should be there, so I
> > added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP
> > replication command and also an option -k / --no-verify-checksums to
> > pg_basebackup to trigger this.
> > 
> > Updated patch attached
> > 
> 
> Notes:
> 
> +   if (checksum_failure == true)
> 
> Really, just if(checksum_failure)
> 
> +           errmsg("checksum mismatch during basebackup")));
> 
> Should be "base backup" in messages.

I've changed both.

> +static const char *skip[] = {
> 
> I think that needs a much better name than just "skip". Skip for what?
> In particular since we are just skipping it for checksums, and not for
> the actual basebackup, that name is actively misinforming.

I have copied that verbatim from the online checksum patch, but of
course this is in src/backend/replication and not src/bin so warrants
more scrutiny. If you plan to commit both for v11, it might make sense
to have that separated out to a more central place?

But I guess what we mean is a test for "is a heap file". Do you have a
good suggestion where it should end up so that pg_verify_checksums can
use it as well?

In the meantime, I've changed the skip[] array to no_heap_files[] and
the skipfile() function to is_heap_file(), also reversing the logic. If
it helps pg_verify_checksums, we could make is_not_a_heap_file()
instead.

> +   filename = basename(pstrdup(readfilename));
> +   if (!noverify_checksums && DataChecksumsEnabled() &&
> +       !skipfile(filename) &&
> +       (strncmp(readfilename, "./global/", 9) == 0 ||
> +       strncmp(readfilename, "./base/", 7) == 0 ||
> +       strncmp(readfilename, "/", 1) == 0))
> +       verify_checksum = true;
> 
> I would include the checks for global, base etc into the skipfile()
> function as well (also renamed).

Check. I had to change the way (the previous) skipfile() works a bit,
because it was expecting a filename as argument, while we check
pathnames in the above.

> +                * Only check pages which have not been modified since the
> +                * start of the base backup.
> 
> I think this needs a description of why, as well (without having read
> this thread, this is a pretty subtle case).

I tried to expand on this some more.

> +system_or_bail 'dd', 'conv=notrunc', 'oflag=seek_bytes', 'seek=4000', 
> 'bs=9', 'count=1', 'if=/dev/zero', "of=$pgdata/$pg_class";
> 
> This part of the test will surely fail on Windows, not having a
> /dev/zero. Can we easily implement this part natively in perl perhaps?

Right, this was one of the open questions. I now came up with a perl 4-
liner that seems to do the trick, but I can't test it on Windows.

> Most of that stuff is trivial and can be cleaned up at commit time. Do
> you want to send an updated patch with a few of those fixes, or should
> I clean it?

I've attached a new patch, but I have not addressed the question whether
skipfile()/is_heap_file() should be moved somewhere else yet.

I found one more cosmetic issue: if there is an external tablespace, and
pg_basebackup encounters corruption, you would get the message
"pg_basebackup: changes to tablespace directories will not be undone"
from cleanup_directories_atexit(), which I now also suppress in case of
checksum failures.

> The test thing is a stopper until we figure that one out though. And
> while at it -- it seems we don't have any tests for the checksum
> feature in general. It would probably make sense to consider that at
> the same time as figuring out the right way to do this one.

I don't want to deflect work, but it seems to me the online checksums
patch would be in a better position to generally test checksums while
it's at it. Or did you mean something related to pg_basebackup?



Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuerdiff --git a/doc/src/sgml/protocol.sgml b/doc/src/sgml/protocol.sgml
index 366b684293..ab9f2fb93a 100644
--- a/doc/src/sgml/protocol.sgml
+++ b/doc/src/sgml/protocol.sgml
@@ -2382,7 +2382,7 @@ The commands accepted in replication mode are:
   
 
   
-BASE_BACKUP [ LABEL 'label' ] [ PROGRESS ] [ FAST ] [ 

Re: Error detail/hint style fixup

2018-03-22 Thread Tom Lane
Daniel Gustafsson  writes:
> Being a two-space guy I would like the style to remain that, but I also agree
> that the churn is way too expensive and that it’s considered quite obsolete 
> and
> old by many.  Cutting that change from the patch makes the remainder more
> palatable.

This stuff seems reasonably non-controversial, so pushed.

I couldn't resist the temptation to mess about with dblink_res_error
a bit more, too.

BTW, really the point of what I'd mentioned before was to avoid having
dblink_res_error constructing a message out of fragments, which it's
still doing.  I'd thought perhaps we would shove the responsibility for
mentioning the connection name out to the callers to get rid of that.
But handling the possibility of an unnamed connection seems like it'd
complicate the callers considerably.  And as long as we don't actually
have translation support in that module, it'd just be make-work, so
I didn't do it.

regards, tom lane



Re: [HACKERS] Surjective functional indexes

2018-03-22 Thread Alvaro Herrera
The whole IsProjectionFunctionalIndex looks kinda bogus/ugly to me.  Set
the boolean to false, but keep evaluating anyway?  But then, I thought
the idea was to do this based on the reloption, not by comparing the
expression cost to a magical (unmodifiable) value?

In RelationGetIndexAttrBitmap(), indexattrs no longer gets the columns
corresponding to projection indexes.  Isn't that weird/error
prone/confusing?  I think it'd be saner to add these bits to both
bitmaps.

Please update the comments ending in heapam.c:4188, and generally all
comments that you should update.

Please keep serial_schedule in sync with parallel_schedule.

Also, pgindent.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] Surjective functional indexes

2018-03-22 Thread Alvaro Herrera
The rd_projidx (list of each nth element in the index list that is a
projection index) thing looks odd.  Wouldn't it make more sense to have
a list of index OIDs that are projection indexes?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: Re: reorganizing partitioning code

2018-03-22 Thread Alvaro Herrera
Robert Haas wrote:
> On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
>  wrote:
> > Let's keep this entry open till the last minute.
> 
> Ugh, I don't like keeping things open till the last minute all that
> much, especially if they're not being updated.  But since this has
> been updated I guess it's somewhat moot.
> 
> Are you going to pick this up RSN?

If during next week qualifies as RSN, then yes.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: Re: reorganizing partitioning code

2018-03-22 Thread Robert Haas
On Wed, Mar 21, 2018 at 10:20 AM, Alvaro Herrera
 wrote:
> Let's keep this entry open till the last minute.

Ugh, I don't like keeping things open till the last minute all that
much, especially if they're not being updated.  But since this has
been updated I guess it's somewhat moot.

Are you going to pick this up RSN?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: parallel append vs. simple UNION ALL

2018-03-22 Thread Robert Haas
On Mon, Mar 19, 2018 at 11:57 AM, Robert Haas  wrote:
> On Fri, Mar 16, 2018 at 7:35 AM, Rajkumar Raghuwanshi
>  wrote:
>> On Fri, Mar 9, 2018 at 1:04 AM, Robert Haas  wrote:
>>> Great.  Committed 0001.  Are you planning any further testing of this
>>> patch series?
>>
>> Sorry I missed the mail.
>> Yes, I have further tested patches and find no more issues.
>
> OK, thanks to both you and Ashutosh Bapat.  Committed 0002 and 0003.

And now committed 0004.  This area could use significantly more work,
but I think it's better to have this much than not.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2018-03-22 Thread Tomas Vondra
On 03/22/2018 08:51 PM, Tom Lane wrote:
> I wrote:
>> Tomas Vondra  writes:
>>> The 0002 part is the main part, unifying the definition of reltuples on
>>> three main places:
> 
>> On to this part ...
> 
> I've pushed 0002 with several corrections: it did not seem to me that
> you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
> pgstattuple.  Also, since one of the things VACUUM does with num_tuples
> is to report it as "the total number of non-removable tuples", I thought
> we'd better leave that calculation alone.  I made the added code count
> live tuples in a new variable live_tuples, instead.
> 

OK, makes sense. Thanks.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2018-03-22 Thread Tom Lane
I wrote:
> Tomas Vondra  writes:
>> The 0002 part is the main part, unifying the definition of reltuples on
>> three main places:

> On to this part ...

I've pushed 0002 with several corrections: it did not seem to me that
you'd correctly propagated what ANALYZE is doing into CREATE INDEX or
pgstattuple.  Also, since one of the things VACUUM does with num_tuples
is to report it as "the total number of non-removable tuples", I thought
we'd better leave that calculation alone.  I made the added code count
live tuples in a new variable live_tuples, instead.

regards, tom lane



Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 19:28 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-22 18:38 GMT+01:00 Fabien COELHO :
>
>>
>> Hello Pavel,
>>
>>> Using \pset format csv means overwriting field sep every time - nobody
>>> uses |
>>>
>>
>> Yep. The alternative is to have a csv-specific separator variable, which
>> does not seem very useful, must be remembered, but this is indeed debatable.
>>
>> I think the dependency on the order of psql arguments is a significant problem
>>>
>>
>> This is intentional, and this issue/feature already exists, the last
>> argument overwrite previous settings thus will win, eg:
>>
>>   psql --pset=format=troff --html -c 'SELECT 1'
>>
>> Will output in html, not in troff.
>>
>
> Can we introduce some format-specific default separators, if we do not want
> to introduce a csv_field_sep option?
>
> It should not be hard. All formats can have '|' as now, and csv can have
> ','; then if the field separator is not given explicitly, the default field
> separator for the format is used, otherwise the specified field separator is used.
>
> You can see my idea in attached patch
>
> Regards
>
> Pavel
>
> postgres=# \pset format csv
> Output format is csv.
> postgres=# select * from foo;
> a,b,c
> 1,2,Hello
> 3,4,Nazdar
> postgres=# \pset fieldsep ;
> Field separator is ";".
> postgres=# select * from foo;
> a;b;c
> 1;2;Hello
> 3;4;Nazdar
>
>
>
The default fieldsep should be "off", which means the format defaults are
used: ',' for CSV, '|' for everything else.

So everything will work as it does now, but there will be more freedom when
designing new formats. Currently, all formats that allow setting fieldsep
have to share the default '|', which I think is not practical.





>
>
>
>
>>
>> --
>> Fabien.
>>
>
>


Re: [HACKERS] Add support for tuple routing to foreign partitions

2018-03-22 Thread Robert Haas
On Tue, Feb 27, 2018 at 7:01 AM, Etsuro Fujita
 wrote:
> Changes other than that are:
>
> * Fixed typo and revised code/comments
> * Added more regression tests
> * Added docs
>
> Attached is a new version of the patch set.

I took a look at this patch set today but I really don't think we
should commit something so minimal.  It's got at least four issues
that I see:

1. It still doesn't work for COPY, because COPY isn't going to have a
ModifyTableState.  I really think it ought to be possible to come up
with an API that can handle both INSERT and COPY; I don't think it
should be necessary to have two different APIs for those two problems.
Amit managed to do it for regular tables, and I don't really see a
good reason why foreign tables need to be different.

2. I previously asked why we couldn't use the existing APIs for this,
and you gave me some answer, but I can't help noticing that
postgresExecForeignRouting is an exact copy of
postgresExecForeignInsert.  Apparently, some code reuse is indeed
possible here!  Why not reuse the same function instead of calling a
new one?  If the answer is that planSlot might be NULL in this case,
or something like that, then let's just document that.  A needless
proliferation of FDW APIs is really undesirable, as it raises the bar
for every FDW author.

3. It looks like we're just doing an INSERT for every row, which is
pretty much an anti-pattern for inserting data into a PostgreSQL
database.  COPY is way faster, and even multi-row inserts are
significantly faster.

4. It doesn't do anything about the UPDATE tuple routing problem
mentioned upthread.

I don't necessarily think that the first patch in this area has to
solve all of those problems, and #4 in particular seems like it might
be a pretty big can of worms.  However, I think that getting INSERT
but not COPY, with bad performance, and with duplicated APIs, is
moving more in the wrong direction than the right one.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Error detail/hint style fixup

2018-03-22 Thread Daniel Gustafsson
> On 22 Mar 2018, at 17:19, Tom Lane  wrote:

> In short, I'm not sure it's worth thrashing a lot of our translatable
> strings to enforce a debatable style detail.

I thought I had voiced that exact concern in my previous email, but I totally
missed that.  

Being a two-space guy I would like the style to remain that, but I also agree
that the churn is way too expensive and that it’s considered quite obsolete and
old by many.  Cutting that change from the patch makes the remainder more
palatable.

cheers ./daniel



errdetail_hint_style_v3.patch
Description: Binary data


Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
2018-03-22 18:38 GMT+01:00 Fabien COELHO :

>
> Hello Pavel,
>
>> Using \pset format csv means overwriting field sep every time - nobody uses
>> |
>>
>
> Yep. The alternative is to have a csv-specific separator variable, which
> does not seem very useful, must be remembered, but this is indeed debatable.
>
>> I think the dependency on the order of psql arguments is a significant problem
>>
>
> This is intentional, and this issue/feature already exists, the last
> argument overwrite previous settings thus will win, eg:
>
>   psql --pset=format=troff --html -c 'SELECT 1'
>
> Will output in html, not in troff.
>

Can we introduce some format-specific default separators, if we do not want
to introduce a csv_field_sep option?

It should not be hard. All formats can have '|' as now, and csv can have
','; then if the field separator is not given explicitly, the default field
separator for the format is used, otherwise the specified field separator is used.

You can see my idea in attached patch

Regards

Pavel

postgres=# \pset format csv
Output format is csv.
postgres=# select * from foo;
a,b,c
1,2,Hello
3,4,Nazdar
postgres=# \pset fieldsep ;
Field separator is ";".
postgres=# select * from foo;
a;b;c
1;2;Hello
3;4;Nazdar






>
> --
> Fabien.
>
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index bfdf859731..4d3e3b59f3 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -151,6 +151,16 @@ EOF
   
 
 
+
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
 
   -d dbname
   --dbname=dbname
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2546,8 +2556,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2584,9 +2594,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2601,6 +2615,15 @@ lo_import 152801
   format).
   
 
+  csv format writes columns separated
+  by fieldsep, applying the CSV quoting rules
+  described in RFC-4180 and compatible with the CSV format
+  of the COPY command.
+  The header with column names is output unless the
+  tuples_only parameter is on.
+  Title and footers are not printed.
+  
+
   aligned format is the standard, human-readable,
   nicely formatted text output;  this is the default.
   
@@ -2747,8 +2770,8 @@ lo_import 152801
   recordsep
   
   
-  Specifies the record (line) separator to use in unaligned
-  output format. The default is a newline character.
+  Specifies the record (line) separator to use in unaligned or
+  csv output formats. The default is a newline character.
   
   
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318749..1cd8a3856e 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -1222,6 +1222,10 @@ exec_command_f(PsqlScanState 

Re: [FEATURE PATCH] pg_stat_statements with plans (v02)

2018-03-22 Thread legrand legrand
Hello,
I'm very interested in pg_stat_statements usage, and I'm very happy to see
you adding plans to it.

Reading other pg_stat_statements threads on this forum, there are also active
developments to add:
- planing duration,
- first date,
- last_update date,
- parameters for normalized queries,
- ...
as described in
http://www.postgresql-archive.org/pg-stat-statements-HLD-for-futur-developments-td6012381.html

I was wondering how your development would behave with all those new features.
It seems to me that bad and good plans will not have any of that
information.

What would you think about displaying good, current, and bad plan results in 3
lines instead of only one line?

queryid, plantype, query, ...
aaa good  ...
aaa current   ...
aaa bad   ...

This would make it possible to get planning duration, first capture time, last
capture, executions, and query parameters for all plans (good and bad included).
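
Purely as an illustration of that layout, a query against such a
three-rows-per-query structure could look like the sketch below; the view and
column names are invented here for illustration, they come from no existing patch:

SELECT queryid, plantype, plan_time, first_capture, last_capture, calls
FROM pg_stat_statements_plans   -- hypothetical view name
WHERE queryid = 2691537454541915536
ORDER BY plantype;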

Last question: did you think about a model that stores all the different
plans using a planid, like

queryid, planid, query, ...
aaa plan1  ...
aaa plan2   ...
aaa plan3  ...
...

I can not imagine that there would be so many of them ;o)
Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: TOAST table created for partitioned tables

2018-03-22 Thread Robert Haas
On Wed, Jan 17, 2018 at 1:03 AM, Amit Langote  wrote:
> On Wed, Jan 17, 2018 at 1:51 PM, Michael Paquier
>  wrote:
>> On Tue, Jan 16, 2018 at 11:38:58PM -0500, Tom Lane wrote:
>>> Yeah, pg_upgrade already has to cope with cases where the newer version
>>> thinks a table needs a toast table when the older version didn't, or
>>> vice versa.  This looks like it ought to fall into that category.
>>> Not that testing it wouldn't be a good idea.
>>
>> As far as I can see this statement is true. If you create a parent
>> partition table in a v10 cluster, and then upgrade to HEAD with this
>> patch applied, you'll be able to notice that the relation still has its
>> toast table present, while newly-created parent partitions would have
>> nothing. (Just tested, I didn't review the patch in details).
>
> Thanks for checking.  I too checked that pg_upgrading v10 cluster
> containing partitioned tables that have a TOAST table attached to it
> works normally and like Michael says, the TOAST table remains.

I have committed your patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Re: csv format for psql

2018-03-22 Thread Fabien COELHO


Hello Pavel,


Using \pset format csv means overwriting field sep every time - nobody uses
|


Yep. The alternative is to have a csv-specific separator variable, which 
does not seem very useful, must be remembered, but this is indeed 
debatable.



I think so dependency on order of psql arguments is significant problem


This is intentional, and this issue/feature already exists: the last 
argument overwrites previous settings and thus wins, eg:


  psql --pset=format=troff --html -c 'SELECT 1'

Will output in html, not in troff.

--
Fabien.



Re: Typo with pg_multixact/offset in multixact.c

2018-03-22 Thread Robert Haas
On Mon, Feb 5, 2018 at 2:14 AM, Michael Paquier
 wrote:
> While hacking some stuff, I bumped into the following:
> --- a/src/backend/access/transam/multixact.c
> +++ b/src/backend/access/transam/multixact.c
> @@ -1932,7 +1932,7 @@ ZeroMultiXactMemberPage(int pageno, bool writeXlog)
>   * MaybeExtendOffsetSlru
>   * Extend the offsets SLRU area, if necessary
>   *
> - * After a binary upgrade from <= 9.2, the pg_multixact/offset SLRU area 
> might
> + * After a binary upgrade from <= 9.2, the pg_multixact/offsets SLRU area 
> might
>   * contain files that are shorter than necessary; this would occur if the old
>   * installation had used multixacts beyond the first page (files cannot be
>   * copied, because the on-disk representation is different).  pg_upgrade 
> would

Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Changing default value of wal_sync_method to open_datasync on Linux

2018-03-22 Thread Robert Haas
On Mon, Feb 19, 2018 at 7:27 PM, Tsunakawa, Takayuki
 wrote:
> The reason for change is better performance.  Robert Haas said open_datasync 
> was much faster than fdatasync with NVRAM in this thread:

I also said it would be worse on spinning disks.

Also, Yoshimi Ichiyanagi did not find it to be true even on NVRAM.

Changing the default requires a lot more than one test result where a
non-default setting is better.
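
For anyone who wants to collect such results on their own hardware, the
setting can be flipped and benchmarked without a restart (illustrative only;
the numbers, not the mechanism, are what is in question here):

SHOW wal_sync_method;                          -- fdatasync is the built-in default on Linux
ALTER SYSTEM SET wal_sync_method = 'open_datasync';
SELECT pg_reload_conf();                       -- then compare, e.g. with pg_test_fsync / pgbench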

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] VACUUM and ANALYZE disagreeing on what reltuples means

2018-03-22 Thread Tom Lane
Tomas Vondra  writes:
> 0001 fixes this by tracking the number of actually indexed rows in the
> build states, just like in the other index AMs.
> A VACUUM or ANALYZE will fix the estimate, of course, but for tables
> that are not changing very much it may take quite a while. So I think
> this is something we definitely need to back-patch.

Agreed, and done.  I noticed a second bug in contrib/bloom, too: it'd
forget to write out the last index page if that contained only one
tuple, because the new-page code path in bloomBuildCallback failed to
increment the "count" field after clearing it.

> The 0002 part is the main part, unifying the definition of reltuples on
> three main places:

On to this part ...

regards, tom lane



Re: Updating parallel.sgml's treatment of parallel joins

2018-03-22 Thread Robert Haas
On Fri, Feb 23, 2018 at 10:30 PM, Thomas Munro
 wrote:
> Here is an attempt at updating parallel.sgml to cover Parallel Hash.
> I will be neither surprised nor offended if Robert would like to put
> it differently!

Looks nice.  Committed.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: pgstat_report_activity() and parallel CREATE INDEX (was: Parallel index creation & pg_stat_activity)

2018-03-22 Thread Robert Haas
On Wed, Mar 7, 2018 at 8:53 PM, Peter Geoghegan  wrote:
> On Thu, Mar 1, 2018 at 2:47 PM, Peter Geoghegan  wrote:
>> No. Just an oversight. Looks like _bt_parallel_build_main() should
>> call pgstat_report_activity(), just like ParallelQueryMain().
>>
>> I'll come up with a patch soon.
>
> Attached patch has parallel CREATE INDEX propagate debug_query_string
> to workers. Workers go on to use this string as their own
> debug_query_string, as well as registering it using
> pgstat_report_activity(). Parallel CREATE INDEX pg_stat_activity
> entries will have a query text, too, which addresses Phil's complaint.

Committed.  Thanks for the patch.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-22 Thread Robert Haas
On Thu, Mar 22, 2018 at 6:15 AM, Jeevan Chalke
 wrote:
> Looks cleaner now. Thanks for refactoring it.
>
> I have merged these changes and flattened all previous changes into the main
> patch.

Committed 0001-0005.  I made a few further modifications.  These were
mostly cosmetic, but with two exceptions:

1. I moved one set_cheapest() call to avoid doing that twice for the
top-level grouped_rel.

2. I removed the logic to set partition properties for grouped_rels.
As far as I can see, there's nothing that needs this.  It would be
important if we wanted subsequent planning stages to be able to do
partition-wise stuff, e.g. when doing window functions or setops, or
at higher query levels.  Maybe we'll have that someday; until then, I
think this is just a waste of cycles.
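
For the archives, a minimal sketch of what the committed feature enables
(assuming the enable_partitionwise_aggregate GUC added by the patch set; table
and column names below are placeholders):

SET enable_partitionwise_aggregate = on;
CREATE TABLE measurements (k int, v int) PARTITION BY RANGE (k);
CREATE TABLE measurements_1 PARTITION OF measurements FOR VALUES FROM (0) TO (100);
CREATE TABLE measurements_2 PARTITION OF measurements FOR VALUES FROM (100) TO (200);
-- grouping by the partition key can now be performed per partition,
-- below the Append, instead of over the appended result
EXPLAIN SELECT k, count(*) FROM measurements GROUP BY k;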

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: [HACKERS] Surjective functional indexes

2018-03-22 Thread Simon Riggs
Please excuse my absence from this thread.

On 2 March 2018 at 12:34, Konstantin Knizhnik  wrote:
>
>
> On 01.03.2018 22:48, Andres Freund wrote:
>>
>> Hi,
>>
>> I still don't think, as commented upon by Tom and me upthread, that we
>> want this feature in the current form.

The performance benefit of this patch is compelling. I don't see a
comment from Tom along those lines, just a question as to whether the
code is in the right place.

>> Your arguments about layering violations seem to be counteracted by the
>> fact that ProjectionIsNotChanged() basically appears to do purely
>> executor work inside inside heapam.c.
>
>
> ProjectionIsNotChanged doesn't perform evaluation itself, it calls
> FormIndexDatum.
> FormIndexDatum is also called in 13 other places in Postgres:
> analyze.c, constraint.c, index.c, tuplesort, execIndexing.c
>
> I still think that trying to store the result of index expression evaluation
> somewhere so that it can be reused in the index update is not such a good idea:
> it complicates the code and adds extra dependencies between different modules.
> And the benefits of such an optimization are not obvious: if index expression
> evaluation is so expensive, then recheck_on_update should be prohibited for
> such an index at all.

Agreed. What we are doing is trying to avoid HOT updates.

We make the check for HOT update dynamically when it could be made
statically in some cases. Simplicity says just test it dynamically.

We could also do work to check for HOT update for functional indexes
earlier but for the same reason, I don't think it's worth adding.

The patch itself is about as simple as it can be, so the code is in
the right place. My problems with it have been around the actual user
interface to request it.
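
For readers following along, the interface in question is an index reloption;
roughly like the sketch below (assuming the recheck_on_update spelling
mentioned upthread, with placeholder table and column names):

-- expression index whose value rarely changes when a row is updated
CREATE INDEX people_lower_name_idx ON people (lower(name))
  WITH (recheck_on_update = true);
-- an UPDATE that leaves lower(name) unchanged can then still be done as a HOT update
UPDATE people SET last_login = now() WHERE id = 42;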

Index option handling has changed (and this needs rebase!), but other
than that I think we want this and am planning to commit something
early next week.

-- 
Simon Riggshttp://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: FOR EACH ROW triggers on partitioned tables

2018-03-22 Thread Alvaro Herrera
Peter Eisentraut wrote:
> On 3/21/18 19:18, Alvaro Herrera wrote:
> > Here's v8, which addresses all your comments except the doc updates.  I
> > added a few more tests, too.  Thanks for the review!  I intend to commit
> > this shortly, probably not before Friday to give some more time for
> > others to review/comment.
> 
> Looks good, does what it needs to do.
> 
> A small fixup attached.  In particular, I renamed one trigger from "f",
> which created confusing output, looking like a boolean column.

Thanks!

> This comment in the tests I don't understand:
> 
> -- verify that the FOR UPDATE OF (columns) is propagated correctly
> 
> I don't see how this applies to the tests that follow.  Does this have
> something to do with the subsequent foreign keys patch perhaps?

Not at all ... I meant "AFTER UPDATE OF columns" (used as a firing
event).  Not sure how I typo'ed it that badly.
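
For clarity, the firing event referred to looks like this on a partitioned
table (a sketch using the trigger_nothing() helper from the regression tests
and placeholder table names):

create table parted_upd (a int, b int) partition by range (a);
create table parted_upd1 partition of parted_upd for values from (0) to (1000);
create trigger parted_upd_trig after update of b on parted_upd
  for each row execute procedure trigger_nothing();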

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Using base backup exclusion filters to reduce data transferred with pg_rewind

2018-03-22 Thread Fujii Masao
On Tue, Mar 13, 2018 at 9:48 PM, Michael Paquier  wrote:
> On Tue, Mar 13, 2018 at 01:16:27PM +0300, Anastasia Lubennikova wrote:
>> Since these patches contain mostly cosmetic changes and do not break
>> anything, I think it's fine to mark this thread as Ready For Committer
>> without long discussion.
>
> Thanks Anastasia for the review.  The refactoring is quite intuitive I
> think, still that's perhaps a bit too much intrusive.  Extra opinions
> about that are welcome.

Personally, it looks very intrusive to me, so I'm inclined to push
the changes without that refactoring.

The 0005 patch doesn't seem to be right. The patch changes process_source_file()
so that it excludes some directories like pg_notify from the filemap. However,
after that, process_source_file() is called for the files under those "excluded"
directories and then they are not excluded. For example, a pg_notify/ file is
unexpectedly included in the filemap and copied by pg_rewind.

This problem happens because recurse_dir() has the following code and
ISTM you forgot to take it into account.

else if (S_ISDIR(fst.st_mode))
{
callback(path, FILE_TYPE_DIRECTORY, 0, NULL);
/* recurse to handle subdirectories */
recurse_dir(datadir, path, callback);
}

Regards,

-- 
Fujii Masao



Re: FOR EACH ROW triggers on partitioned tables

2018-03-22 Thread Peter Eisentraut
On 3/21/18 19:18, Alvaro Herrera wrote:
> Here's v8, which addresses all your comments except the doc updates.  I
> added a few more tests, too.  Thanks for the review!  I intend to commit
> this shortly, probably not before Friday to give some more time for
> others to review/comment.

Looks good, does what it needs to do.

A small fixup attached.  In particular, I renamed one trigger from "f",
which created confusing output, looking like a boolean column.

This comment in the tests I don't understand:

-- verify that the FOR UPDATE OF (columns) is propagated correctly

I don't see how this applies to the tests that follow.  Does this have
something to do with the subsequent foreign keys patch perhaps?

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From 37c41e1be7fbc1a02c7d543a471c84aee7b75a9f Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Thu, 22 Mar 2018 12:07:54 -0400
Subject: [PATCH] fixup! Allow FOR EACH ROW triggers on partitioned tables

---
 src/include/catalog/pg_constraint.h|  2 +-
 src/include/commands/trigger.h |  2 +-
 src/test/regress/expected/triggers.out | 36 +-
 src/test/regress/sql/triggers.sql  | 10 +-
 4 files changed, 25 insertions(+), 25 deletions(-)

diff --git a/src/include/catalog/pg_constraint.h 
b/src/include/catalog/pg_constraint.h
index 45b26cdfa8..3957e07235 100644
--- a/src/include/catalog/pg_constraint.h
+++ b/src/include/catalog/pg_constraint.h
@@ -73,7 +73,7 @@ CATALOG(pg_constraint,2606)
Oid conindid;   /* index supporting 
this constraint */
 
/*
-* if this constraint is on a partition inherited from a partitioned 
table,
+* If this constraint is on a partition inherited from a partitioned 
table,
 * this is the OID of the corresponding constraint in the parent.
 */
Oid conparentid;
diff --git a/src/include/commands/trigger.h b/src/include/commands/trigger.h
index 2a6f2cd934..a5b8610fa2 100644
--- a/src/include/commands/trigger.h
+++ b/src/include/commands/trigger.h
@@ -159,7 +159,7 @@ extern PGDLLIMPORT int SessionReplicationRole;
 
 extern ObjectAddress CreateTrigger(CreateTrigStmt *stmt, const char 
*queryString,
  Oid relOid, Oid refRelOid, Oid constraintOid, Oid 
indexOid,
- Oid funcid, Oid parentTriggerOid, Node *whenClause,
+ Oid funcoid, Oid parentTriggerOid, Node *whenClause,
  bool isInternal, bool in_partition);
 
 extern void RemoveTriggerById(Oid trigOid);
diff --git a/src/test/regress/expected/triggers.out 
b/src/test/regress/expected/triggers.out
index 2cb56efdf9..a4e9ea03f3 100644
--- a/src/test/regress/expected/triggers.out
+++ b/src/test/regress/expected/triggers.out
@@ -1871,7 +1871,7 @@ drop table parted_trig;
 --
 create table trigpart (a int, b int) partition by range (a);
 create table trigpart1 partition of trigpart for values from (0) to (1000);
-create trigger f after insert on trigpart for each row execute procedure 
trigger_nothing();
+create trigger trg1 after insert on trigpart for each row execute procedure 
trigger_nothing();
 create table trigpart2 partition of trigpart for values from (1000) to (2000);
 create table trigpart3 (like trigpart);
 alter table trigpart attach partition trigpart3 for values from (2000) to 
(3000);
@@ -1879,32 +1879,32 @@ select tgrelid::regclass, tgname, tgfoid::regproc from 
pg_trigger
   where tgrelid::regclass::text like 'trigpart%' order by 
tgrelid::regclass::text;
   tgrelid  | tgname | tgfoid  
 ---++-
- trigpart  | f  | trigger_nothing
- trigpart1 | f  | trigger_nothing
- trigpart2 | f  | trigger_nothing
- trigpart3 | f  | trigger_nothing
+ trigpart  | trg1   | trigger_nothing
+ trigpart1 | trg1   | trigger_nothing
+ trigpart2 | trg1   | trigger_nothing
+ trigpart3 | trg1   | trigger_nothing
 (4 rows)
 
-drop trigger f on trigpart1;   -- fail
-ERROR:  cannot drop trigger f on table trigpart1 because trigger f on table 
trigpart requires it
-HINT:  You can drop trigger f on table trigpart instead.
-drop trigger f on trigpart2;   -- fail
-ERROR:  cannot drop trigger f on table trigpart2 because trigger f on table 
trigpart requires it
-HINT:  You can drop trigger f on table trigpart instead.
-drop trigger f on trigpart3;   -- fail
-ERROR:  cannot drop trigger f on table trigpart3 because trigger f on table 
trigpart requires it
-HINT:  You can drop trigger f on table trigpart instead.
+drop trigger trg1 on trigpart1;-- fail
+ERROR:  cannot drop trigger trg1 on table trigpart1 because trigger trg1 on 
table trigpart requires it
+HINT:  You can drop trigger trg1 on table trigpart instead.
+drop trigger trg1 on trigpart2;-- fail
+ERROR:  cannot drop trigger trg1 on 

Re: [PATCH] Verify Checksums during Basebackups

2018-03-22 Thread David Steele
Hi Michael,

On 3/17/18 5:34 PM, Michael Banck wrote:
> 
> On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote:
> 
> I think most people (including those I had off-list discussions about
> this with) were of the opinion that such an option should be there, so I
> added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP
> replication command and also an option -k / --no-verify-checksums to
> pg_basebackup to trigger this.
> 
> Updated patch attached.

+memcpy(page, (buf + BLCKSZ * i), BLCKSZ);

Why make a copy here?  How about:

char *page = buf + BLCKSZ * i

I know pg_checksum_page manipulates the checksum field but I have found
it to be safe.

+if (phdr->pd_checksum != checksum)

I've attached a patch that adds basic retry functionality.  It's not
terribly efficient since it rereads the entire buffer for any block
error.  A better way is to keep a bitmap for each block in the buffer,
then on retry compare bitmaps.  If the block is still bad, report it.
If the block was corrected, move on.  If a block was good before but is
bad on retry it can be ignored.

+ereport(WARNING,
+(errmsg("checksum verification failed in file "

I'm worried about how verbose this warning could be since there are
131,072 blocks per segment.  It's unlikely to have that many block
errors, but users do sometimes put files in PGDATA which look like they
should be validated.  Since these warnings all go to the server log it
could get pretty bad.

We should either stop warning after the first failure, or aggregate the
failures for a file into a single message.

Some tests with multi-block errors should be added to test these scenarios.

Thanks,
-- 
-David
da...@pgmasters.net
diff --git a/src/backend/replication/basebackup.c 
b/src/backend/replication/basebackup.c
index 9f735a2c07..48bacfe5dd 100644
--- a/src/backend/replication/basebackup.c
+++ b/src/backend/replication/basebackup.c
@@ -1258,6 +1258,7 @@ sendFile(const char *readfilename, const char 
*tarfilename, struct stat *statbuf
int i;
pgoff_t len = 0;
charpage[BLCKSZ];
+   int block_error = -1;
size_t  pad;
PageHeader  phdr;
int segmentno = 0;
@@ -1328,9 +1329,16 @@ sendFile(const char *readfilename, const char 
*tarfilename, struct stat *statbuf
verify_checksum = false;
continue;
}
-   for (i = 0; i < cnt / BLCKSZ; i++)
+
+   /*
+* Validate all blocks in the buffer.  If checking 
after the buffer
+* was reread then start at the block that caused the 
reread.
+ */
+   for (i = block_error == -1 ? 0 : block_error; i < cnt / 
BLCKSZ; i++)
{
+   blkno = len / BLCKSZ + i;
memcpy(page, (buf + BLCKSZ * i), BLCKSZ);
+
/*
 * Only check pages which have not been 
modified since the
 * start of the base backup.
@@ -1341,6 +1349,18 @@ sendFile(const char *readfilename, const char 
*tarfilename, struct stat *statbuf
phdr = (PageHeader) page;
if (phdr->pd_checksum != checksum)
{
+   /*
+* If first time encountering 
an error on this block
+* then read the last buffer 
again to see if the
+* checksum is corrected.
+*/
+   if (block_error == -1)
+   {
+   block_error = i;
+   fseek(fp, -cnt, 
SEEK_CUR);
+   break;
+   }
+
ereport(WARNING,

(errmsg("checksum verification failed in file "

"\"%s\", block %d: calculated %X but "
@@ -1350,8 +1370,12 @@ sendFile(const char *readfilename, const char 
*tarfilename, struct stat *statbuf
checksum_failure = true;
}
}
-   blkno++;
+   block_error = -1;
}
+
+   /* 

Re: Error detail/hint style fixup

2018-03-22 Thread Tom Lane
Daniel Gustafsson  writes:
> [ errdetail_hint_style_v2.patch ]

I started going through this in more detail, and I see that a significant
chunk of the changes are to put two spaces not one between sentences in
errdetail/errhint messages.  This is per our style guideline:

Detail and hint messages: Use complete sentences, and end each with
a period.  Capitalize the first word of sentences.  Put two spaces after
the period if another sentence follows (for English text; might be
inappropriate in other languages).

but I wonder if maybe the right answer is to drop the last sentence of
that style guideline.  It seems a bit at odds with the general principle
enunciated under Formatting: "Don't put any specific assumptions about
formatting into the message texts".  There are those who consider it
obsolete style, too, eg

http://www.thedailymash.co.uk/news/society/last-human-to-use-two-spaces-after-a-full-stop-dies-20180312145785

Personally I do type two spaces, which is a habit I learned while using
TeX (where it made a difference), but that was a long time ago.

A quick grep through one of the backend .po files suggests that about
two-thirds of our messages where the issue arises have one space rather
than two (I counted about 35 instances of one space, 15 of two).  If
we did want to standardize this fully, we'd have to hit more places
than you did here.

In short, I'm not sure it's worth thrashing a lot of our translatable
strings to enforce a debatable style detail.

Thoughts?

regards, tom lane



Re: Prefix operator for text and spgist support

2018-03-22 Thread Teodor Sigaev

Hi!

Patch looks reasonable, but I see some room for improvement:
spg_text_leaf_consistent() only needs to check with text_startswith() if the
reconstructed value that reaches the leaf consistent function is shorter than the
given prefix. For example, if level >= length of prefix then we can guarantee that
the fully reconstructed value matches too. But do not forget that you may need to
return the value for an index-only scan; consult the returnData field.
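
For context, the user-visible behavior under review is simply prefix search,
e.g. (a sketch assuming the patch's ^@ operator, text_startswith() function
and its SP-GiST support):

create table words (w text);
create index words_spgist_idx on words using spgist (w);
select * from words where w ^@ 'alph';                 -- prefix operator
select * from words where text_startswith(w, 'alph');  -- equivalent function form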


Attached is a rebased and slightly edited version of your patch.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/
diff --git a/doc/src/sgml/func.sgml b/doc/src/sgml/func.sgml
index 2f59af25a6..4dc11d8df2 100644
--- a/doc/src/sgml/func.sgml
+++ b/doc/src/sgml/func.sgml
@@ -2274,6 +2274,21 @@
ph
   
 
+  
+   
+
+ text_startswith
+
+text_startswith(string, prefix)
+   
+   bool
+   
+Returns true if string starts with prefix.
+   
+   text_startswith('alphabet', 'alph')
+   t
+  
+
   

 
@@ -4033,6 +4048,12 @@ cast(-44 as bit(12))   11010100
 ILIKE, respectively.  All of these operators are
 PostgreSQL-specific.

+
+   
+There is also the prefix operator ^@ and corresponding
+text_startswith function which covers cases when only
+searching by beginning of the string is needed.
+   
   
 
 
diff --git a/doc/src/sgml/spgist.sgml b/doc/src/sgml/spgist.sgml
index e47f70be89..06b7519052 100644
--- a/doc/src/sgml/spgist.sgml
+++ b/doc/src/sgml/spgist.sgml
@@ -161,6 +161,7 @@
~~
~=~
~~
+   ^@
   
  
  
diff --git a/src/backend/access/spgist/spgtextproc.c b/src/backend/access/spgist/spgtextproc.c
index f156b2166e..f92ff68a5f 100644
--- a/src/backend/access/spgist/spgtextproc.c
+++ b/src/backend/access/spgist/spgtextproc.c
@@ -496,7 +496,7 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 			 * well end with a partial multibyte character, so that applying
 			 * any encoding-sensitive test to it would be risky anyhow.)
 			 */
-			if (strategy > 10)
+			if (strategy > 10 && strategy != RTPrefixStrategyNumber)
 			{
 if (collate_is_c)
 	strategy -= 10;
@@ -526,6 +526,10 @@ spg_text_inner_consistent(PG_FUNCTION_ARGS)
 	if (r < 0)
 		res = false;
 	break;
+case RTPrefixStrategyNumber:
+	if (r != 0)
+		res = false;
+	break;
 default:
 	elog(ERROR, "unrecognized strategy number: %d",
 		 in->scankeys[j].sk_strategy);
@@ -601,10 +605,25 @@ spg_text_leaf_consistent(PG_FUNCTION_ARGS)
 	for (j = 0; j < in->nkeys; j++)
 	{
 		StrategyNumber strategy = in->scankeys[j].sk_strategy;
-		text	   *query = DatumGetTextPP(in->scankeys[j].sk_argument);
-		int			queryLen = VARSIZE_ANY_EXHDR(query);
+		text	   *query;
+		int			queryLen;
 		int			r;
 
+		/* for this strategy collation is not important */
+		if (strategy == RTPrefixStrategyNumber)
+		{
+			res = DatumGetBool(DirectFunctionCall2(text_startswith,
+out->leafValue, in->scankeys[j].sk_argument));
+
+			if (res) /* no need to consider remaining conditions */
+break;
+
+			continue;
+		}
+
+		query = DatumGetTextPP(in->scankeys[j].sk_argument);
+		queryLen = VARSIZE_ANY_EXHDR(query);
+
 		if (strategy > 10)
 		{
 			/* Collation-aware comparison */
diff --git a/src/backend/utils/adt/selfuncs.c b/src/backend/utils/adt/selfuncs.c
index bf240aa9c5..de3ace442a 100644
--- a/src/backend/utils/adt/selfuncs.c
+++ b/src/backend/utils/adt/selfuncs.c
@@ -1204,7 +1204,6 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	Oid			vartype;
 	Oid			opfamily;
 	Pattern_Prefix_Status pstatus;
-	Const	   *patt;
 	Const	   *prefix = NULL;
 	Selectivity rest_selec = 0;
 	double		nullfrac = 0.0;
@@ -1316,9 +1315,21 @@ patternsel(PG_FUNCTION_ARGS, Pattern_Type ptype, bool negate)
 	 * but because we want to be sure we cache compiled regexps under the
 	 * right cache key, so that they can be re-used at runtime.
 	 */
-	patt = (Const *) other;
-	pstatus = pattern_fixed_prefix(patt, ptype, collation,
-   , _selec);
+	if (ptype == Pattern_Type_Prefix)
+	{
+		char	*s = TextDatumGetCString(constval);
+		prefix = string_to_const(s, vartype);
+		pstatus = Pattern_Prefix_Partial;
+		rest_selec = 1.0;	/* all */
+		pfree(s);
+	}
+	else
+	{
+		Const *patt = (Const *) other;
+
+		pstatus = pattern_fixed_prefix(patt, ptype, collation,
+	   , _selec);
+	}
 
 	/*
 	 * If necessary, coerce the prefix constant to the right type.
@@ -1488,6 +1499,16 @@ likesel(PG_FUNCTION_ARGS)
 }
 
 /*
+ *		prefixsel			- selectivity of prefix operator
+ */
+Datum
+prefixsel(PG_FUNCTION_ARGS)
+{
+	PG_RETURN_FLOAT8(patternsel(fcinfo, Pattern_Type_Prefix, false));
+}
+
+/*
+ *
  *		iclikesel			- Selectivity of ILIKE pattern match.
  */
 Datum
@@ -2906,6 +2927,15 @@ likejoinsel(PG_FUNCTION_ARGS)
 	PG_RETURN_FLOAT8(patternjoinsel(fcinfo, Pattern_Type_Like, false));

Re: Re: csv format for psql

2018-03-22 Thread Pavel Stehule
Hi

2018-03-22 16:30 GMT+01:00 Daniel Verite :

> David Steele wrote:
>
> > Do you know when you'll have an updated patch that addresses the minor
> > issues brought up in review and the concern above?
>
> Here's an update addressing the issues discussed:
>
> - fieldsep and recordsep are used, no more fieldsep_csv
> - the command line option is --csv without short option and is equivalent
>  to -P format=csv -P fieldsep=','
> - pset output formats are reordered alphabetically on display
> - a couple more cases in the regression tests
>
>
I am sorry, I don't think this design is correct. It introduces a
dependency on argument order.

# correct output
[pavel@nemesis postgresql]$ psql --csv -c "table foo" -F ';'
a;b;c
1;2;Hello
3;4;Nazdar

# error -F is ignored
[pavel@nemesis postgresql]$ psql -F ';' --csv -c "table foo"
a,b,c
1,2,Hello
3,4,Nazdar

Using \pset format csv means overwriting the field separator every time -
nobody uses '|'.

I think the dependency on the order of psql arguments is a significant problem.

Regards

Pavel





>
> Best regards,
> --
> Daniel Vérité
> PostgreSQL-powered mailer: http://www.manitou-mail.org
> Twitter: @DanielVerite
>


Re: PL/pgSQL nested CALL with transactions

2018-03-22 Thread Tomas Vondra
Hi,

The patch looks good to me - certainly in the sense that I haven't found
any bugs during the review. I do have a couple of questions and comments
about possible improvements, though. Some of this may be a case of
bike-shedding or simply because I've not looked at SPI/plpgsql before.


1) plpgsql.sgml

The new paragraph talks about "intervening command" - I've been unsure
what that refers to, until I read the comment for ExecuteCallStmt(),
which explains this as CALL -> SELECT -> CALL. Perhaps we should add
that to the sgml docs too.
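
A minimal sketch of that nesting, with made-up names (the point being that the
intervening SELECT puts the inner procedure into an atomic context, so
transaction control there is expected to be rejected):

CREATE PROCEDURE inner_p() LANGUAGE plpgsql AS $$ BEGIN COMMIT; END $$;
CREATE FUNCTION f() RETURNS int LANGUAGE plpgsql
  AS $$ BEGIN CALL inner_p(); RETURN 1; END $$;
CREATE PROCEDURE outer_p() LANGUAGE plpgsql AS $$ BEGIN PERFORM f(); END $$;
CALL outer_p();   -- the COMMIT inside inner_p() should fail (invalid transaction termination)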


2) spi.c

I generally find it confusing when there are negative flags, which are
then negated whenever used. That is pretty much the "no_snapshots" case,
because it means "don't manage snapshots" and all references to the flag
look like this:

if (snapshot != InvalidSnapshot && !plan->no_snapshots)

Why not to have "positive" flag instead, e.g. "manage_snapshots"?

FWIW the comment in _SPI_execute_plan talking about "four distinct
snapshot management behaviors" should mention that with
no_snapshots=true none of those really matters.

I also wonder why _SPI_make_plan_non_temp/_SPI_save_plan change palloc
to palloc0. Is it just to initialize the no_snapshots flag explicitly?
It seems a bit wasteful to do a memset and then go and initialize all
the fields anyway, although I don't know how sensitive this code is.


3) utility.c

I find this condition rather confusing:

(!(context == PROCESS_UTILITY_TOPLEVEL ||
   context == PROCESS_UTILITY_QUERY_NONATOMIC) ||
   IsTransactionBlock())

I mean, negated || with another || - it took me a while to parse what
that means. I suggest doing this instead:

#define ProcessUtilityIsAtomic(context) \
   (!(context == PROCESS_UTILITY_TOPLEVEL || \
      context == PROCESS_UTILITY_QUERY_NONATOMIC))

(ProcessUtilityIsAtomic(context) || IsTransactionBlock())


4) spi_priv.h

Shouldn't the comment for _SPI_plan also mention what no_snapshots does?


5) utility.h

So now that we have  PROCESS_UTILITY_QUERY and
PROCESS_UTILITY_QUERY_NONATOMIC, does that mean PROCESS_UTILITY_QUERY is
always atomic?

6) pl_exec.h

The exec_prepare_plan in exec_stmt_perform was expanded into a local
copy of the code, but ISTM the new code just removes handling of some
error types when (plan==NULL), and doesn't call SPI_keepplan or
exec_simple_check_plan. Why not keep using exec_prepare_plan and add
a new parameter to skip those calls?


regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Re: csv format for psql

2018-03-22 Thread Daniel Verite
David Steele wrote:

> Do you know when you'll have an updated patch that addresses the minor
> issues brought up in review and the concern above?

Here's an update addressing the issues discussed:

- fieldsep and recordsep are used, no more fieldsep_csv
- the command line option is --csv without short option and is equivalent
 to -P format=csv -P fieldsep=','
- pset output formats are reordered alphabetically on display
- a couple more cases in the regression tests
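
To make the --csv equivalence concrete (using the same foo table as in the
psql examples elsewhere in this thread), these two invocations are meant to
produce identical output:

psql --csv -c "table foo"
psql -P format=csv -P fieldsep=',' -c "table foo"
a,b,c
1,2,Hello
3,4,Nazdar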


Best regards,
-- 
Daniel Vérité
PostgreSQL-powered mailer: http://www.manitou-mail.org
Twitter: @DanielVerite
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index 10b9795..c984a9c 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -152,6 +152,16 @@ EOF
 
 
 
+  --csv
+  
+  
+  Switches to csv output mode. This is equivalent to \pset format
+  csv followed by \pset fieldsep ','.
+  
+  
+
+
+
   -d dbname
   --dbname=dbname
   
@@ -246,7 +256,7 @@ EOF
   
   
   Use separator as the
-  field separator for unaligned output. This is equivalent to
+  field separator for unaligned and csv outputs. This is equivalent to
   \pset fieldsep or \f.
   
   
@@ -382,7 +392,7 @@ EOF
   
   
   Use separator as the
-  record separator for unaligned output. This is equivalent to
+  record separator for unaligned and csv outputs. This is equivalent to
   \pset recordsep.
   
   
@@ -558,7 +568,7 @@ EOF
   
   
   Set the field separator for unaligned output to a zero byte.  This is
-  equvalent to \pset fieldsep_zero.
+  equivalent to \pset fieldsep_zero.
   
   
 
@@ -1937,9 +1947,9 @@ Tue Oct 26 21:40:57 CEST 1999
 
 
 
-Sets the field separator for unaligned query output. The default
-is the vertical bar (|). It is equivalent to
-\pset fieldsep.
+Sets the field separator for unaligned and csv query outputs. The
+default is the vertical bar (|). It is equivalent
+to \pset fieldsep.
 
 
   
@@ -2546,8 +2556,8 @@ lo_import 152801
   fieldsep
   
   
-  Specifies the field separator to be used in unaligned output
-  format. That way one can create, for example, tab- or
+  Specifies the field separator to be used in unaligned and csv output
+  formats. That way one can create, for example, tab- or
   comma-separated output, which other programs might prefer. To
   set a tab as field separator, type \pset fieldsep
   '\t'. The default field separator is
@@ -2584,9 +2594,13 @@ lo_import 152801
   format
   
   
-  Sets the output format to one of unaligned,
-  aligned, wrapped,
-  html, asciidoc,
+  Sets the output format to one of
+  unaligned,
+  aligned,
+  csv,
+  wrapped,
+  html,
+  asciidoc,
   latex (uses tabular),
   latex-longtable, or
   troff-ms.
@@ -2601,6 +2615,15 @@ lo_import 152801
   format).
   
 
+  csv format writes columns separated
+  by fieldsep, applying the CSV quoting rules
+  described in RFC-4180 and compatible with the CSV format
+  of the COPY command.
+  The header with column names is output unless the
+  tuples_only parameter is on.
+  Title and footers are not printed.
+  
+
   aligned format is the standard, 
human-readable,
   nicely formatted text output;  this is the default.
   
@@ -2747,8 +2770,8 @@ lo_import 152801
   recordsep
   
   
-  Specifies the record (line) separator to use in unaligned
-  output format. The default is a newline character.
+  Specifies the record (line) separator to use in unaligned or
+  csv output formats. The default is a newline character.
   
   
   
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 3560318..1d8cc96 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3603,6 +3603,9 @@ _align2string(enum printFormat in)
case PRINT_TROFF_MS:
return "troff-ms";
break;
+   case PRINT_CSV:
+   return "csv";
+   break;
}
return "unknown";
 }
@@ -3658,25 +3661,27 @@ do_pset(const char *param, const char *value, 
printQueryOpt *popt, bool quiet)
{
if (!value)
;
-   else if (pg_strncasecmp("unaligned", value, vallen) == 0)
-   popt->topt.format = PRINT_UNALIGNED;
else if (pg_strncasecmp("aligned", value, vallen) == 0)

Re: WIP: Covering + unique indexes.

2018-03-22 Thread Alexander Korotkov
On Wed, Mar 21, 2018 at 9:51 PM, Alexander Korotkov <
a.korot...@postgrespro.ru> wrote:

> On Thu, Mar 8, 2018 at 7:13 PM, Anastasia Lubennikova <
> a.lubennik...@postgrespro.ru> wrote:
>
>> 06.03.2018 11:52, Thomas Munro:
>>
>>> On Wed, Jan 31, 2018 at 3:09 AM, Anastasia Lubennikova
>>>  wrote:
>>>
 Thank you for reviewing. All mentioned issues are fixed.

>>> == Applying patch 0002-covering-btree_v4.patch...
>>> 1 out of 1 hunk FAILED -- saving rejects to file
>>> src/backend/access/nbtree/README.rej
>>> 1 out of 1 hunk FAILED -- saving rejects to file
>>> src/backend/access/nbtree/nbtxlog.c.rej
>>>
>>> Can we please have a new patch set?
>>>
>>
>> Here it is.
>> Many thanks to Andrey Borodin.
>
>
> I took a look at this patchset.  I have some notes about it.
>
> * I see patch changes dblink, amcheck and tcl contribs.  It would be nice
> to add corresponding
> check to dblink and amcheck regression tests.  It would be good to do the
> same with tcn contrib.
> But tcn doesn't have regression tests at all.  And it's out of scope of
> this patch to add regression
> tests to tcn.  So, it's OK to just check that it's working correctly with
> covering indexes (I hope it's
> already done by other reviewers).
>
> * I think that subscription regression tests in
> src/test/subscription should have some use
> of covering indexes.  Logical decoding and subscription are heavily using
> primary keys.
> So they need to be tested to work correctly with covering indexes.
>
> * I also think some isolation tests in src/test/isolation need to check
> covering indexes too.
> In particular insert-conflict-*.spec and lock-*.spec and probably more.
>
> *  pg_dump doesn't handle old PostgreSQL versions correctly.  If I try to
> dump database
> of PostgreSQL 9.6, pg_dump gives me following error:
>
> pg_dump: [archiver (db)] query failed: ERROR:  column i.indnkeyatts does
> not exist
> LINE 1: ...atalog.pg_get_indexdef(i.indexrelid) AS indexdef, i.indnkeya...
>  ^
>
> In fact there is a sequence of "if" ... "else if" blocks in getIndexes()
> which selects
> appropriate query depending on remote server version.  And for pre-11 we
> should
> use indnatts instead of indnkeyatts.
>
> * There is minor formatting issue in this part of code.  Some spaces need
> to be replaced with tabs.
> +IndexTuple
> +index_truncate_tuple(Relation idxrel, IndexTuple olditup)
> +{
> + TupleDesc   itupdesc = CreateTupleDescCopyConstr(RelationGetDescr(idxrel));
> + Datum   values[INDEX_MAX_KEYS];
> + boolisnull[INDEX_MAX_KEYS];
> + IndexTuple newitup;
>
> * I think this comment needs to be rephrased.
> + /*
> +  * Code below is concerned to the opclasses which are not used
> +  * with the included columns.
> +  */
> I would write something like this: "Code below checks opclass key type.
> Included columns
> don't have opclasses, and this check is not required for them.".  Native
> english speakers
> could provide even better phrasing though.
>
> * I would also like all the patches in patchset version to have same
> version number.
> I understand that "Covering-btree" and "Covering-amcheck" have less
> previous
> versions than "Covering-core".  But it's way easier to identify patches
> belonging to
> the same patchset version if they have same version number.  For sure,
> then some
> patches would skip some version numbers, but that doesn't seem to be a
> problem for me.
>

I have few more notes regarding this patchset.

* indnkeyatts is described in the documentation, but I think that the
description of indnatts
should also be updated to clarify that indnatts counts "included" columns.

+  
+  indnkeyatts
+  int2
+  
+  The number of key columns in the index. "Key columns" are
ordinary
+  index columns (as opposed to "included" columns).
+ 

*  It seems like this paragraph appears in the patchset without being
mentioned
in the thread.

+Notes to Operator Class Implementors
+

While I really appreciate it, it seems to be unrelated to covering
indexes.
I'd like this to be extracted into a separate patch and committed
separately.

* There is a typo here: brtee -> btree
+ * 7. Check various AMs. All but brtee must fail.

* This comment should be updated given that we now put the left page's
high key into the WAL regardless of whether it's a leaf page split.

+ /*
+ * We must also log the left page's high key, because the right
+ * page's leftmost key is suppressed on non-leaf levels.  Show it
+ * as belonging to the left page buffer, so that it is not stored
+ * if XLogInsert decides it needs a full-page image of the left
+ * page.
+ */

* get_index_def() is adjusted to support covering indexes.  I think this
support
deserves to be checked in regression tests.

* In PostgreSQL sentences are sometimes divided by single spacing, sometimes
divided by double spacing.  I think we should follow 

Re: WIP: a way forward on bootstrap data

2018-03-22 Thread John Naylor
On 3/22/18, Tom Lane  wrote:
> I don't really think it's legal C; I'd rather write BKI_DEFAULT('\054').

Okay, I'll do that.

>> Were you thinking of this comment in
>> pg_attribute.h? We use the double-quoted empty string for postgres.bki
>> and change it to  '\0' for schemapg.h.
>
>> /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */
>> char attidentity BKI_DEFAULT("");
>
> That definitely seems like a hack --- why not BKI_DEFAULT('\0') ?

Hmm, yes, the way I had it, the comment is a mystery. I'll switch it around.

-John Naylor



Re: faster testing with symlink installs

2018-03-22 Thread Robert Haas
On Wed, Mar 21, 2018 at 11:43 PM, Michael Paquier  wrote:
> On Wed, Mar 07, 2018 at 05:06:59PM -0500, Robert Haas wrote:
>> TBH I find that Homebrew example pretty odd.  I would understand
>> installing each major release in a version directory, but putting
>> every point release in a different versioned directory seems like a
>> bad plan.
>
> That's a project policy on their side.  The version of all components is
> kept around this way to give users an easier way to link to or switch
> back to a previous version if a component update goes wrong.  There are
> advantages to such models as well for projects which care a lot about
> some precise compatibility that some upstream maintainer overlooked,
> while not having to patch an existing instance to filter out only a set
> of components.

Well, under that definition, the behavior Peter is complaining about
upthread is a feature, not a bug.  If you want a separate install of
PostgreSQL for every minor release, you should have a separate version
of each other piece of software you build against it.  Or so it seems
to me, anyway.

I suppose we could provide a build-time option to change this behavior.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: WIP: a way forward on bootstrap data

2018-03-22 Thread John Naylor
On 3/22/18, Alvaro Herrera  wrote:
> how about letting the line go long, with the comment at the right of
> each definition, with one blank line between struct members, as in the
> sample below?  You normally don't care that these lines are too long
> since you seldom edit them -- one mostly adds or remove entire lines
> instead, so there's not as much need for side-by-side diffs as with
> regular code.  (One issue with this proposal is how to convince pgindent
> to leave the long lines alone.)

Yeah, it seems when perltidy or pgindent mangle things badly, it's to
try and shoehorn a long line into a smaller number of characters.  If
memory serves, I've come across things like this:

pg_node_tree proargdefaults BKI_DEFAULT(_null_); /* list of expression

  trees for argument

  defaults (NULL if none) */

And thought "only a machine could be so precisely awkward"

-John Naylor



Re: pgbench - add \if support

2018-03-22 Thread Teodor Sigaev

Thank you, pushed

Fabien COELHO wrote:


Hello Teodor,

Patch seems useful and committable except for the comments in conditional.[ch]. I'd
like the top/header comment in each file to be more detailed and descriptive. As of
now it mentions only psql usage without explaining how it is basic or common.


Indeed, it was not updated.

I've fixed the file names and added a simple description at the beginning of the 
header file, and a one liner in the code file.


Do you think that more is needed?

The patch also needed a rebase after the hash function addition.



--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: WIP: a way forward on bootstrap data

2018-03-22 Thread Tom Lane
John Naylor  writes:
> On 3/22/18, Tom Lane  wrote:
>> While I'm thinking of it --- I noticed one or two places where you
>> had "BKI_DEFAULT(\0)".

> Hmm, I only see this octal in pg_type.h
> chartypdelim BKI_DEFAULT(\054);

Sorry, I was going by memory rather than looking at the code.

> Which I hope is fine.

I don't really think it's legal C; I'd rather write BKI_DEFAULT('\054').

> Were you thinking of this comment in
> pg_attribute.h? We use the double-quoted empty string for postgres.bki
> and change it to  '\0' for schemapg.h.

> /* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */
> char  attidentity BKI_DEFAULT("");

That definitely seems like a hack --- why not BKI_DEFAULT('\0') ?

regards, tom lane



Re: WIP: a way forward on bootstrap data

2018-03-22 Thread Alvaro Herrera
John Naylor wrote:

> I've attached an earlier version of pg_proc.h with both formats as I
> understand them. I turned a couple comments into multi-line comments
> to demonstrate. I think without spaces it's just as hard to read as
> with multiple annotations. I'd vote for spaces, but then again I'm not
> the one who has to read these things very often.

how about letting the line go long, with the comment at the right of
each definition, with one blank line between struct members, as in the
sample below?  You normally don't care that these lines are too long
since you seldom edit them -- one mostly adds or remove entire lines
instead, so there's not as much need for side-by-side diffs as with
regular code.  (One issue with this proposal is how to convince pgindent
to leave the long lines alone.)

To me, an important property of these structs is fitting as much as
possible (vertically) in a screenful; the other proposed modes end up
with too many lines.

CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO
{
NameDataproname;/* procedure name */

Oid pronamespace BKI_DEFAULT(PGNSP); /* OID of namespace containing 
this proc */

Oid proowner BKI_DEFAULT(PGUID); /* procedure owner */

Oid prolang BKI_DEFAULT(12); /* OID of pg_language entry */

float4  procost BKI_DEFAULT(1); /* estimated execution cost */

float4  prorows BKI_DEFAULT(0); /* estimated # of rows out (if 
proretset) */

Oid provariadic BKI_DEFAULT(0); /* element type of variadic array, 
or 0 */

regproc protransform BKI_DEFAULT(0); /* transforms calls to it during 
planning */

boolproisagg BKI_DEFAULT(f); /* is it an aggregate? */

boolproiswindow BKI_DEFAULT(f); /* is it a window function? */

boolprosecdef BKI_DEFAULT(f); /* security definer */

boolproleakproof BKI_DEFAULT(f); /* is it a leak-proof function? */

boolproisstrict BKI_DEFAULT(f); /* strict with respect to NULLs? */

boolproretset BKI_DEFAULT(f); /* returns a set? */

charprovolatile BKI_DEFAULT(v); /* see PROVOLATILE_ categories 
below */

charproparallel BKI_DEFAULT(u); /* see PROPARALLEL_ categories 
below */

int16   pronargs; /* number of arguments */

int16   pronargdefaults BKI_DEFAULT(0); /* number of arguments with 
defaults */

Oid prorettype; /* OID of result type */

/*
 * variable-length fields start here, but we allow direct access to
 * proargtypes
 */

oidvectorproargtypes; /* parameter types (excludes OUT params) */

#ifdef CATALOG_VARLEN

Oid proallargtypes[1] BKI_DEFAULT(_null_); /* all param types (NULL 
if IN only) */

charproargmodes[1] BKI_DEFAULT(_null_); /* parameter modes (NULL if 
IN only) */

textproargnames[1] BKI_DEFAULT(_null_); /* parameter names (NULL if 
no names) */

pg_node_tree proargdefaults BKI_DEFAULT(_null_); /* list of expression 
trees for argument defaults (NULL if none) */

Oid protrftypes[1] BKI_DEFAULT(_null_); /* types for which to apply 
transforms */

textprosrc BKI_FORCE_NOT_NULL; /* procedure source text */

textprobin BKI_DEFAULT(_null_); /* secondary procedure info (can be 
NULL) */

textproconfig[1] BKI_DEFAULT(_null_); /* procedure-local GUC 
settings */

aclitem proacl[1] BKI_DEFAULT(_null_); /* access permissions */
#endif
} FormData_pg_proc;

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] new function for tsquery creartion

2018-03-22 Thread Teodor Sigaev



I am extending the phrase operator in such a way that it will have
syntax that means from n to m words, so I will use such syntax ()
further. I found that a AROUND(N) b is exactly the same as a <-N,N> b
and it can be replaced while parsing. So, what do you think of such an
idea? In this patch I have noticed some unobvious behavior.


I think the new operator should be the subject of a separate patch. And I prefer the
idea of a range phrase operator.


--
Teodor Sigaev   E-mail: teo...@sigaev.ru
   WWW: http://www.sigaev.ru/



Re: WIP: a way forward on bootstrap data

2018-03-22 Thread John Naylor
On 3/22/18, Tom Lane  wrote:
> Looking at this again, I think a big chunk of the readability problem here
> is just from the fact that we have long, similar-looking lines tightly
> packed.  If it were reformatted to have comment lines and whitespace
> between, it might not look nearly as bad.
>
...
> Anyway, for the moment I'd stick with BKI_LOOKUP rather than undoing
> that work.  I think it's a more transparent way of saying what we
> want than the magic-OID-typedefs approach.  The formatting issue is
> just a mild annoyance, and it's not really BKI_LOOKUP's fault anyway.

Okay, I'll do it with comments and whitespace.

> While I'm thinking of it --- I noticed one or two places where you
> had "BKI_DEFAULT(\0)".  That coding scares me a bit --- gcc seems to
> tolerate it, but other C compilers might feel that \0 is not a valid
> preprocessing token, or it might confuse some editors' syntax highlight
> rules.  I'd rather write cases like this as "BKI_DEFAULT('\0')".  IOW,
> the argument should be a valid C identifier, number, or string literal.

Hmm, I only see this octal in pg_type.h

chartypdelim BKI_DEFAULT(\054);

Which I hope is fine. Were you thinking of this comment in
pg_attribute.h? We use the double-quoted empty string for postgres.bki
and change it to  '\0' for schemapg.h.

/* One of the ATTRIBUTE_IDENTITY_* constants below, or '\0' */
charattidentity BKI_DEFAULT("");


-John Naylor



Re: missing support of named convention for procedures

2018-03-22 Thread Pavel Stehule
Hi

2018-03-21 8:18 GMT+01:00 Pavel Stehule :

>
>
> 2018-03-20 17:31 GMT+01:00 Peter Eisentraut  com>:
>
>> On 3/16/18 06:29, Pavel Stehule wrote:
>> > attached patch fixes it
>>
>> The fix doesn't seem to work for LANGUAGE SQL procedures.  For example:
>>
>> CREATE PROCEDURE ptest5(a int, b int DEFAULT 0)
>> LANGUAGE SQL
>> AS $$
>> INSERT INTO cp_test VALUES (a, 'foo');
>> INSERT INTO cp_test VALUES (b, 'bar');
>> $$;
>> CALL ptest5(a => 1, b => 2);  -- ok
>> CALL ptest5(b => 3, a => 4);  -- ok
>> CALL ptest5(5);
>> ERROR:  no value found for parameter 2
>> CONTEXT:  SQL function "ptest5" statement 2
>> CALL ptest5(a => 6);
>> ERROR:  no value found for parameter 2
>> CONTEXT:  SQL function "ptest5" statement 2
>>
>> I'm not quite sure why this behaves differently from plpgsql.  Needs
>> more digging.
>>
>
> Are you working on this issue? Maybe tomorrow I'll have time to look
> at it.
>

The attached patch should fix it.
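
With the patch applied, the LANGUAGE SQL example from upthread is expected to
behave like the plpgsql case:

CALL ptest5(5);        -- ok: b falls back to its default, 0
CALL ptest5(a => 6);   -- ok as well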

Regards

Pavel

>
> Regards
>
> Pavel
>
>
>>
>> --
>> Peter Eisentraut  http://www.2ndQuadrant.com/
>> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>>
>
>
diff --git a/src/backend/commands/functioncmds.c b/src/backend/commands/functioncmds.c
index 86fa8c0dd7..75797338f4 100644
--- a/src/backend/commands/functioncmds.c
+++ b/src/backend/commands/functioncmds.c
@@ -55,6 +55,7 @@
 #include "executor/executor.h"
 #include "miscadmin.h"
 #include "optimizer/var.h"
+#include "optimizer/clauses.h"
 #include "parser/parse_coerce.h"
 #include "parser/parse_collate.h"
 #include "parser/parse_expr.h"
@@ -2254,6 +2255,15 @@ ExecuteCallStmt(CallStmt *stmt, ParamListInfo params, bool atomic, DestReceiver
 		elog(ERROR, "cache lookup failed for function %u", fexpr->funcid);
 	if (!heap_attisnull(tp, Anum_pg_proc_proconfig))
 		callcontext->atomic = true;
+
+	/*
+	 * The named and default arguments can be used, so try to reorder
+	 * arguments and append default arguments. The number of arguments
+	 * can be changed, refresh nargs.
+	 */
+	fexpr->args = expand_function_arguments(fexpr->args, fexpr->funcresulttype, tp);
+	nargs = list_length(fexpr->args);
+
 	ReleaseSysCache(tp);
 
 	/* Initialize function call structure */
diff --git a/src/backend/optimizer/util/clauses.c b/src/backend/optimizer/util/clauses.c
index 93e5658a35..a1c7c45fae 100644
--- a/src/backend/optimizer/util/clauses.c
+++ b/src/backend/optimizer/util/clauses.c
@@ -130,8 +130,6 @@ static Expr *simplify_function(Oid funcid,
   Oid result_collid, Oid input_collid, List **args_p,
   bool funcvariadic, bool process_args, bool allow_non_const,
   eval_const_expressions_context *context);
-static List *expand_function_arguments(List *args, Oid result_type,
-		  HeapTuple func_tuple);
 static List *reorder_function_arguments(List *args, HeapTuple func_tuple);
 static List *add_function_defaults(List *args, HeapTuple func_tuple);
 static List *fetch_function_defaults(HeapTuple func_tuple);
@@ -4112,7 +4110,7 @@ simplify_function(Oid funcid, Oid result_type, int32 result_typmod,
  * cases it handles should never occur there.  This should be OK since it
  * will fall through very quickly if there's nothing to do.
  */
-static List *
+List *
 expand_function_arguments(List *args, Oid result_type, HeapTuple func_tuple)
 {
 	Form_pg_proc funcform = (Form_pg_proc) GETSTRUCT(func_tuple);
diff --git a/src/include/optimizer/clauses.h b/src/include/optimizer/clauses.h
index ba4fa4b68b..ed854fdd40 100644
--- a/src/include/optimizer/clauses.h
+++ b/src/include/optimizer/clauses.h
@@ -14,9 +14,9 @@
 #ifndef CLAUSES_H
 #define CLAUSES_H
 
+#include "access/htup.h"
 #include "nodes/relation.h"
 
-
 #define is_opclause(clause)		((clause) != NULL && IsA(clause, OpExpr))
 #define is_funcclause(clause)	((clause) != NULL && IsA(clause, FuncExpr))
 
@@ -85,4 +85,7 @@ extern Node *estimate_expression_value(PlannerInfo *root, Node *node);
 extern Query *inline_set_returning_function(PlannerInfo *root,
 			  RangeTblEntry *rte);
 
+extern List *expand_function_arguments(List *args, Oid result_type,
+		  HeapTuple func_tuple);
+
 #endif			/* CLAUSES_H */
diff --git a/src/pl/plpgsql/src/expected/plpgsql_call.out b/src/pl/plpgsql/src/expected/plpgsql_call.out
index 1e94a44f2b..9d7d7da5b0 100644
--- a/src/pl/plpgsql/src/expected/plpgsql_call.out
+++ b/src/pl/plpgsql/src/expected/plpgsql_call.out
@@ -169,3 +169,80 @@ DROP PROCEDURE test_proc1;
 DROP PROCEDURE test_proc3;
 DROP PROCEDURE test_proc4;
 DROP TABLE test1;
+-- named parameters and defaults
+CREATE PROCEDURE test_proc(a int, b int, c int DEFAULT -1)
+AS $$
+BEGIN
+  RAISE NOTICE 'a: %, b: %, c: %', a, b, c;
+END;
+$$ LANGUAGE plpgsql;
+CALL test_proc(10,20,30);
+NOTICE:  a: 10, b: 20, c: 30
+CALL test_proc(10,20);
+NOTICE:  a: 10, b: 20, c: -1
+CALL test_proc(c=>1, a=>3, b=>2);
+NOTICE:  a: 3, b: 2, c: 1
+DROP PROCEDURE test_proc;
+CREATE PROCEDURE test_proc1(INOUT _a int, INOUT _b int)

Re: [PATCH] Verify Checksums during Basebackups

2018-03-22 Thread Magnus Hagander
On Sat, Mar 17, 2018 at 10:34 PM, Michael Banck 
wrote:

> Hi,
>
> On Fri, Mar 09, 2018 at 10:35:33PM +0100, Michael Banck wrote:
> > Possibly open questions:
> >
> > 1. I have not so far changed the replication protocol to make verifying
> > checksums optional. I can go about that next if the consensus is that we
> > need such an option (and cannot just check it everytime)?
>
> I think most people (including those I had off-list discussions about
> this with) were of the opinion that such an option should be there, so I
> added an additional option NOVERIFY_CHECKSUMS to the BASE_BACKUP
> replication command and also an option -k / --no-verify-checksums to
> pg_basebackup to trigger this.
>
> Updated patch attached.
>
>
Notes:

+   if (checksum_failure == true)

Really, just if(checksum_failure)

+   errmsg("checksum mismatch during basebackup")));

Should be "base backup" in messages.

+static const char *skip[] = {

I think that needs a much better name than just "skip". Skip for what? In
particular, since we are only skipping it for checksums, and not for the
actual base backup, that name is actively misleading.

+   filename = basename(pstrdup(readfilename));
+   if (!noverify_checksums && DataChecksumsEnabled() &&
+   !skipfile(filename) &&
+   (strncmp(readfilename, "./global/", 9) == 0 ||
+   strncmp(readfilename, "./base/", 7) == 0 ||
+   strncmp(readfilename, "/", 1) == 0))
+   verify_checksum = true;

I would include the checks for global, base etc into the skipfile()
function as well (also renamed).

+* Only check pages which have not been modified since the
+* start of the base backup.

I think this needs a description of why, as well (without having read this
thread, this is a pretty subtle case).


+system_or_bail 'dd', 'conv=notrunc', 'oflag=seek_bytes', 'seek=4000',
'bs=9', 'count=1', 'if=/dev/zero', "of=$pgdata/$pg_class";

This part of the test will surely fail on Windows, not having a /dev/zero.
Can we easily implement this part natively in perl perhaps? Somebody who
knows more about which functionality is OK to use within this system can
perhaps comment?

Most of that stuff is trivial and can be cleaned up at commit time. Do you
want to send an updated patch with a few of those fixes, or should I clean
it?

The test thing is a stopper until we figure that one out though. And while
at it -- it seems we don't have any tests for the checksum feature in
general. It would probably make sense to consider that at the same time as
figuring out the right way to do this one.

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: WIP: a way forward on bootstrap data

2018-03-22 Thread John Naylor
I wrote:

> On 3/21/18, Tom Lane  wrote:
>> I've got mixed feelings about the whitespace lines between fields.  They
>> seem like they are mostly bulking up the code and we could do without
>> 'em.
>> On the other hand, pgindent will insist on putting one before any
>> multi-line field comment, and so that would create inconsistent
>> formatting
>> if we don't use 'em elsewhere.  Thoughts?
>
> I'll do it both ways for one header and post the results for people to look
> at.

I've attached an earlier version of pg_proc.h with both formats as I
understand them. I turned a couple comments into multi-line comments
to demonstrate. I think without spaces it's just as hard to read as
with multiple annotations. I'd vote for spaces, but then again I'm not
the one who has to read these things very often.

-John Naylor
/*-------------------------------------------------------------------------
 *
 * pg_proc.h
 *	  definition of the system "procedure" relation (pg_proc)
 *	  along with the relation's initial contents.
 *
 * Portions Copyright (c) 1996-2018, PostgreSQL Global Development Group
 * Portions Copyright (c) 1994, Regents of the University of California
 *
 * src/include/catalog/pg_proc.h
 *
 * NOTES
 *	  The Catalog.pm module reads this file and derives schema
 *	  information.
 *
 *-------------------------------------------------------------------------
 */
#ifndef PG_PROC_H
#define PG_PROC_H

#include "catalog/genbki.h"

/* 
 *		pg_proc definition.  cpp turns this into
 *		typedef struct FormData_pg_proc
 * 
 */
#define ProcedureRelationId  1255
#define ProcedureRelation_Rowtype_Id  81

CATALOG(pg_proc,1255) BKI_BOOTSTRAP BKI_ROWTYPE_OID(81) BKI_SCHEMA_MACRO
{
	/* procedure name */
	NameData	proname;
	/* OID of namespace containing this proc */
	Oid			pronamespace BKI_DEFAULT(PGNSP);
	/* procedure owner */
	Oid			proowner BKI_DEFAULT(PGUID);
	/* OID of pg_language entry */
	Oid			prolang BKI_DEFAULT(12);
	/* estimated execution cost */
	float4		procost BKI_DEFAULT(1);

	/* estimated # of 
	 * rows out (if proretset) */
	float4		prorows BKI_DEFAULT(0);
	/* element type of variadic array, or 0 */
	Oid			provariadic BKI_DEFAULT(0);
	/* transforms calls to it during planning */
	regproc		protransform BKI_DEFAULT(0);
	/* is it an aggregate? */
	bool		proisagg BKI_DEFAULT(f);
	/* is it a window function? */
	bool		proiswindow BKI_DEFAULT(f);
	/* security definer */
	bool		prosecdef BKI_DEFAULT(f);
	/* is it a leak-proof function? */
	bool		proleakproof BKI_DEFAULT(f);
	/* strict with respect to NULLs? */
	bool		proisstrict BKI_DEFAULT(f);
	/* returns a set? */
	bool		proretset BKI_DEFAULT(f);
	/* see PROVOLATILE_ categories below */
	char		provolatile BKI_DEFAULT(v);
	/* see PROPARALLEL_ categories below */
	char		proparallel BKI_DEFAULT(u);
	/* number of arguments */
	int16		pronargs;
	/* number of arguments with defaults */
	int16		pronargdefaults BKI_DEFAULT(0);
	/* OID of result type */
	Oid			prorettype;

	/*
	 * variable-length fields start here, but we allow direct access to
	 * proargtypes
	 */

	/* parameter types (excludes OUT params) */
	oidvector	proargtypes;

#ifdef CATALOG_VARLEN

	/* all param types (NULL if IN only) */
	Oid			proallargtypes[1] BKI_DEFAULT(_null_);
	/* parameter modes (NULL if IN only) */
	char		proargmodes[1] BKI_DEFAULT(_null_);
	/* parameter names (NULL if no names) */
	text		proargnames[1] BKI_DEFAULT(_null_);

	/* list of expression trees 
	 * for argument defaults (NULL if none) */
	pg_node_tree proargdefaults BKI_DEFAULT(_null_);
	/* types for which to apply transforms */
	Oid			protrftypes[1] BKI_DEFAULT(_null_);
	/* procedure source text */
	text		prosrc BKI_FORCE_NOT_NULL;
	/* secondary procedure info (can be NULL) */
	text		probin BKI_DEFAULT(_null_);
	/* procedure-local GUC settings */
	text		proconfig[1] BKI_DEFAULT(_null_);
	/* access permissions */
	aclitem		proacl[1] BKI_DEFAULT(_null_);
#endif
} FormData_pg_proc;

/* 
 *		Form_pg_proc corresponds to a pointer to a tuple with
 *		the format of pg_proc relation.
 * 
 */
typedef FormData_pg_proc *Form_pg_proc;

/* 
 *		compiler constants for pg_proc
 * 
 */
#define Natts_pg_proc	29
#define Anum_pg_proc_proname			1
#define Anum_pg_proc_pronamespace		2
#define Anum_pg_proc_proowner			3
#define Anum_pg_proc_prolang			4
#define Anum_pg_proc_procost			5
#define Anum_pg_proc_prorows			6
#define Anum_pg_proc_provariadic		7
#define Anum_pg_proc_protransform		8
#define Anum_pg_proc_proisagg			9
#define Anum_pg_proc_proiswindow		10
#define Anum_pg_proc_prosecdef			11
#define Anum_pg_proc_proleakproof		12
#define Anum_pg_proc_proisstrict		13
#define Anum_pg_proc_proretset			14
#define Anum_pg_proc_provolatile		15
#define Anum_pg_proc_proparallel		16
#define Anum_pg_proc_pronargs			17
#define Anum_pg_proc_pronargdefaults	18
#define Anum_pg_proc_prorettype			19
#define Anum_pg_proc_proargtypes		20
#define 

Re: WIP: a way forward on bootstrap data

2018-03-22 Thread Tom Lane
John Naylor  writes:
> On 3/21/18, Tom Lane  wrote:
>>  regproc aggmtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);
>>  regproc aggminvtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);
>>  regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);

>> I'm not sure if there's anything much to be done about this.  BKI_DEFAULT
>> isn't so bad, but additional annotations get unreadable fast.  Maybe
>> BKI_LOOKUP was a bad idea after all, and we should just invent more
>> Oid-equivalent typedef names.

> Well, until version 7, I used "fake" type aliases that only genbki.pl
> could see. The C compiler and postgres.bki could only see the actual
> Oid/oid type. Perhaps it was a mistake to model their appearance after
> regproc (regtype, etc), because that was misleading. Maybe something
more obviously transient like 'lookup_typeoid'? I'm leaning towards
> this idea.

Looking at this again, I think a big chunk of the readability problem here
is just from the fact that we have long, similar-looking lines tightly
packed.  If it were reformatted to have comment lines and whitespace
between, it might not look nearly as bad.

> Another possibility is to teach the pgindent pre_/post_indent
> functions to preserve annotation formatting, but I'd rather not add
> yet another regex to that script. Plus, over the next 10+ years, I
> could see people adding several more BKI_* macros, leading to
> readability issues regardless of formatting, so maybe we should nip
> this one in the bud.

Well, whether or not we invent BKI_LOOKUP, the need for other kinds
of annotations isn't likely to be lessened.

I wondered whether we could somehow convert the format into multiple
lines, say

regproc aggmfinalfn
BKI_DEFAULT(-)
BKI_LOOKUP(pg_proc);

but some quick experimentation was discouraging: either Emacs' C
syntax mode, or pgindent, or both would make a hash of it.  It
wasn't great from the vertical-space-consumption standpoint either.

Anyway, for the moment I'd stick with BKI_LOOKUP rather than undoing
that work.  I think it's a more transparent way of saying what we
want than the magic-OID-typedefs approach.  The formatting issue is
just a mild annoyance, and it's not really BKI_LOOKUP's fault anyway.

While I'm thinking of it --- I noticed one or two places where you
had "BKI_DEFAULT(\0)".  That coding scares me a bit --- gcc seems to
tolerate it, but other C compilers might feel that \0 is not a valid
preprocessing token, or it might confuse some editors' syntax highlight
rules.  I'd rather write cases like this as "BKI_DEFAULT('\0')".  IOW,
the argument should be a valid C identifier, number, or string literal.

regards, tom lane



Re: new function for tsquery creartion

2018-03-22 Thread Dmitry Ivanov

Hi David,

I'd like to take over from Victor. I'll post a revised version of the 
patch in a couple of days.


--
Dmitry Ivanov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company



Re: faster testing with symlink installs

2018-03-22 Thread Tom Lane
Michael Paquier  writes:
> The last complain on the matter I can find actually involves the same
> people as this thread :)
> https://www.postgresql.org/message-id/54DE457F.2090206%40gmx.net

> So the patch should be marked as rejected or at least returned with
> feedback?

IMO, changing this behavior is more likely to draw complaints
than praise.

regards, tom lane



FYI: jOOQ blog

2018-03-22 Thread Bear Giles
​If you want to know how PostgreSQL compares to other databases, or are
looking for ideas on areas to improve, the jOOQ blog looks like a good
resource: https://blog.jooq.org/. jOOQ is a Java library that provides a
database-agnostic way to access many different types of databases. I know
it does some query optimizations but I don't know how advanced it is. The
other reason for them to track advanced functionality available in some
databases is to ensure that they support users who want to use those
features.

Most of the blog is specific to Java or to the library itself, but there are
also frequent articles on advanced functionality (row values, window functions,
etc.), query optimization, and comparisons between the databases. The
comparisons of query optimizations in particular might be good low-hanging
fruit, since they don't involve any user-facing changes, just recognizing a
few additional patterns in the query parse tree and knowing a more
efficient way to perform equivalent work.

Bear


Re: constraint exclusion and nulls in IN (..) clause

2018-03-22 Thread Tom Lane
Amit Langote  writes:
> I too wasn't sure if the patch's modifications to
> operator_predicate_proof() led to correct handling for the case where both
> clause_const and pred_const are both NULL consts.  ISTM that the result in
> that case becomes what operator_same_subexprs_proof() would return for the
> pred_op (possibly commuted) and clause_op pair.  But maybe we don't end up
> in that case much.

Probably not.  It's hard to get to this type of case at all, because in
most cases, if we have a null constant as a subexpression, previous
const-simplification would have replaced the whole expression with null.
I think in practice it's only interesting when the upper levels of
predtest.c have constructed an operator expression out of parts of the
given clauses, which is exactly what happens for cases like IN-lists.
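
For readers following along, a rough sketch of the kind of case in question
(the tables are made up for illustration; this is my reading of the thread,
not text from the patch):

CREATE TABLE parent (i int);
CREATE TABLE child1 (CHECK (i = 1)) INHERITS (parent);
CREATE TABLE child2 (CHECK (i = 2)) INHERITS (parent);

-- predtest.c effectively breaks the IN list into operator clauses,
--   i = 1 OR i = NULL
-- and asks whether each child's CHECK constraint refutes them.  Whether
-- child2 can still be excluded despite the NULL element is exactly where
-- the strictness of the operators matters, as discussed above.
EXPLAIN SELECT * FROM parent WHERE i IN (1, NULL);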

regards, tom lane



Re: Boolean partitions syntax

2018-03-22 Thread David Steele
On 3/21/18 10:59 PM, Amit Langote wrote:
> On 2018/03/21 23:31, David Steele wrote:
>> Hi Amit,
>>
>> On 3/6/18 9:44 AM, David Steele wrote:
>>> On 3/2/18 2:27 AM, Amit Langote wrote:
 On 2018/03/02 15:58, Andres Freund wrote:
> On 2018-02-02 17:00:24 -0500, Tom Lane wrote:
>> Peter Eisentraut  writes:
>>> There might be other options, but one way to solve this would be to
>>> treat partition bounds as a general expression in the grammar and then
>>> check in post-parse analysis that it's a constant.
>>
>> That's pretty much what I said upthread.  What I basically don't like
>> about the current setup is that it's assuming that the bound item is
>> a bare literal.  Even disregarding future-extension issues, that's bad
>> because it can't result in an error message smarter than "syntax error"
>> when someone tries the rather natural thing of writing a more complicated
>> expression.
>
> Given the current state of this patch, with a number of senior
> developers disagreeing with the design, and the last CF being in
> progress, I think we should mark this as returned with feedback.

 I see no problem with pursuing this in the next CF if the consensus is
 that we should fix how partition bounds are parsed, instead of adopting
 one of the patches to allow the Boolean literals to be accepted as
 partition bounds.
>>>
>>> I'm inclined to mark this patch Returned with Feedback unless I hear
>>> opinions to the contrary.
>>
>> Hearing no opinions to the contrary I have marked this entry Returned
>> with Feedback.  Please resubmit when you have an updated patch.
> 
> OK.
> 
> Btw, there is an 11dev open item recently added to the wiki that's related
> to this, but I think we might be able to deal with it independently of
> this proposal.
> 
> * Partitions with bool partition keys *
> https://wiki.postgresql.org/wiki/PostgreSQL_11_Open_Items#Open_Issues

If you want to bring this patch up to date and recast it as a fix for the
open issue, I'll be happy to add it to the CF as a bug fix.

However, it seems to me the best plan might be to start with David's
patch [1] and make it play nice with old pg_dumps.
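
For anyone who has not followed the whole thread, the user-visible case is
roughly the following (a hedged sketch; exact behaviour depends on which fix
is adopted):

CREATE TABLE parted (b bool) PARTITION BY LIST (b);

-- What the proposals want to accept, the bare Boolean literal:
CREATE TABLE parted_true PARTITION OF parted FOR VALUES IN (TRUE);

-- What the current grammar accepts instead, as far as I understand it,
-- is the quoted-literal spelling:
CREATE TABLE parted_false PARTITION OF parted FOR VALUES IN ('false');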

Thanks,
-- 
-David
da...@pgmasters.net

[1]
https://www.postgresql.org/message-id/flat/CAKJS1f-BL%2Br5FXSejDu%3D%2BMAvzRARaawRnQ_ZFtbv_o6tha9NJw%40mail.gmail.com#CAKJS1f-BL+r5FXSejDu=+mavzraraawrnq_zftbv_o6tha9...@mail.gmail.com



Re: [HACKERS] advanced partition matching algorithm for partition-wise join

2018-03-22 Thread Ashutosh Bapat
On Thu, Mar 22, 2018 at 4:32 AM, Dmitry Dolgov <9erthali...@gmail.com> wrote:
>> On 12 March 2018 at 06:00, Ashutosh Bapat  
>> wrote:
>> Thanks for the note. Here are rebased patches.
>
> Since I started to look at this patch, I can share few random notes (although
> it's not a complete review, I'm in the progress now), maybe it can be helpful.
>
> In `partition_range_bounds_merge`
>
> + if (!merged)
> + break;
>
> is a bit redundant I think, because every time `merged` set to false it
> followed by break.

Yes, right now. Maybe I should turn it into Assert(merged); what do you think?

>
> At the end same function
>
> + if (merged)
> + merged = merge_default_partitions(outer_bi, outer_pmap, outer_mmap,
> + inner_bi, inner_pmap, inner_mmap,
> + jointype, _index,
> + _index);
>
> Looks like I misunderstand something in the algorithm, but aren't default
> partitions were already processed before e.g. here:
>
> + /*
> +  * Default partition on the outer side forms the default
> +  * partition of the join result.
> +  */

The default partition handling in the loop handles the cases of
missing partitions as explained in a comment
/*
 * If a range appears in one of the joining relations but not the other
 * (as a whole or part), the rows in the corresponding partition will
 * not have join partners in the other relation, unless the other
 * relation has a default partition.

But merge_default_partitions() tries to map the default partitions
from both the relations.
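
To make the scenario concrete, a rough SQL sketch of what that comment is
about (table names are made up; this is only my illustration of the quoted
explanation):

CREATE TABLE r1 (a int) PARTITION BY RANGE (a);
CREATE TABLE r1_p1 PARTITION OF r1 FOR VALUES FROM (0) TO (100);
CREATE TABLE r1_p2 PARTITION OF r1 FOR VALUES FROM (100) TO (200);

CREATE TABLE r2 (a int) PARTITION BY RANGE (a);
CREATE TABLE r2_p1 PARTITION OF r2 FOR VALUES FROM (0) TO (100);
CREATE TABLE r2_def PARTITION OF r2 DEFAULT;

-- The range [100, 200) exists only on the r1 side, so rows in r1_p2 can
-- only find join partners in r2's default partition.  A join such as
--   SELECT * FROM r1 JOIN r2 USING (a)
-- can then still be done partition-wise by pairing r1_p1 with r2_p1 and
-- r1_p2 with r2_def, which is the case handled in the loop above.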

>
> Also in here
>
> + /* Free any memory we used in this function. */
> + list_free(merged_datums);
> + list_free(merged_indexes);
> + pfree(outer_pmap);
> + pfree(inner_pmap);
> + pfree(outer_mmap);
> + pfree(inner_mmap);
>
> I think `merged_kinds` is missing.

Done.

>
> I've noticed that in some places `IS_PARTITIONED_REL` was replaced
>
> - if (!IS_PARTITIONED_REL(joinrel))
> + if (joinrel->part_scheme == NULL)
>
> but I'm not quite follow why? Is it because `boundinfo` is not available
> anymore at this point? If so, maybe it makes sense to update the commentary 
> for
> this macro and mention to not use for joinrel.

This is done in try_partitionwise_join(). As explained in the comment
/*
 * Get the list of matching partitions to be joined along with the
 * partition bounds of the join relation. Because of the restrictions
 * imposed by partition matching algorithm, not every pair of joining
 * relations for this join will be able to use partition-wise join. But all
 * those pairs which can use partition-wise join will produce the same
 * partition bounds for the join relation.
 */
boundinfo for the join relation is built in this function. So, we
don't have join relation's partitioning information fully set up yet.
So we can't use IS_PARTITIONED_REL() there. joinrel->part_scheme, if
set, tells us that the joining relations have matching partition schemes
and thus that the join relation can possibly use the partition-wise join
technique. If it's not set, then we can't use partition-wise join.

But IS_PARTITIONED_REL() is still useful at a number of other places,
where it's known to encounter a RelOptInfo whose partitioning
properties are fully setup. So, I don't think we should change the
macro or the comments above it.

>
> Also, as for me it would be helpful to have some commentary about this new
> `partsupfunc`, what is exactly the purpose of it (but it's maybe just me
> missing some obvious things from partitioning context)
>
> + FmgrInfo   *partsupfunc;

It's just copied from Relation::PartitionKey as is. It stores the
functions required to compare two partition key datums. Since we use
those functions frequently their FmgrInfo is cached.

-- 
Best Wishes,
Ashutosh Bapat
EnterpriseDB Corporation
The Postgres Database Company



RE: pg_stat_statements HLD for futur developments

2018-03-22 Thread Fabien COELHO


Hello,

My question is more general, as there are so many developments around
pg_stat_statements (see the list at the end of the initial post):


What is the roadmap for its design?


None? As for any feature, some people may have some plans, that they may 
end up developing or not. If developed, it may end up committed or not. 
Kind of a Darwinian process which involves a degree of randomness.



Could a PLANID column be added to help new developments working on plans and
parameters?


Dunno. I understand that the underlying suggestion is that selected plans 
may be kept as queries are kept, and that you are referring to 
"https://commitfest.postgresql.org/17/1470/".


Maybe ask your question on the corresponding thread, and contribute to 
reviewing the patch?


As the same plan may be used for two distinct queries and one query may 
yield distinct plans, I'd guess that there is a potential n-n 
relationship, but maybe it is simpler to keep a list of plans attached to 
their queries somehow.


Could all the new columns be added to the existing view, or should they be 
added to others like pg_stat_statements_plans or 
pg_stat_statements_parameters, reusing pgss's pk (userid, dbid, queryid, 
planid)?


Out of the box I'd be fine with a pg_stat_plans, and some mapping between 
plans and statements.


--
Fabien.



Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-22 Thread Alexander Korotkov
On Mon, Mar 19, 2018 at 8:45 AM, Kyotaro HORIGUCHI <
horiguchi.kyot...@lab.ntt.co.jp> wrote:

> At Mon, 19 Mar 2018 11:12:58 +0900, Masahiko Sawada 
> wrote in 

Re: [HACKERS] GUC for cleanup indexes threshold.

2018-03-22 Thread Alexander Korotkov
On Mon, Mar 19, 2018 at 5:12 AM, Masahiko Sawada 
wrote:

> On Wed, Mar 14, 2018 at 9:25 PM, Alexander Korotkov
>  wrote:
> > On Wed, Mar 14, 2018 at 7:40 AM, Masahiko Sawada 
> > wrote:
> >>
> >> On Sat, Mar 10, 2018 at 3:40 AM, Alexander Korotkov
> >>  wrote:
> >> > On Fri, Mar 9, 2018 at 3:12 PM, Masahiko Sawada <
> sawada.m...@gmail.com>
> >> > wrote:
> >> >>
> >> >> On Fri, Mar 9, 2018 at 8:43 AM, Alexander Korotkov
> >> >>  wrote:
> >> >> > 2) These parameters are reset during btbulkdelete() and set during
> >> >> > btvacuumcleanup().
> >> >>
> >> >> Can't we set these parameters even during btbulkdelete()? By keeping
> >> >> them up to date, we will able to avoid an unnecessary cleanup vacuums
> >> >> even after index bulk-delete.
> >> >
> >> >
> >> > We certainly can update cleanup-related parameters during
> >> > btbulkdelete().
> >> > However, in this case we would update B-tree meta-page during each
> >> > VACUUM cycle.  That may cause some overhead for non append-only
> >> > workloads.  I don't think this overhead would be sensible, because in
> >> > non append-only scenarios VACUUM typically writes much more of
> >> > information.
> >> > But I would like this oriented to append-only workload patch to be
> >> > as harmless as possible for other workloads.
> >>
> >> What overhead are you referring here? I guess the overhead is only the
> >> calculating the oldest btpo.xact. And I think it would be harmless.
> >
> >
> > I meant overhead of setting last_cleanup_num_heap_tuples after every
> > btbulkdelete with wal-logging of meta-page.  I bet it also would be
> > harmless, but I think that needs some testing.
>
> Agreed.
>
> After more thought, it might be too late but we can consider the
> possibility of another idea proposed by Peter. Attached patch
> addresses the original issue of index cleanups by storing the epoch
> number of page deletion XID into PageHeader->pd_prune_xid which is
> 4byte field. Comparing to the current proposed patch this patch
> doesn't need neither the page upgrade code nor extra WAL-logging. If
> we also want to address cases other than append-only case we will
> require the bulk-delete method of scanning whole index and of logging
> WAL. But it leads some extra overhead. With this patch we no longer
> need to depend on the full scan on b-tree index. This might be useful
> for a future when we make the bulk-delete of b-tree index not scan
> whole index.


Storing epoch in deleted pages is good by itself, because it makes us
free to select the moment of time to recycle those pages without risk
of wraparound.

However, I see that you are comparing relative change of num_heap_tuples
before and after vacuum to vacuum_cleanup_index_scale_factor.
The problem is that if vacuum is very aggressive (and that is likely for
append only case if this patch is committed), then num_heap_tuples
changes very slightly every vacuum cycle.  So, cleanup would never
happen and statistics could stall indefinitely.
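
A small invented illustration of that stall, assuming the GUC name used
above (the numbers are made up; only the shape of the problem matters):

SET vacuum_cleanup_index_scale_factor = 0.1;

-- If an append-only table is vacuumed frequently, each cycle may only see
-- a small relative growth, e.g.
--   num_heap_tuples: 1,000,000 -> 1,005,000   (+0.5%)
-- Compared cycle-by-cycle, +0.5% never exceeds the 10% threshold, so the
-- cleanup-only index scan is skipped every time and the index statistics
-- can go stale indefinitely.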

Another issue is that append only workloads are frequently combined
with rare periodical bulk deletes of data.  Assuming that in this patch
you don't consider deleted pages to trigger index cleanup, on such
workload large amounts of deleted pages may reside non-recycled until
next bulk delete cycle.

So, although I generally like the idea of storing the epoch of the deletion
xid in the page, I think my version of the patch is closer to committable shape.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company


Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Pavan Deolasee
On Thu, Mar 22, 2018 at 1:15 AM, Peter Geoghegan  wrote:

>
> >
> >
> > No, it did not. We only support VALUES clause with INSERT action.
>
> But can't you have a subselect in the VALUES()? Support for subselects
> seems like a totally distinct thing to the restriction that only a
> (single row) VALUES() is allowed in INSERT actions.
>
>
Ah, right. That works even today.

postgres=# CREATE TABLE target (a int, b text);
CREATE TABLE

postgres=# MERGE INTO target USING (SELECT 1) s ON false WHEN NOT MATCHED
THEN INSERT VALUES ((SELECT count(*) FROM pg_class), (SELECT relname FROM
pg_class LIMIT 1));
MERGE 1

postgres=# SELECT * FROM target;
  a  |   b
-+
 755 | pgbench_source
(1 row)


Thanks,
Pavan


-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


RE: pg_stat_statements HLD for futur developments

2018-03-22 Thread legrand legrand
Thank you Fabien.


My question is more general, as there are so many developments around
pg_stat_statements (see the list at the end of the initial post):


What is the roadmap for its design?


Could a PLANID column be added to help new developments working on plans and
parameters?

Could all the new columns be added to the existing view, or should they be added
to others like pg_stat_statements_plans or pg_stat_statements_parameters,
reusing pgss's pk (userid, dbid, queryid, planid)?


Regards

PAscal






From: Fabien COELHO 
Sent: Thursday, 22 March 2018 09:32:13
To: legrand legrand
Cc: pgsql-hack...@postgresql.org
Subject: Re: pg_stat_statements HLD for futur developments


Hello,

> As a new user of PostgreSQL, I have started using pg_stat_statements, and
> was pleased but a little surprised:
>
> First of all, the normalized form of the query string makes it impossible to
> be used in EXPLAIN commands.

Yes, because of the normalization.

> Second, normalized constants and parameters values where missing to be able
> to test optimizations results manually with EXPLAIN.

The normalization is entirely voluntary. Otherwise, probably every query
would generate a distinct entry, which for a high load system would be a
very bad idea, and there would be no point to the extension.

Note that with recent pg you can reuse the query with a PREPARE, and
then EXPLAIN on the EXECUTE.

   PREPARE foo(INT) AS ;
   EXPLAIN EXECUTE foo(5432);
   EXPLAIN EXECUTE foo(2345);
   DEALLOCATE foo;

> Third, execution plan was not available (making usage of AUTO_EXPLAIN

Yep, but ISTM that the plan might differ depending on the actual values,
so it would not make much sense to keep it anyway.

--
Fabien.


Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-22 Thread Pavan Deolasee
On Thu, Mar 22, 2018 at 3:01 PM, Amit Langote  wrote:

>
> >
> > Why do we need to pin the descriptor? If we do need, why don't we need
> > corresponding ReleaseTupDesc() call?
>
> PinTupleDesc was added in the patch as Alvaro had submitted it upthread,
> which it wasn't clear to me either why it was needed.  Looking at it
> closely, it seems we can get rid of it if for the lack of corresponding
> ReleaseTupleDesc().  ExecSetSlotDescriptor() seems to take care of pinning
> and releasing tuple descriptors that are passed to it.  If some
> partition's tupDesc remains pinned because it was the last one that was
> passed to it, the final ExecResetTupleTable will take care of releasing it.
>
> I have removed the instances of PinTupleDesc in the updated patch, but I'm
> not sure why the loose PinTupleDesc() in the previous version of the patch
> didn't cause reference leak warnings or some such.
>

Yeah, it wasn't clear to me either. But I did not investigate. Maybe
Alvaro knows better.


> > I am still confused if the partition_conflproj_tdescs really belongs to
> > PartitionTupleRouting or should it be a part of the ResultRelInfo. FWIW
> for
> > the
> > MERGE patch, I moved everything to a new struct and made it part of the
> > ResultRelInfo. If no re-mapping is necessary, we can just point to the
> > struct
> > in the root relation's ResultRelInfo. Otherwise create and populate a new
> > one
> > in the partition relation's ResultRelInfo.
> >
> > + leaf_part_rri->ri_onConflictSetWhere =
> > + ExecInitQual(onconflwhere, >ps);
> > + }
> >
> > So ri_onConflictSetWhere and ri_onConflictSetProj are part of the
> > ResultRelInfo. Why not move mt_conflproj_tupdesc,
> > partition_conflproj_tdescs,
> > partition_arbiter_indexes etc to the ResultRelInfo as well?
>
> I have moved both the projection tupdesc and the arbiter indexes list into
> ResultRelInfo as I wrote above.
>
>
Thanks. It's looking much better now. I think we can possibly move all ON
CONFLICT related members to a separate structure and just copy the pointer
to the structure if (map == NULL). That might make the code a bit more tidy.
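
As a reminder of the user-visible behaviour all of this per-partition state
is for, a rough sketch (not taken from the patch's regression tests):

CREATE TABLE t (a int PRIMARY KEY, b text) PARTITION BY LIST (a);
CREATE TABLE t_1 PARTITION OF t FOR VALUES IN (1, 2);
CREATE TABLE t_2 PARTITION OF t FOR VALUES IN (3, 4);

INSERT INTO t VALUES (1, 'first');

-- The conflicting row lives in partition t_1, so the arbiter index and
-- the ON CONFLICT SET projection must be resolved against that
-- partition's descriptor after tuple routing.
INSERT INTO t VALUES (1, 'second')
  ON CONFLICT (a) DO UPDATE SET b = EXCLUDED.b;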

Is there anything that needs to be done for transition tables? I checked
and didn't see anything, but please check.

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: [PROPOSAL] Shared Ispell dictionaries

2018-03-22 Thread Arthur Zakirov
On Wed, Mar 21, 2018 at 12:00:52PM +0300, Arthur Zakirov wrote:
> On Tue, Mar 20, 2018 at 09:30:15PM +0100, Tomas Vondra wrote:
> > I wonder if these restrictions needed? I mean, why not to allow setting
> > max_shared_dictionaries_size below the size of loaded dictionaries?
> > 
> > Of course, on the one hand those restriction seem sensible. On the other
> > hand, perhaps in some cases it would be useful to allow violating them?
> > 
> > I mean, why not to simply disable loading of new dictionaries when
> > 
> > (max_shared_dictionaries_size < loaded_size)
> > 
> > Maybe I'm over-thinking this though. It's probably safer and less
> > surprising to enforce the restrictions.
> 
> Hm, yes in some cases this check may be over-engineering. I thought that
> it is reasonable and safer in v7 patch. But there are similar GUCs,
> wal_keep_segments and max_wal_size, which don't do additional checks.
> And people are fine with them. So I removed that check from the variable.
> 
> Please find the attached new version of the patch.

I forgot to fix regression tests for max_shared_dictionaries_size. Also
I'm not confident about using pg_reload_conf() in regression tests.
I haven't found where pg_reload_conf() is used in tests. So I removed
max_shared_dictionaries_size tests for now.

Sorry for the noise.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company
diff --git a/src/backend/tsearch/spell.c b/src/backend/tsearch/spell.c
index b9fdd77e19..e071994523 100644
--- a/src/backend/tsearch/spell.c
+++ b/src/backend/tsearch/spell.c
@@ -78,6 +78,8 @@
 #define tmpalloc(sz)  MemoryContextAlloc(Conf->buildCxt, (sz))
 #define tmpalloc0(sz)  MemoryContextAllocZero(Conf->buildCxt, (sz))
 
+#define tmpstrdup(str) MemoryContextStrdup(Conf->buildCxt, (str))
+
 /*
  * Prepare for constructing an ISpell dictionary.
  *
@@ -498,7 +500,7 @@ NIAddSpell(IspellDict *Conf, const char *word, const char *flag)
Conf->Spell[Conf->nspell] = (SPELL *) tmpalloc(SPELLHDRSZ + strlen(word) + 1);
strcpy(Conf->Spell[Conf->nspell]->word, word);
Conf->Spell[Conf->nspell]->p.flag = (*flag != '\0')
-   ? cpstrdup(Conf, flag) : VoidString;
+   ? tmpstrdup(flag) : VoidString;
Conf->nspell++;
 }
 
@@ -1040,7 +1042,7 @@ setCompoundAffixFlagValue(IspellDict *Conf, CompoundAffixFlag *entry,
entry->flag.i = i;
}
else
-   entry->flag.s = cpstrdup(Conf, s);
+   entry->flag.s = tmpstrdup(s);
 
entry->flagMode = Conf->flagMode;
entry->value = val;
@@ -1536,6 +1538,9 @@ nextline:
return;
 
 isnewformat:
+   pfree(recoded);
+   pfree(pstr);
+
if (oldformat)
ereport(ERROR,
(errcode(ERRCODE_CONFIG_FILE_ERROR),
diff --git a/contrib/dict_int/dict_int.c b/contrib/dict_int/dict_int.c
index 56ede37089..e11d1129e9 100644
--- a/contrib/dict_int/dict_int.c
+++ b/contrib/dict_int/dict_int.c
@@ -30,7 +30,7 @@ PG_FUNCTION_INFO_V1(dintdict_lexize);
 Datum
 dintdict_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictInt*d;
ListCell   *l;
 
@@ -38,7 +38,7 @@ dintdict_init(PG_FUNCTION_ARGS)
d->maxlen = 6;
d->rejectlong = false;
 
-   foreach(l, dictoptions)
+   foreach(l, init_data->dictoptions)
{
DefElem*defel = (DefElem *) lfirst(l);
 
diff --git a/contrib/dict_xsyn/dict_xsyn.c b/contrib/dict_xsyn/dict_xsyn.c
index a79ece240c..c3146bae3c 100644
--- a/contrib/dict_xsyn/dict_xsyn.c
+++ b/contrib/dict_xsyn/dict_xsyn.c
@@ -140,7 +140,7 @@ read_dictionary(DictSyn *d, const char *filename)
 Datum
 dxsyn_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
DictSyn*d;
ListCell   *l;
char   *filename = NULL;
@@ -153,7 +153,7 @@ dxsyn_init(PG_FUNCTION_ARGS)
d->matchsynonyms = false;
d->keepsynonyms = true;
 
-   foreach(l, dictoptions)
+   foreach(l, init_data->dictoptions)
{
DefElem*defel = (DefElem *) lfirst(l);
 
diff --git a/contrib/unaccent/unaccent.c b/contrib/unaccent/unaccent.c
index 247c202755..2e66331ed8 100644
--- a/contrib/unaccent/unaccent.c
+++ b/contrib/unaccent/unaccent.c
@@ -267,12 +267,12 @@ PG_FUNCTION_INFO_V1(unaccent_init);
 Datum
 unaccent_init(PG_FUNCTION_ARGS)
 {
-   List   *dictoptions = (List *) PG_GETARG_POINTER(0);
+   DictInitData *init_data = (DictInitData *) PG_GETARG_POINTER(0);
TrieChar   *rootTrie = NULL;
boolfileloaded = false;
ListCell   *l;
 
-   foreach(l, dictoptions)
+   foreach(l, init_data->dictoptions)
{
DefElem*defel = (DefElem *) lfirst(l);
 

Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-22 Thread Jeevan Chalke
On Thu, Mar 22, 2018 at 10:06 AM, Ashutosh Bapat <
ashutosh.ba...@enterprisedb.com> wrote:

> On Thu, Mar 22, 2018 at 3:26 AM, Robert Haas 
> wrote:
> >
> > Is there a good reason not to use input_rel->relids as the input to
> > fetch_upper_rel() in all cases, rather than just at subordinate
> > levels?
> >
>
> That would simplify some code in these patches. We have set
> upper_rel->relids to NULL for non-other upper relation since Tom
> expected to use relids to mean something other than scan/join relids.
> With these patch-sets for grouping rels we are using upper_rel->relids
> to the relids of underlying scan/join relation. So it does make sense
> to set relids to input_rel->relids for all the grouping rels whether
> "other" or non-"other" grouping rels.
>
> But with this change, we have to change all the existing code to pass
> input_rel->relids to fetch_upper_rel(). If we don't do that or in
> future somebody calls that function with relids = NULL we will produce
> two relations which are supposed to do the same thing but have
> different relids set. That's because fetch_upper_rel() creates a
> relation if one does not exist whether or not the caller intends to
> create one. We should probably create two functions 1. to build an
> upper relation and 2. to search for it similar to what we have done
> for join relations and base relation. The other possibility is to pass
> a flag to fetch_upper_rel() indicating whether a caller intends to
> create a new relation when one doesn't exist. With this design a
> caller can be sure that an upper relation will not be created when it
> wants to just fetch an existing relation (and error out/assert if it
> doesn't find one.).
>

Like Ashutosh said, splitting fetch_upper_rel() into two functions,
build_upper_rel() and find_upper_rel(), looks better.

However, I am not sure whether setting relids in a top-most grouped rel is
a good idea or not. I remember we needed this while working on Aggregate
PushDown, and in [1] Tom Lane opposed the idea of setting the relids in
grouped_rel.

If we want to go with this, then I think it should be done as a separate
stand-alone patch.

[1]
https://www.postgresql.org/message-id/CAFjFpRdUz6h6cmFZFYAngmQAX8Zvo+MZsPXidZ077h=gp9b...@mail.gmail.com


>
> --
> Best Wishes,
> Ashutosh Bapat
> EnterpriseDB Corporation
> The Postgres Database Company
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


Re: constraint exclusion and nulls in IN (..) clause

2018-03-22 Thread Amit Langote
On 2018/03/21 23:00, Tom Lane wrote:
> Emre Hasegeli  writes:
>> I am not sure if we are covering the case when clause_const and
>> pred_const are both NULL.  In this case, we should be able to return
>> true only by checking op_strict(pred_op) or maybe even without
>> checking that.  Am I mistaken?
> 
> Yeah, that's there.  We need both operators to be strict, I think;
> otherwise we can't really assume anything about what they'd return
> for NULL inputs.  But if they are, we have NULL => NULL which is
> valid for all proof cases.

Thank you for closing the CF entry.

I too wasn't sure if the patch's modifications to
operator_predicate_proof() led to correct handling for the case where both
clause_const and pred_const are both NULL consts.  ISTM that the result in
that case becomes what operator_same_subexprs_proof() would return for the
pred_op (possibly commuted) and clause_op pair.  But maybe we don't end up
in that case much.

Regards,
Amit




Re: [HACKERS] Partition-wise aggregation/grouping

2018-03-22 Thread Jeevan Chalke
On Thu, Mar 22, 2018 at 3:26 AM, Robert Haas  wrote:

> On Wed, Mar 21, 2018 at 11:33 AM, Jeevan Chalke
>  wrote:
> > Let me try to explain this:
> > 1. GROUPING_CAN_PARTITIONWISE_AGG
> > 2. extra->is_partial_aggregation
> > 3. extra->perform_partial_partitionwise_aggregation
>
> Please find attached an incremental patch that attempts to refactor
> this logic into a simpler form.  What I've done is merged all three of
> the above Booleans into a single state variable called 'patype', which
> can be one of PARTITIONWISE_AGGREGATE_NONE,
> PARTITIONWISE_AGGREGATE_FULL, and PARTITIONWISE_AGGREGATE_PARTIAL.
> When create_ordinary_grouping_paths() is called, extra.patype is the
> value for the parent relation; that function computes a new value and
> passes it down to create_partitionwise_grouping_paths(), which inserts
> into the new 'extra' structure for the child.
>
> Essentially, in your system, extra->is_partial_aggregation and
> extra->perform_partial_partitionwise_aggregation both corresponded to
> whether or not patype == PARTITIONWISE_AGGREGATE_PARTIAL, but the
> former indicated whether the *parent* was doing partition-wise
> aggregation (and thus we needed to generate only partial paths) while
> the latter indicated whether the *current* relation was doing
> partition-wise aggregation (and thus we needed to force creation of
> partially_grouped_rel).  This took me a long time to understand
> because of the way the fields were named; they didn't indicate that
> one was for the parent and one for the current relation.  Meanwhile,
> GROUPING_CAN_PARTITIONWISE_AGG indicated whether partition-wise
> aggregate should be tried at all for the current relation; there was
> no analogous indicator for the parent relation because we can't be
> processing a child at all if the parent didn't decide to do
> partition-wise aggregation.  So to know what was happening for the
> current relation you had to look at GROUPING_CAN_PARTITIONWISE_AGG +
> extra->perform_partial_partitionwise_aggregation, and to know what was
> happening for the parent relation you just looked at
> extra->is_partial_aggregation.  With this proposed refactoring patch,
> there's just one patype value at each level, which at least to me
> seems simpler.  I tried to improve the comments somewhat, too.
>

Looks cleaner now. Thanks for refactoring it.

I have merged these changes and flattened all previous changes into the main
patch.
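
For anyone skimming the thread, a rough SQL-level picture of the two modes
those constants describe, assuming a partitioned table pt whose partition
key is column a (my illustration, not from the patch):

-- PARTITIONWISE_AGGREGATE_FULL: the GROUP BY contains the partition key,
-- so each partition can be grouped completely on its own and the results
-- simply appended.
SELECT a, count(*) FROM pt GROUP BY a;

-- PARTITIONWISE_AGGREGATE_PARTIAL: the GROUP BY does not contain the
-- partition key, so each partition can only produce partial aggregates,
-- which a Finalize Aggregate above the Append has to combine.
SELECT b, count(*) FROM pt GROUP BY b;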


>
> You have some long lines that it would be good to break, like this:
>
>  child_extra.targetList = (List *) adjust_appendrel_attrs(root,
>
> (Node *) extra->targetList,
>
> nappinfos,
>
> appinfos);
>
> If you put a newline after (List *), the formatting will come out
> nicer -- it will fit within 80 columns.  Please go through the patches
> and make these kinds of changes for lines over 80 columns where
> possible.
>

OK. Done.
I was under the impression that it is not good to split the first line. What
if, in the split version, something gets inserted between the lines while
applying a new patch or during a merge? That would result in something
unexpected. But I am not sure whether that's ever a possibility.


> I guess we'd better move the GROUPING_CAN_* constants to a header
> file, if they're going to be exposed through GroupPathExtraData.  That
> can go in some refactoring patch.
>

Yep. Moved that into 0003 where we create GroupPathExtraData.


> Is there a good reason not to use input_rel->relids as the input to
> fetch_upper_rel() in all cases, rather than just at subordinate
> levels?
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>



-- 
Jeevan Chalke
Technical Architect, Product Development
EnterpriseDB Corporation
The Enterprise PostgreSQL Company


partition-wise-agg-v23.tar.gz
Description: application/gzip


Re: MCV lists for highly skewed distributions

2018-03-22 Thread Dean Rasheed
On 19 March 2018 at 16:59, John Naylor  wrote:
> On 3/19/18, Dean Rasheed  wrote:
>> As promised, here is a new patch, with comment updates, per John and
>> Tomas' suggestions, plus the continuity correction, which seemed just
>> about worthwhile.
>
> Great. I'm happy with the behavior of the patch. I've marked it ready
> for committer.
>

Thanks for testing. I also did more testing with tables 10-20 times as
large and I was happy with the results, so I have pushed this.

Regards,
Dean



Re: [HACKERS] Secondary index access optimizations

2018-03-22 Thread Konstantin Knizhnik



On 21.03.2018 20:30, Konstantin Knizhnik wrote:



On 01.03.2018 23:15, Andres Freund wrote:

Hi,


This patch seems like quite a good improvement. One thing I've not
really looked at but am slightly concerned in passing: Are there cases
where we now would do a lot of predicate pruning work even though the
overhead of just evaluating the qual is trivial, e.g. because there's
only one row due to a pkey? The predtest code is many things but
lightning fast is not one of them.  Obviously that won't matter for
analytics queries, but I could see it hurt in OLTPish workloads...

Greetings,

Andres Freund


Hi,
I am sorry for the delay in answering.

I understand your concern and did the following experiment.
I have initialized the database in the following way:

create table base (k integer primary key, v integer);
create table part1 (check (k between 1 and 10000)) inherits (base);
create table part2 (check (k between 10001 and 20000)) inherits (base);
create index pi1 on part1(v);
create index pi2 on part2(v);
insert into part1 values (generate_series(1,10000), random()*10);
insert into part2 values (generate_series(10001,20000), random()*10);
vacuum analyze part1;
vacuum analyze part2;

Vanilla Postgres uses the following plan:

explain select * from base where k between 1 and 20000 and v = 100;
  QUERY PLAN
---
 Append  (cost=0.00..16.63 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 20000))
   ->  Index Scan using pi2 on part2  (cost=0.29..8.31 rows=1 width=8)
 Index Cond: (v = 100)
 Filter: ((k >= 1) AND (k <= 20000))
(9 rows)

Execution of this query 100000 times takes 12 seconds.
With the patch applied, the query plan changes to:

  QUERY PLAN
---
 Append  (cost=0.00..16.62 rows=3 width=8)
   ->  Seq Scan on base  (cost=0.00..0.00 rows=1 width=8)
 Filter: ((k >= 1) AND (k <= 20000) AND (v = 100))
   ->  Index Scan using pi1 on part1  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
   ->  Index Scan using pi2 on part2  (cost=0.29..8.30 rows=1 width=8)
 Index Cond: (v = 100)
(7 rows)

Elapsed time of 100000 query executions is 13 seconds.
So you were right that the increased query optimization time exceeds
the advantage of eliminating the extra checks.

But this is true only if we are not using prepared statements.
With prepared statements the results are the following:

Vanilla:   0m3.915s
This patch: 0m3.563s

So the improvement is not so large, but it exists.
If somebody wants to repeat my experiments, I attached to this mail a
small shell script which I used to run the query several times.

The non-prepared query is launched using the following command:

time ./repeat-query.sh "select * from base where k between 1 and 20000
and v = 100" 100000


and the prepared query:

time ./repeat-query.sh "execute select100" 100000 "prepare select100
as select * from base where k between 1 and 20000 and v = 100"


And once again I want to note that using prepared statements can
increase performance almost 3 times!
Since using prepared statements is not always possible, I want to
recall the autoprepare patch waiting in the commitfest:


https://www.postgresql.org/message-id/flat/8eed9c23-19ba-5404-7a9e-0584b836b3f3%40postgrespro.ru#8eed9c23-19ba-5404-7a9e-0584b836b...@postgrespro.ru

--
Konstantin Knizhnik
Postgres Professional:http://www.postgrespro.com
The Russian Postgres Company

Please find attached a rebased version of the patch.
I also did more testing, now using pgbench.
The script for initializing the database and the custom script for
pgbench are attached.

The results on my computer are the following:

pgbench options      Vanilla    Patch
-c 1                   9208      8289
-c 1 -M prepared      38503     41206
-c 10                 39224     34040
-c 10 -M prepared    165465    172874






--
Konstantin Knizhnik
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

diff --git a/contrib/postgres_fdw/expected/postgres_fdw.out b/contrib/postgres_fdw/expected/postgres_fdw.out
index a2b1384..84b8543 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -631,12 +631,12 @@ EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NULL;-- Nu
Remote SQL: SELECT "C 1", c2, c3, c4, c5, c6, c7, c8 FROM "S 1"."T 1" WHERE (("C 1" IS NULL))
 (3 rows)
 
-EXPLAIN (VERBOSE, COSTS OFF) SELECT * FROM ft1 t1 WHERE c1 IS NOT NULL;-- NullTest
- QUERY PLAN  

Re: ON CONFLICT DO UPDATE for partitioned tables

2018-03-22 Thread Amit Langote
Fujita-san, Pavan,

Thank you both for reviewing.  Replying to both emails here.

On 2018/03/20 20:53, Etsuro Fujita wrote:
> Here are comments on executor changes in (the latest version of) the patch:
> 
> @@ -421,8 +424,18 @@ ExecInsert(ModifyTableState *mtstate,
>  ItemPointerData conflictTid;
>  bool    specConflict;
>  List   *arbiterIndexes;
> +    PartitionTupleRouting *proute =
> +    mtstate->mt_partition_tuple_routing;
> 
> -    arbiterIndexes = node->arbiterIndexes;
> +    /* Use the appropriate list of arbiter indexes. */
> +    if (mtstate->mt_partition_tuple_routing != NULL)
> +    {
> +    Assert(partition_index >= 0 && proute != NULL);
> +    arbiterIndexes =
> +    proute->partition_arbiter_indexes[partition_index];
> +    }
> +    else
> +    arbiterIndexes = node->arbiterIndexes;
> 
> To handle both cases the same way, I wonder if it would be better to have
> the arbiterindexes list in ResultRelInfo as well, as mentioned by Alvaro
> upthread, or to re-add mt_arbiterindexes as before and set it to
> proute->partition_arbiter_indexes[partition_index] before we get here,
> maybe in ExecPrepareTupleRouting, in the case of tuple routing.

It's a good idea.  I somehow missed that Alvaro had already mentioned it.

In HEAD, we now have ri_onConflictSetProj and ri_onConflictSetWhere.  I
propose we name the field ri_onConflictArbiterIndexes as done in the
updated patch.

>  ExecOnConflictUpdate(ModifyTableState *mtstate,
>   ResultRelInfo *resultRelInfo,
> + TupleDesc onConflictSetTupdesc,
>   ItemPointer conflictTid,
>   TupleTableSlot *planSlot,
>   TupleTableSlot *excludedSlot,
> @@ -1419,6 +1459,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
>  ExecCheckHeapTupleVisible(estate, , buffer);
> 
>  /* Store target's existing tuple in the state's dedicated slot */
> +    ExecSetSlotDescriptor(mtstate->mt_existing, RelationGetDescr(relation));
>  ExecStoreTuple(, mtstate->mt_existing, buffer, false);
> 
>  /*
> @@ -1462,6 +1503,7 @@ ExecOnConflictUpdate(ModifyTableState *mtstate,
>  }
> 
>  /* Project the new tuple version */
> +    ExecSetSlotDescriptor(mtstate->mt_conflproj, onConflictSetTupdesc);
>  ExecProject(resultRelInfo->ri_onConflictSetProj);
> 
> Can we do ExecSetSlotDescriptor for mtstate->mt_existing and
> mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple
> routing?  That would make the API changes to ExecOnConflictUpdate
> unnecessary.

That's a good idea too, so done.

> 
> @@ -2368,9 +2419,13 @@ ExecInitModifyTable(ModifyTable *node, EState
> *estate, int eflags)
>  econtext = mtstate->ps.ps_ExprContext;
>  relationDesc = resultRelInfo->ri_RelationDesc->rd_att;
> 
> -    /* initialize slot for the existing tuple */
> -    mtstate->mt_existing =
> -    ExecInitExtraTupleSlot(mtstate->ps.state, relationDesc);
> +    /*
> + * Initialize slot for the existing tuple.  We determine which
> + * tupleDesc to use for this after we have determined which relation
> + * the insert/update will be applied to, possibly after performing
> + * tuple routing.
> + */
> +    mtstate->mt_existing = ExecInitExtraTupleSlot(mtstate->ps.state,
> NULL);
> 
>  /* carried forward solely for the benefit of explain */
>  mtstate->mt_excludedtlist = node->exclRelTlist;
> @@ -2378,8 +2433,16 @@ ExecInitModifyTable(ModifyTable *node, EState
> *estate, int eflags)
>  /* create target slot for UPDATE SET projection */
>  tupDesc = ExecTypeFromTL((List *) node->onConflictSet,
>   relationDesc->tdhasoid);
> +    PinTupleDesc(tupDesc);
> +    mtstate->mt_conflproj_tupdesc = tupDesc;
> +
> +    /*
> + * Just like the "existing tuple" slot, we'll defer deciding which
> + * tupleDesc to use for this slot to a point where tuple routing has
> + * been performed.
> + */
>  mtstate->mt_conflproj =
> -    ExecInitExtraTupleSlot(mtstate->ps.state, tupDesc);
> +    ExecInitExtraTupleSlot(mtstate->ps.state, NULL);
> 
> If we do ExecInitExtraTupleSlot for mtstate->mt_existing and
> mtstate->mt_conflproj in ExecPrepareTupleRouting in the case of tuple
> routing, as said above, we wouldn't need this changes.  I think doing that
> only in the case of tuple routing and keeping this as-is would be better
> because that would save cycles in the normal case.

Hmm, I think we shouldn't be doing ExecInitExtraTupleSlot in
ExecPrepareTupleRouting, because we shouldn't have more than one instance
of mtstate->mt_existing and mtstate->mt_conflproj slots.

As you also said above, I think you meant to 

Re: pg_stat_statements HLD for futur developments

2018-03-22 Thread Fabien COELHO


Hello,


As a new user of PostgreSQL, I have started using pg_stat_statements, and
was pleased but a little surprised:

First of all, the normalized form of the query string makes it impossible to
be used in EXPLAIN commands.


Yes, because of the normalization.


Second, normalized constants and parameters values where missing to be able
to test optimizations results manually with EXPLAIN.


The normalization is entirely voluntary. Otherwise, probably every query 
would generate a distinct entry, which for a high load system would be a 
very bad idea, and there would be no point to the extension.
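
Concretely, the stored entries look roughly like this (a sketch; I am only 
relying on the queryid, calls and query columns of the view here):

-- Constants are replaced by $1, $2, ... placeholders in the stored text,
-- so all executions of the same query shape share a single entry, e.g.
-- the query column may read:  SELECT * FROM t WHERE id = $1
SELECT queryid, calls, query
FROM pg_stat_statements
ORDER BY calls DESC;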


Note that with recent pg you can reuse the query with a PREPARE, and 
then EXPLAIN on the EXECUTE.


  PREPARE foo(INT) AS ;
  EXPLAIN EXECUTE foo(5432);
  EXPLAIN EXECUTE foo(2345);
  DEALLOCATE foo;


Third, execution plan was not available (making usage of AUTO_EXPLAIN


Yep, but ISTM that the plan might differ depending on the actual values, 
so it would not make much sense to keep it anyway.


--
Fabien.



Re: WIP: a way forward on bootstrap data

2018-03-22 Thread John Naylor
On 3/21/18, Tom Lane  wrote:
> John Naylor  writes:
>> [ v11-bootstrap-data-conversion.tar.gz ]
>
> I've done a round of review work on this, focusing on the Makefile
> infrastructure.  I found a bunch of problems with parallel builds and
> VPATH builds, which are addressed in the attached incremental patch.
>
[explanation of Make issues and stamp files]

> In the attached, I've also done some more work on the README file
> and cleaned up a few other little things.

Thanks for pulling my attempt at makefile hackery across the finish
line. It sounds like there are no more obvious structural issues
remaining (fingers crossed). Your other improvements make sense.

>  I did note a bit of distressing inconsistency in the formatting of
> the catalog struct declarations, some of which predates this patch but it
> seems like you've introduced more.  I think what we ought to standardize
> on is a format similar to this in pg_opclass.h:
>
> CATALOG(pg_opclass,2616)
> {
> /* index access method opclass is for */
> Oid opcmethod   BKI_LOOKUP(pg_am);
>
[snip]

That is the most sensible format. Did you mean all 62 catalog headers
for future-proofing, or just the ones with annotations now?

> The former convention used in some places, of field descriptions in
> same-line comments, clearly won't work anymore if we're sticking
> BKI_DEFAULT annotations there.

Yeah.

> I also don't like the format used in, eg,
> pg_aggregate.h of putting field descriptions in a separate comment block
> before the struct proper.  Bitter experience has shown that there are a
> lot of people on this project who won't update comments that are more than
> about two lines away from the code they change; so the style in
> pg_aggregate.h is just inviting maintenance oversights.

Okay.

> I've got mixed feelings about the whitespace lines between fields.  They
> seem like they are mostly bulking up the code and we could do without 'em.
> On the other hand, pgindent will insist on putting one before any
> multi-line field comment, and so that would create inconsistent formatting
> if we don't use 'em elsewhere.  Thoughts?

I'll do it both ways for one header and post the results for people to look at.

> Speaking of pgindent, those prettily aligned BKI annotations are a waste
> of effort, because when pgindent gets done with the code it will look
> like
>
[snip]
>   regproc aggmtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);
>   regproc aggminvtransfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);
>   regproc aggmfinalfn BKI_DEFAULT(-) BKI_LOOKUP(pg_proc);

> I'm not sure if there's anything much to be done about this.  BKI_DEFAULT
> isn't so bad, but additional annotations get unreadable fast.  Maybe
> BKI_LOOKUP was a bad idea after all, and we should just invent more
> Oid-equivalent typedef names.

Well, until version 7, I used "fake" type aliases that only genbki.pl
could see. The C compiler and postgres.bki could only see the actual
Oid/oid type. Perhaps it was a mistake to model their appearance after
regproc (regtype, etc), because that was misleading. Maybe something
more obviously transient like 'lookup_typeoid'? I'm leaning towards
this idea.

Another possibility is to teach the pgindent pre_/post_indent
functions to preserve annotation formatting, but I'd rather not add
yet another regex to that script. Plus, over the next 10+ years, I
could see people adding several more BKI_* macros, leading to
readability issues regardless of formatting, so maybe we should nip
this one in the bud.

> The attached is just one incremental patch on top of your v11 series.
> I couldn't think of an easy way to migrate the changes back into the
> most relevant diffs of your series, so I didn't try.

I've done that quite a few times while developing this patch series,
so I'm used to it. I'll incorporate your changes soon and also rebase
over the new pg_class column that landed recently. I'll have a new
version by this weekend, assuming we conclude the formatting
discussion, so if you or others have any more comments by then, I'll
include them.

-John Naylor



Re: Hash join in SELECT target list expression keeps consuming memory

2018-03-22 Thread Jaime Soler
Right now we are purging old LO objects because our production system ran
out of memory.

Mem: 41154296k total, 40797560k used, 356736k free, 15748k buffers
Swap: 16777208k total, 1333260k used, 15443948k free, 35304844k cached

SELECT count(*) FROM pg_largeobject;
  count
----------
 52614842
(1 row)

SELECT pg_size_pretty(pg_table_size('pg_largeobject'));
 pg_size_pretty
----------------
 15 GB
(1 row)

Regards


2018-03-21 16:51 GMT+01:00 Tom Lane :

> Tomas Vondra  writes:
> > On 03/21/2018 02:18 PM, Jaime Soler wrote:
> >> We still get out of memory error during pg_dump execution
> >> pg_dump: reading large objects
> >> out of memory
>
> > Hmmm ... that likely happens because of this for loop copying a lot of
> > data:
> > https://github.com/postgres/postgres/blob/master/src/bin/pg_dump/pg_dump.c#L3258
>
> The long and the short of it is that too many large objects *will*
> choke pg_dump; this has been obvious since we decided to let it treat
> large objects as heavyweight objects.  See eg
>
> https://www.postgresql.org/message-id/29613.1476969...@sss.pgh.pa.us
>
> I don't think there's any simple fix available.  We discussed some
> possible solutions in
>
> https://www.postgresql.org/message-id/flat/5539483B.3040401%40commandprompt.com
>
> but none of them looked easy.  The best short-term answer is "run
> pg_dump in a less memory-constrained system".
>
> regards, tom lane
>
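
For reference, the loop linked above looks roughly like this (paraphrased
from getBlobs() in pg_dump.c and trimmed down): one in-memory DumpableObject
is kept per large object, so with ~52 million LOs the metadata alone can
easily run to gigabytes before any data is written out.

	binfo = (BlobInfo *) pg_malloc(ntups * sizeof(BlobInfo));

	for (i = 0; i < ntups; i++)
	{
		binfo[i].dobj.objType = DO_BLOB;
		binfo[i].dobj.catId.oid = atooid(PQgetvalue(res, i, i_oid));
		AssignDumpId(&binfo[i].dobj);

		binfo[i].dobj.name = pg_strdup(PQgetvalue(res, i, i_oid));
		binfo[i].rolname = pg_strdup(PQgetvalue(res, i, i_lomowner));

		/* ACL and comment strings follow, each its own allocation */
	}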


Re: Removing useless DISTINCT clauses

2018-03-22 Thread Laurenz Albe
Melanie Plageman wrote:
> Contents & Purpose
> ==
> 
> This patch removes any additional columns in the DISTINCT clause when the
> table's primary key columns are entirely present in the DISTINCT clause. This
> optimization works because the PK columns functionally determine the other
> columns in the relation. The patch includes regression test cases.

Would it be very difficult to extend that to "if any unique constraints are
contained in the DISTINCT clause"?

Yours,
Laurenz Albe



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Simon Riggs
On 21 March 2018 at 12:23, Pavan Deolasee  wrote:

> Fixed

Looks like this completes all outstanding items with the MERGE code.

I'm now proposing that I move to commit this, following my own final
review, on Tues 27 Mar, in 5 days' time, leaving room for cleanup of
related issues.

If there are any items you believe are still open, please say so now
or mention any other objections you have.

Thanks for all of your detailed comments,

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] MERGE SQL Statement for PG11

2018-03-22 Thread Simon Riggs
On 21 March 2018 at 19:45, Peter Geoghegan  wrote:

>>> * We never actually get around to saying that MERGE is good with bulk
>>> loading, ETL, and so on. I think that we should remark on that in
>>> passing.
>>
>>
>> Suggestion?
>
> How about adding this sentence after "MERGE ... a task that would
> otherwise require multiple procedural language statements":
>
> MERGE can synchronize two tables by modifying one table based on
> differences between it and some other table.
>
> The point here is that we're primarily talking about two whole tables.
> That deserves such prominent placement, as that suggests where users
> might really find MERGE useful, but without being too prescriptive.

The information I have is that many people are expecting MERGE to work
for OLTP since that is how it is used in other databases, not solely
as an ETL command.

So we're not primarily talking about two whole tables.

> Also, instead of saying "There are a variety of differences and
> restrictions between the two statement types [MERGE and INSERT ... ON
> CONFLICT DO UPDATE] and they are not interchangeable", you could
> instead be specific, and say:
>
> MERGE is well suited to synchronizing two tables using multiple
> complex conditions. Using INSERT with ON CONFLICT DO UPDATE works well
> when requirements are simpler. Only ON CONFLICT provides an atomic
> INSERT or UPDATE outcome in READ COMMITTED mode.
>
> BTW, the docs should be clear on the fact that "INSERT ... ON
> CONFLICT" isn't a statement. INSERT is. ON CONFLICT is a clause.

I think it would be better if you wrote a separate additional doc
patch to explain all of this, perhaps in the Performance Tips section
or elsewhere.

-- 
Simon Riggs                http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [PoC PATCH] Parallel dump to /dev/null

2018-03-22 Thread Michael Banck
Hi,

Am Donnerstag, den 22.03.2018, 10:33 +0900 schrieb Michael Paquier:
> Hence, is there any objection to mark this patch as returned with
> feedback?  

I've done so now.


Cheers,

Michael

-- 
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax:  +49 2166 9901-100
Email: michael.ba...@credativ.de

credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer



Re: pgbench randomness initialization

2018-03-22 Thread Fabien COELHO



Patch doesn't apply cleanly anymore.


Indeed. Here is a rebase. All pgbench patches conflict on the test cases.

--
Fabien.

diff --git a/doc/src/sgml/ref/pgbench.sgml b/doc/src/sgml/ref/pgbench.sgml
index f07ddf1..e4582bf 100644
--- a/doc/src/sgml/ref/pgbench.sgml
+++ b/doc/src/sgml/ref/pgbench.sgml
@@ -680,6 +680,43 @@ pgbench  options  d
  
 
  
+  --random-seed=SEED
+  
+   
+Set random generator seed.  Seeds the system random number generator,
+which then produces a sequence of initial generator states, one for
+each thread.
+Values for SEED may be:
+time (the default, the seed is based on the current time),
+rand (use a strong random source, failing if none
+is available), or an unsigned decimal integer value.
+The random generator is invoked explicitly from a pgbench script
+(random... functions) or implicitly (for instance option
+--rate uses it to schedule transactions).
+When explicitly set, the value used for seeding is shown on the terminal.
+Any value allowed for SEED may also be
+provided through the environment variable
+PGBENCH_RANDOM_SEED.
+To ensure that the provided seed impacts all possible uses, put this option
+first or use the environment variable.
+  
+  
+Setting the seed explicitly allows to reproduce a pgbench
+run exactly, as far as random numbers are concerned.
+As the random state is managed per thread, this means the exact same
+pgbench run for an identical invocation if there is one
+client per thread and there are no external or data dependencies.
+From a statistical viewpoint reproducing runs exactly is a bad idea because
+it can hide the performance variability or improve performance unduly,
+e.g. by hitting the same pages as a previous run.
+However, it may also be of great help for debugging, for instance
+re-running a tricky case which leads to an error.
+Use wisely.
+   
+  
+ 
+
+ 
   --sampling-rate=rate
   

@@ -884,6 +921,11 @@ pgbench  options  d
   
 
   
+random_seed 
+   random generator seed (unless overwritten with -D)
+  
+
+  
 scale 
current scale factor
   
diff --git a/src/bin/pgbench/pgbench.c b/src/bin/pgbench/pgbench.c
index a15aa06..8397d25 100644
--- a/src/bin/pgbench/pgbench.c
+++ b/src/bin/pgbench/pgbench.c
@@ -154,6 +154,9 @@ int64   latency_limit = 0;
 char  *tablespace = NULL;
 char  *index_tablespace = NULL;
 
+/* random seed used when calling srandom() */
+int64 random_seed = -1;
+
 /*
  * end of configurable parameters
  */
@@ -569,6 +572,7 @@ usage(void)
   "  --log-prefix=PREFIX  prefix for transaction time log file\n"
   "   (default: \"pgbench_log\")\n"
   "  --progress-timestamp use Unix epoch timestamps for progress\n"
+  "  --random-seed=SEED   set random seed (\"time\", \"rand\", integer)\n"
   "  --sampling-rate=NUM  fraction of transactions to log (e.g., 0.01 for 1%%)\n"
   "\nCommon options:\n"
   "  -d, --debug  print debugging output\n"
@@ -4433,6 +4437,49 @@ printResults(TState *threads, StatsData *total, instr_time total_time,
}
 }
 
+/* call srandom based on some seed. NULL triggers the default behavior. */
+static void
+set_random_seed(const char *seed, const char *origin)
+{
+   /* srandom expects an unsigned int */
+   unsigned int iseed;
+
+   if (seed == NULL || strcmp(seed, "time") == 0)
+   {
+   /* rely on current time */
+   instr_time  now;
+   INSTR_TIME_SET_CURRENT(now);
+   iseed = (unsigned int) INSTR_TIME_GET_MICROSEC(now);
+   }
+   else if (strcmp(seed, "rand") == 0)
+   {
+   /* use some "strong" random source */
+   if (!pg_strong_random(&iseed, sizeof(iseed)))
+   {
+   fprintf(stderr, "cannot seed random from a strong source\n");
+   exit(1);
+   }
+   }
+   else
+   {
+   /* parse seed unsigned int value */
+   char garbage;
+   if (sscanf(seed, "%u%c", &iseed, &garbage) != 1)
+   {
+   fprintf(stderr,
+   "error while scanning '%s' from %s, expecting an unsigned integer, 'time' or 'rand'\n",
+   seed, origin);
+   exit(1);
+   }
+   }
+
+   if (seed != NULL)
+   fprintf(stderr, "setting random seed to %u\n", iseed);
+   srandom(iseed);
+   /* no precision 

Re: Faster inserts with mostly-monotonically increasing values

2018-03-22 Thread Pavan Deolasee
On Thu, Mar 22, 2018 at 7:22 AM, Claudio Freire wrote:

>
> If you're not planning on making any further changes, would you mind
> posting a coalesced patch?
>
> Notice that I removed the last offset thing only halfway, so it would
> need some cleanup.
>

Here is an updated patch. I removed the last offset caching thing
completely and integrated your changes for conditional lock access. There are
some other improvements to the test cases too. I realised that we must truncate
and re-insert to test the index fastpath correctly.
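
For anyone just joining the thread, the fastpath is shaped roughly like this
(a simplified sketch rather than the code in the attached patch; itemsz and
the final ordering check are placeholders):

	BlockNumber cached = RelationGetTargetBlock(rel);

	if (cached != InvalidBlockNumber)
	{
		Buffer		buf = ReadBuffer(rel, cached);

		/* never wait for the lock; contention means the fastpath loses */
		if (ConditionalLockBuffer(buf))
		{
			Page		page = BufferGetPage(buf);
			BTPageOpaque opaque = (BTPageOpaque) PageGetSpecialPointer(page);

			/* still the rightmost leaf, with room, and the key sorts last? */
			if (P_ISLEAF(opaque) && P_RIGHTMOST(opaque) &&
				PageGetFreeSpace(page) > itemsz &&
				new_key_sorts_after_existing_keys)	/* placeholder */
				return buf;		/* insert here, skipping the descent */

			_bt_relbuf(rel, buf);	/* drop lock and pin, take the slow path */
		}
		else
			ReleaseBuffer(buf);
	}

	/* otherwise the usual _bt_search() descent from the root */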

Thanks,
Pavan

-- 
 Pavan Deolasee   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


pg_btree_target_block_v3.patch
Description: Binary data