Re: can commitfest application resend a mail?

2018-09-10 Thread Pavel Stehule
2018-09-10 20:37 GMT+02:00 Alvaro Herrera :

> On 2018-Sep-10, Pavel Stehule wrote:
>
> > 2018-09-10 20:13 GMT+02:00 Alvaro Herrera :
> >
> > > On 2018-Sep-10, Pavel Stehule wrote:
> > >
> > > > I use web gmail, so it is not a good news. I need it for review. I
> don't
> > > > archive mails in my mailbox, and I would not to start new thread.
> > >
> > > The other option is to paste your email contents in the commitfest app,
> > > which will send it as an email with the proper email headers so that
> > > it's correctly threaded.
> >
> > I am blind. How I can do it?
>
> In a patch page, click "comment" -> "review".
>

I found it. Thank you

Pavel


> --
> Álvaro Herrera        https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Retrieve memory size allocated by libpq

2018-09-10 Thread Pavel Stehule
The following review has been posted through the commitfest application:
make installcheck-world:  tested, passed
Implements feature:   not tested
Spec compliant:   not tested
Documentation:tested, passed

The implementation of PQresultMemsize is simple and correct. There are no 
warnings, and all tests passed.

I'll mark this patch as ready for committer.

The new status of this patch is: Ready for Committer


Re: [HACKERS] Secondary index access optimizations

2018-09-10 Thread David Rowley
On 9 July 2018 at 13:26, David Rowley  wrote:
> I started looking over this patch and have a few comments:

Hi Konstantin,

I'm wondering, are you going to submit an updated patch for this
commitfest?  If not, I think we can mark this as returned with
feedback, as it has been waiting on author for quite a while now.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Michael Paquier
On Mon, Sep 10, 2018 at 06:12:52PM +0300, Sergei Kornilov wrote:
> There were already some discussions about move recovery.conf into GUC
> infrastructure first. I continue work on one most recent patch here:
> https://commitfest.postgresql.org/19/1711/ If this patch will be
> committed - i plan sent new patch to allow change primary_conninfo
> with SIGHUP. 

Nice to hear that!  Please make sure that, in the first version of the
patch, all the recovery parameters are switched to GUC_POSTMASTER so
that they match the existing behavior.  We should move a subset of those
parameters to SIGHUP only as a second step, after a careful review.
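For context, converting a recovery parameter to a postmaster-only GUC amounts to adding an entry to the appropriate ConfigureNames table in guc.c; relaxing it later to allow reload is then just a matter of changing the context field from PGC_POSTMASTER to PGC_SIGHUP. A hypothetical sketch only — the exact group, flags, and description text here are assumptions, not a committed form:

```c
/* Hypothetical guc.c entry sketch: a recovery setting exposed as a
 * postmaster-only string GUC.  Switching the second field from
 * PGC_POSTMASTER to PGC_SIGHUP is what would later make it reloadable. */
{
    {"primary_conninfo", PGC_POSTMASTER, REPLICATION_STANDBY,
        gettext_noop("Sets the connection string to be used to connect to the sending server."),
        NULL,
        GUC_SUPERUSER_ONLY
    },
    &PrimaryConnInfo,
    "",
    NULL, NULL, NULL
},
```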
--
Michael




Re: CREATE ROUTINE MAPPING

2018-09-10 Thread Corey Huinker
On Mon, Sep 10, 2018 at 3:28 AM Hannu Krosing 
wrote:

> Hi Corey
>
> Have you looked at pl/proxy ?
>

I have, a long while ago.


> It does this and then some (sharding)
>

PL/proxy isn't a part of the SQL Standard.
PL/proxy only connects to other libpq-speaking databases.
The hope with routine mapping is that other data sources that do not easily
conform to a rows-and-columns metaphor can still expose their data to
PostgreSQL.


Re: speeding up planning with partitions

2018-09-10 Thread Amit Langote
On 2018/09/11 10:11, Peter Geoghegan wrote:
> On Wed, Aug 29, 2018 at 5:06 AM, Amit Langote
>  wrote:
>> It is more or less well known that the planner doesn't perform well with
>> more than a few hundred partitions even when only a handful of partitions
>> are ultimately included in the plan.  Situation has improved a bit in PG
>> 11 where we replaced the older method of pruning partitions one-by-one
>> using constraint exclusion with a much faster method that finds relevant
>> partitions by using partitioning metadata.  However, we could only use it
>> for SELECT queries, because UPDATE/DELETE are handled by a completely
>> different code path, whose structure doesn't allow it to call the new
>> pruning module's functionality.  Actually, not being able to use the new
>> pruning is not the only problem for UPDATE/DELETE, more on which further
>> below.
> 
> This was a big problem for the SQL MERGE patch. I hope that this
> problem can be fixed.

Sorry if I've missed some prior discussion about this, but could you
clarify which aspect of UPDATE/DELETE planning got in the way of the SQL
MERGE patch?  It'd be great if you could point to an email or portion of
the SQL MERGE discussion thread where this aspect was discussed.

In the updated patch that I'm still hacking on (also need to look at
David's comments earlier today before posting it), I have managed to
eliminate the separation of code paths handling SELECT vs UPDATE/DELETE on
inheritance tables, but I can't be sure if the new approach (also) solves
any problems that were faced during SQL MERGE development.

Thanks,
Amit




Re: speeding up planning with partitions

2018-09-10 Thread Peter Geoghegan
On Wed, Aug 29, 2018 at 5:06 AM, Amit Langote
 wrote:
> It is more or less well known that the planner doesn't perform well with
> more than a few hundred partitions even when only a handful of partitions
> are ultimately included in the plan.  Situation has improved a bit in PG
> 11 where we replaced the older method of pruning partitions one-by-one
> using constraint exclusion with a much faster method that finds relevant
> partitions by using partitioning metadata.  However, we could only use it
> for SELECT queries, because UPDATE/DELETE are handled by a completely
> different code path, whose structure doesn't allow it to call the new
> pruning module's functionality.  Actually, not being able to use the new
> pruning is not the only problem for UPDATE/DELETE, more on which further
> below.

This was a big problem for the SQL MERGE patch. I hope that this
problem can be fixed.

-- 
Peter Geoghegan



Re: speeding up planning with partitions

2018-09-10 Thread David Rowley
On 30 August 2018 at 21:29, Amit Langote  wrote:
> Attached updated patches, with 0002 containing the changes mentioned above.

Here's my first pass review on this:

0001:

1. I think the following paragraph should probably mention something
about some performance difference between the two methods:

   
Constraint exclusion works in a very similar way to partition
pruning, except that it uses each table's CHECK
constraints  which gives it its name  whereas partition
pruning uses the table's partition bounds, which exist only in the
case of declarative partitioning.  Another difference is that
constraint exclusion is only applied at plan time; there is no attempt
to remove partitions at execution time.
   

Perhaps tag on: "Constraint exclusion is also a much less
efficient way of eliminating unneeded partitions, as metadata for each
partition must be loaded into the planner before constraint exclusion
can be applied.  This is not a requirement for partition pruning."

2. I think "rootrel" should be named "targetpart" in:

+ RelOptInfo *rootrel = root->simple_rel_array[root->parse->resultRelation];

3. Why does the following need to list_copy()?

+ List*saved_join_info_list = list_copy(root->join_info_list);

4. Is the "root->parse->commandType != CMD_INSERT" required in:

@@ -181,13 +185,30 @@ make_one_rel(PlannerInfo *root, List *joinlist)

  /*
  * Generate access paths for the entire join tree.
+ *
+ * If we're doing this for an UPDATE or DELETE query whose target is a
+ * partitioned table, we must do the join planning against each of its
+ * leaf partitions instead.
  */
- rel = make_rel_from_joinlist(root, joinlist);
+ if (root->parse->resultRelation &&
+ root->parse->commandType != CMD_INSERT &&
+ root->simple_rel_array[root->parse->resultRelation] &&
+ root->simple_rel_array[root->parse->resultRelation]->part_scheme)
+ {

Won't the simple_rel_array entry for the resultRelation always be NULL
for an INSERT?

5. In regards to:

+ /*
+ * Hack to make the join planning code believe that 'partrel' can
+ * be joined against.
+ */
+ partrel->reloptkind = RELOPT_BASEREL;

Have you thought about other implications of join planning for other
member rels, for example, equivalence classes and em_is_child?

6. It would be good to not have to rt_fetch the same RangeTblEntry twice here:

@@ -959,7 +969,9 @@ subquery_planner(PlannerGlobal *glob, Query *parse,
  * needs special processing, else go straight to grouping_planner.
  */
  if (parse->resultRelation &&
- rt_fetch(parse->resultRelation, parse->rtable)->inh)
+ rt_fetch(parse->resultRelation, parse->rtable)->inh &&
+ rt_fetch(parse->resultRelation, parse->rtable)->relkind !=
+ RELKIND_PARTITIONED_TABLE)
  inheritance_planner(root);

7. Why don't you just pass the Parse into the function as a parameter
instead of overwriting PlannerInfo's copy in:

+ root->parse = partition_parse;
+ partitionwise_adjust_scanjoin_target(root, child_rel,
+ subroots,
+ partitioned_rels,
+ resultRelations,
+ subpaths,
+ WCOLists,
+ returningLists,
+ rowMarks);
+ /* Restore the Query for processing the next partition. */
+ root->parse = parse;

8. Can you update the following comment to mention why you're not
calling add_paths_to_append_rel for this case?

@@ -6964,7 +7164,9 @@ apply_scanjoin_target_to_paths(PlannerInfo *root,
  }

  /* Build new paths for this relation by appending child paths. */
- if (live_children != NIL)
+ if (live_children != NIL &&
+ !(rel->reloptkind == RELOPT_BASEREL &&
+   rel->relid == root->parse->resultRelation))
  add_paths_to_append_rel(root, rel, live_children);

9. The following should use >= 0, not > 0

+ while ((relid = bms_next_member(child_rel->relids, relid)) > 0)
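To see why the boundary matters: bms_next_member() signals "no more members" by returning -2 (not -1 or 0), precisely so that 0 stays a valid member index; a `> 0` test therefore ends the loop as soon as member 0 is returned. A minimal standalone model of that convention (a single 64-bit word standing in for a real Bitmapset — an illustration only, not PostgreSQL's implementation):

```c
#include <stdint.h>

/* Simplified model of bms_next_member(): return the smallest set bit
 * greater than prevbit, or -2 when the set is exhausted.  Iteration
 * starts by passing prevbit = -1. */
static int next_member(uint64_t set, int prevbit)
{
    for (int i = prevbit + 1; i < 64; i++)
        if (set & (UINT64_C(1) << i))
            return i;
    return -2;
}

/* Buggy iteration: '> 0' stops as soon as member 0 is returned,
 * so a set containing bit 0 is never walked at all. */
static int count_members_gt(uint64_t set)
{
    int n = 0, bit = -1;

    while ((bit = next_member(set, bit)) > 0)
        n++;
    return n;
}

/* Correct iteration: '>= 0' treats 0 as a member; only the -2
 * sentinel ends the loop. */
static int count_members_ge(uint64_t set)
{
    int n = 0, bit = -1;

    while ((bit = next_member(set, bit)) >= 0)
        n++;
    return n;
}
```

With bits {0, 2, 5} set (0x25), count_members_ge() reports 3 members while count_members_gt() reports 0 — exactly the failure mode the `>= 0` fix avoids.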

0002:

10. I think moving the PlannerInfo->total_table_pages calculation
needs to be done in its own patch. This is a behavioural change where
we no longer count pruned relations in the calculation which can
result in plan changes. I posted a patch in [1] to fix this as a bug
fix as I think the current code is incorrect. My patch also updates
the first paragraph of the comment. You've not done that.

11. "pruning"

+ * And do prunin.  Note that this adds AppendRelInfo's of only the

12. It's much more efficient just to do bms_add_range() outside the loop in:

+ for (i = 0; i < rel->nparts; i++)
+ {
+ rel->part_rels[i] = build_partition_rel(root, rel,
+ rel->part_oids[i]);
+ rel->live_parts = bms_add_member(rel->live_parts, i);
+ }
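The difference can be sketched on a single machine word (again only a model — PostgreSQL's Bitmapset is a variable-length array of such words, and per-member adds may each have to grow the set, which is what makes one bms_add_range() call over [0, nparts - 1] cheaper):

```c
#include <stdint.h>

/* Single-word stand-in for a Bitmapset. */

/* One call per member, as in the quoted loop. */
static uint64_t add_member(uint64_t set, int i)
{
    return set | (UINT64_C(1) << i);
}

/* Set every bit in [lower, upper] at once, as bms_add_range() does.
 * (Assumes 0 <= lower <= upper < 63 so the shift stays well-defined.) */
static uint64_t add_range(uint64_t set, int lower, int upper)
{
    uint64_t mask = ((UINT64_C(1) << (upper - lower + 1)) - 1) << lower;

    return set | mask;
}
```

Applied to the quoted code, the loop body would keep only the build_partition_rel() call, and live_parts would be filled afterwards with a single range-add over all partition indexes.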

13. In set_append_rel_size() the foreach(l, root->append_rel_list)
loop could be made to loop over RelOptInfo->live_parts instead which
would save having to skip over AppendRelInfos that don't belong to
this parent. You'd need to make live_parts more general purpose and
also use it to mark children of inheritance parents.

14. I think you can skip the following if both are NULL. You could
likely add more smarts for different join types, but I think you
should at least skip when both are NULL. Perhaps the loop could be

Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2018-09-10 Thread Tomas Vondra
Hi,

On 07/18/2018 09:32 AM, Konstantin Knizhnik wrote:
> 
> 
> On 18.07.2018 02:58, Tomas Vondra wrote:
>> On 07/18/2018 12:41 AM, Konstantin Knizhnik wrote:
>>> ...
>>>
>>> Teodor Sigaev has proposed an alternative approach for calculating
>>> selectivity of multicolumn join or compound index search.
>>> Usually a DBA creates compound indexes, which the optimizer can use to
>>> build an efficient query execution plan based on index search.
>>> We can store statistics for the compound keys of such indexes (as is
>>> done now for functional indexes) and use them to estimate the selectivity
>>> of clauses. I have implemented this idea and will publish a patch in a
>>> separate thread soon.
>>> Now I just want to share some results for the Tomas examples.
>>>
>>> So for vanilla Postgres without extra statistics, the estimated number
>>> of rows is about 4 times smaller than the real one.
>>>
>> Can you please post plans with parallelism disabled, and perhaps without
>> the aggregate? Both makes reading the plans unnecessarily difficult ...
> 
> 
> Sorry, below are plans with disabled parallel execution on simpler
> query(a=1 and b=1):
> 
> explain analyze SELECT count(*) FROM foo WHERE a=1 and b=1;
> 
> 
> 
> Vanilla:
> 
>  Aggregate  (cost=11035.86..11035.87 rows=1 width=8) (actual
> time=22.746..22.746 rows=1 loops=1)
>    ->  Bitmap Heap Scan on foo  (cost=291.35..11001.97 rows=13553
> width=0) (actual time=9.055..18.711 rows=5 loops=1)
>  Recheck Cond: ((a = 1) AND (b = 1))
>  Heap Blocks: exact=222
>  ->  Bitmap Index Scan on foo_a_b_idx  (cost=0.00..287.96
> rows=13553 width=0) (actual time=9.005..9.005 rows=5 loops=1)
>    Index Cond: ((a = 1) AND (b = 1))
> 
> 
> --
> 
> Vanilla + extra statistic (create statistics ab on a,b from foo):
> 
>  Aggregate  (cost=12693.35..12693.36 rows=1 width=8) (actual
> time=22.747..22.748 rows=1 loops=1)
>    ->  Bitmap Heap Scan on foo  (cost=1490.08..12518.31 rows=70015
> width=0) (actual time=9.399..18.636 rows=5 loops=1)
>  Recheck Cond: ((a = 1) AND (b = 1))
>  Heap Blocks: exact=222
>  ->  Bitmap Index Scan on foo_a_b_idx (cost=0.00..1472.58
> rows=70015 width=0) (actual time=9.341..9.341 rows=5 loops=1)
>    Index Cond: ((a = 1) AND (b = 1))
> 
> --
> 
> Multicolumn index statistic:
> 
>  Aggregate  (cost=11946.35..11946.36 rows=1 width=8) (actual
> time=25.117..25.117 rows=1 loops=1)
>    ->  Bitmap Heap Scan on foo  (cost=1080.47..11819.51 rows=50736
> width=0) (actual time=11.568..21.362 rows=5 loops=1)
>  Recheck Cond: ((a = 1) AND (b = 1))
>  Heap Blocks: exact=222
>  ->  Bitmap Index Scan on foo_a_b_idx (cost=0.00..1067.79
> rows=50736 width=0) (actual time=11.300..11.300 rows=5 loops=1)
>    Index Cond: ((a = 1) AND (b = 1))
> 

I wonder what happened to this alternative approach, relying on stats
from multicolumn indexes ...

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Index Skip Scan

2018-09-10 Thread Dmitry Dolgov
> On Mon, 18 Jun 2018 at 17:26, Jesper Pedersen  
> wrote:
>
> I took Thomas Munro's previous patch [2] on the subject, added a GUC, a
> test case, documentation hooks, minor code cleanups, and made the patch
> pass an --enable-cassert make check-world run. So, the overall design is
> the same.

I've looked through the patch more closely, and have a few questions:

* Is there any reason why only the copy function for the IndexOnlyScan node
  includes an implementation for distinctPrefix? Without read/out functionality,
  skip doesn't work for parallel scans, and the plan looks like this:

=# SET force_parallel_mode TO ON;
=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
  QUERY PLAN

---
 Gather  (cost=1000.43..1001.60 rows=3 width=4)
(actual time=11.054..17672.010 rows=1000 loops=1)
   Output: b
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=91035 read=167369
   ->  Index Skip Scan using idx_t1_b on public.t1
   (cost=0.43..1.30 rows=3 width=4)
   (actual time=1.350..16065.165 rows=1000 loops=1)
 Output: b
 Heap Fetches: 1000
 Buffers: shared hit=91035 read=167369
 Worker 0: actual time=1.350..16065.165 rows=1000 loops=1
   Buffers: shared hit=91035 read=167369
 Planning Time: 0.394 ms
 Execution Time: 6037.800 ms

  and with this functionality it gets better:

=# SET force_parallel_mode TO ON;
=# EXPLAIN (ANALYZE, VERBOSE, BUFFERS ON) SELECT DISTINCT b FROM t1;
QUERY PLAN

---
 Gather  (cost=1000.43..1001.60 rows=3 width=4)
(actual time=3.564..4.427 rows=3 loops=1)
   Output: b
   Workers Planned: 1
   Workers Launched: 1
   Single Copy: true
   Buffers: shared hit=4 read=10
   ->  Index Skip Scan using idx_t1_b on public.t1
   (cost=0.43..1.30 rows=3 width=4)
   (actual time=0.065..0.133 rows=3 loops=1)
 Output: b
 Heap Fetches: 3
 Buffers: shared hit=4 read=10
 Worker 0: actual time=0.065..0.133 rows=3 loops=1
   Buffers: shared hit=4 read=10
 Planning Time: 1.724 ms
 Execution Time: 4.522 ms

* What is the purpose of HeapFetches? I don't see any usage of this variable
  except assigning 0 to it once.



Re: can commitfest application resend a mail?

2018-09-10 Thread Alvaro Herrera
On 2018-Sep-10, Pavel Stehule wrote:

> 2018-09-10 20:13 GMT+02:00 Alvaro Herrera :
> 
> > On 2018-Sep-10, Pavel Stehule wrote:
> >
> > > I use web gmail, so it is not a good news. I need it for review. I don't
> > > archive mails in my mailbox, and I would not to start new thread.
> >
> > The other option is to paste your email contents in the commitfest app,
> > which will send it as an email with the proper email headers so that
> > it's correctly threaded.
> 
> I am blind. How I can do it?

In a patch page, click "comment" -> "review".

-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: pg_stat_statements query jumbling question

2018-09-10 Thread legrand legrand
Yes, I

Regards
PAscal



--
Sent from: http://www.postgresql-archive.org/PostgreSQL-hackers-f1928748.html



Re: StandbyAcquireAccessExclusiveLock doesn't necessarily

2018-09-10 Thread Robert Haas
On Fri, Sep 7, 2018 at 6:37 PM, Tom Lane  wrote:
> So my inclination is to remove the reportMemoryError = false parameter,
> and just let an error happen in the unlikely situation that we hit OOM
> for the lock table.

Wouldn't that take down the entire cluster with no restart?

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: can commitfest application resend a mail?

2018-09-10 Thread Pavel Stehule
2018-09-10 20:13 GMT+02:00 Alvaro Herrera :

> On 2018-Sep-10, Pavel Stehule wrote:
>
> > I use web gmail, so it is not a good news. I need it for review. I don't
> > archive mails in my mailbox, and I would not to start new thread.
>
> The other option is to paste your email contents in the commitfest app,
> which will send it as an email with the proper email headers so that
> it's correctly threaded.
>

I am blind. How can I do it?

Pavel


> --
> Álvaro Herrera        https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: can commitfest application resend a mail?

2018-09-10 Thread Alvaro Herrera
On 2018-Sep-10, Pavel Stehule wrote:

> I use web gmail, so it is not a good news. I need it for review. I don't
> archive mails in my mailbox, and I would not to start new thread.

The other option is to paste your email contents in the commitfest app,
which will send it as an email with the proper email headers so that
it's correctly threaded.

-- 
Álvaro Herrera        https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: can commitfest application resend a mail?

2018-09-10 Thread Pavel Stehule
2018-09-10 20:01 GMT+02:00 Magnus Hagander :

> On Mon, Sep 10, 2018 at 10:53 AM, Pavel Stehule 
> wrote:
>
>> Hi
>>
>> I cannot to find a page, where I can resend a mail from mailing list
>> archive.
>>
>> Can do it the commitfest application? It can much more comfortable, than
>> searching it via majordomo.
>>
>> p.s. How I can do it with current version of PostgreSQL mailing list
>> archive?
>>
>>
> It currently can not, but it's on the list of "things we'd like to have".
>
> The closest you can get today is probably to pick the thread in the
> archives and then download the thread as mbox. If your MUA can deal with
> mbox:es, that is (which is a non-solution for people using web based cloud
> mail, we're well aware and this is why we still want the functionality for
> resend)
>

I use web Gmail, so that is not good news. I need it for review: I don't
archive mails in my mailbox, and I would rather not start a new thread.

Regards

Pavel


> --
>  Magnus Hagander
>  Me: https://www.hagander.net/ 
>  Work: https://www.redpill-linpro.com/ 
>


Re: can commitfest application resend a mail?

2018-09-10 Thread Magnus Hagander
On Mon, Sep 10, 2018 at 10:53 AM, Pavel Stehule 
wrote:

> Hi
>
> I cannot to find a page, where I can resend a mail from mailing list
> archive.
>
> Can do it the commitfest application? It can much more comfortable, than
> searching it via majordomo.
>
> p.s. How I can do it with current version of PostgreSQL mailing list
> archive?
>
>
It currently cannot, but it's on the list of "things we'd like to have".

The closest you can get today is probably to pick the thread in the archives
and then download the thread as mbox. If your MUA can deal with mboxes,
that is (which is a non-solution for people using web-based cloud mail; we're
well aware, and this is why we still want the resend functionality).

-- 
 Magnus Hagander
 Me: https://www.hagander.net/ 
 Work: https://www.redpill-linpro.com/ 


Re: warnings from parser?

2018-09-10 Thread Pavel Stehule
2018-09-10 19:39 GMT+02:00 Pavel Stehule :

>
>
> 2018-09-10 19:38 GMT+02:00 Andrew Gierth :
>
>> > "Pavel" == Pavel Stehule  writes:
>>
>>  >> gram.y: In function ‘base_yyparse’:
>>  >> gram.y:4507:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
>>  >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>>  >> n-> collClause = $7;
>>  >> ^
>>  >> gram.y:4519:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
>>  >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>>  >> n-> collClause = $10;
>>  >> ^
>>
>> That's not the text that appears on those lines of gram.y in current git
>> master. What are you trying to compile?
>>
>>  Pavel> /usr/bin/bison -Wno-deprecated  -d -o preproc.c preproc.y
>>  Pavel> preproc.y:4744.2: warning: empty rule for typed nonterminal, and
>> no action
>>
>> preproc.y is a generated file, created from gram.y and other files by a
>> perl script.
>>
>> So I'm guessing you're using a modified gram.y and everything else is
>> fallout from an error there.
>>
>>
> maybe my git repository is broken
>
> I'll recheck it.
>
> Thank you
>

It was from my branch.

I am sorry for the noise.

Regards

Pavel


> Pavel
>
>> --
>> Andrew (irc:RhodiumToad)
>>
>
>


can commitfest application resend a mail?

2018-09-10 Thread Pavel Stehule
Hi

I cannot find a page where I can resend a mail from the mailing list
archive.

Can the commitfest application do it? It would be much more comfortable
than searching via majordomo.

P.S. How can I do it with the current version of the PostgreSQL mailing
list archive?

Regards

Pavel


Re: warnings from parser?

2018-09-10 Thread Pavel Stehule
2018-09-10 19:38 GMT+02:00 Andrew Gierth :

> > "Pavel" == Pavel Stehule  writes:
>
>  >> gram.y: In function ‘base_yyparse’:
>  >> gram.y:4507:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
>  >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>  >> n-> collClause = $7;
>  >> ^
>  >> gram.y:4519:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
>  >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>  >> n-> collClause = $10;
>  >> ^
>
> That's not the text that appears on those lines of gram.y in current git
> master. What are you trying to compile?
>
>  Pavel> /usr/bin/bison -Wno-deprecated  -d -o preproc.c preproc.y
>  Pavel> preproc.y:4744.2: warning: empty rule for typed nonterminal, and
> no action
>
> preproc.y is a generated file, created from gram.y and other files by a
> perl script.
>
> So I'm guessing you're using a modified gram.y and everything else is
> fallout from an error there.
>
>
maybe my git repository is broken

I'll recheck it.

Thank you

Pavel

> --
> Andrew (irc:RhodiumToad)
>


Re: warnings from parser?

2018-09-10 Thread Andrew Gierth
> "Pavel" == Pavel Stehule  writes:

 >> gram.y: In function ‘base_yyparse’:
 >> gram.y:4507:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
 >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
 >> n-> collClause = $7;
 >> ^
 >> gram.y:4519:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
 >> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
 >> n-> collClause = $10;
 >> ^

That's not the text that appears on those lines of gram.y in current git
master. What are you trying to compile?

 Pavel> /usr/bin/bison -Wno-deprecated  -d -o preproc.c preproc.y
 Pavel> preproc.y:4744.2: warning: empty rule for typed nonterminal, and no 
action

preproc.y is a generated file, created from gram.y and other files by a
perl script.

So I'm guessing you're using a modified gram.y and everything else is
fallout from an error there.

-- 
Andrew (irc:RhodiumToad)



Re: warnings from parser?

2018-09-10 Thread Pavel Stehule
2018-09-10 19:23 GMT+02:00 Pavel Stehule :

> Hi
>
> I got new warnings
>
> gram.y: In function ‘base_yyparse’:
> gram.y:4507:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>   n->collClause = $7;
> ^
> gram.y:4519:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
> CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
>   n->collClause = $10;
> ^
>


/usr/bin/bison -Wno-deprecated  -d -o preproc.c preproc.y
preproc.y:4744.2: warning: empty rule for typed nonterminal, and no action
[-Wother]
 |  AlterExtensionStmt:
  ^



>
> Regards
>
> Pavel
>


warnings from parser?

2018-09-10 Thread Pavel Stehule
Hi

I got new warnings

gram.y: In function ‘base_yyparse’:
gram.y:4507:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
  n->collClause = $7;
^
gram.y:4519:20: warning: assignment to ‘CollateClause *’ {aka ‘struct
CollateClause *’} from incompatible pointer type ‘Node *’ {aka ‘>
  n->collClause = $10;
^

Regards

Pavel


Re: master, static inline and #ifndef FRONTEND

2018-09-10 Thread Andres Freund
Hi,

On 2018-09-10 12:39:16 -0400, Tom Lane wrote:
> Andres Freund  writes:
> > In a recent commit [1] I added a static inline which castoroides
> > dislikes: ...
> > It's obviously trivial to fix this case by adding an #ifndef
> > FRONTEND. But as castoroides appears to be the only animal showing the
> > problem - after subtracting the animals dead due to the C99 requirement
> > - I wonder if we shouldn't just require "proper" static inline
> > support. castoroides runs a 13yo OS, and newer compilers that do not
> > have the problem are readily available.
> 
> Given that we currently have *no* working Solaris buildfarm members,
> I'm not prepared to tolerate "I can't be bothered to fix this",
> which is what your argument boils down to.

Uh, this seems unnecessarily dismissive. I wrote the above email minutes
after noticing the issue (which in turn was shortly after the issue
occurred), asking for feedback. Hardly "I can't be bothered to fix it"
territory. But adding a lot of #ifndef FRONTENDs isn't entirely free
either...


> We need to get at least one of castoroides and protosciurus back to
> green so that we can do platform-specific testing there [1], and
> castoroides appears to be the easier one to fix in the short term.

Note that rover-firefly runs on a solaris-ish environment and indeed
shows a problem
https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rover_firefly&dt=2018-09-10%2015%3A11%3A16


- Andres



Re: master, static inline and #ifndef FRONTEND

2018-09-10 Thread Tom Lane
Andres Freund  writes:
> In a recent commit [1] I added a static inline which castoroides
> dislikes: ...
> It's obviously trivial to fix this case with by adding an #ifndef
> FRONTEND. But as castoroides appears to be the only animal showing the
> problem - after subtracting the animals dead due to the C99 requirement
> - I wonder if we shouldn't just require "proper" static inline
> support. castoroides runs a 13yo OS, and newer compilers that do not
> have the problem are readily available.

Given that we currently have *no* working Solaris buildfarm members,
I'm not prepared to tolerate "I can't be bothered to fix this",
which is what your argument boils down to.  We need to get at least
one of castoroides and protosciurus back to green so that we can
do platform-specific testing there [1], and castoroides appears to be
the easier one to fix in the short term.

In the long run it may well be reasonable to shift focus to newer
compilers and/or newer Solaris versions, but for today, I'm going
to go add the #ifndef.

regards, tom lane

[1] for instance,
https://www.postgresql.org/message-id/16984.1536596764%40sss.pgh.pa.us



Re: Query is over 2x slower with jit=on

2018-09-10 Thread Andres Freund
Hi,

On 2018-09-10 15:42:55 +0530, Amit Khandekar wrote:
> Attached is a patch that accumulates the per-worker jit counters into
> the leader process.

Thanks!


> I think we better show per-worker jit info also. The current patch
> does not show that. I think it would be easy to continue on the patch
> to show per-worker info also. Under the Gather node, we can show
> per-worker jit counters. I think this would be useful too, besides the
> cumulative figures in the leader process. Comments ?

Yes, I think that'd be good. I think we either should print the stats at
the top level as $leader_value, $combined_worker_value, $total_value or
just have the $combined_worker_value at the level where we print other
stats from the worker, too.


>  /*
> + * Add up the workers' JIT instrumentation from dynamic shared memory.
> + */
> +static void
> +ExecParallelRetrieveJitInstrumentation(PlanState *planstate,
> +
> SharedJitInstrumentation *shared_jit)
> +{
> + int n;
> + JitContext *jit = planstate->state->es_jit;
> +
> + /* If the leader hasn't yet created a jit context, allocate one now. */
> + if (!jit)
> + {
> + planstate->state->es_jit = jit =
> + jit_create_context(planstate->state->es_jit_flags);
> + }

Wouldn't it be better to move the jit instrumentation to outside of the
context, to avoid having to do this?  Or just cope with not having
instrumentation for the leader in this case?  We'd kinda need to deal
with failure to create one anyway?

Greetings,

Andres Freund



Re: Pluggable Storage - Andres's take

2018-09-10 Thread Mithun Cy
On Mon, Sep 10, 2018 at 7:33 PM, Amit Kapila  wrote:
> On Mon, Sep 10, 2018 at 1:12 PM Haribabu Kommi  
> wrote:
>>
>> On Wed, Sep 5, 2018 at 2:04 PM Haribabu Kommi  
>> wrote:
>>>
>> pg_stat_get_tuples_hot_updated and others:
>> /*
>> * Counter tuples_hot_updated stores number of hot updates for heap table
>> * and the number of inplace updates for zheap table.
>> */
>> if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL ||
>> RelationStorageIsZHeap(rel))
>> result = 0;
>> else
>> result = (int64) (tabentry->tuples_hot_updated);
>>
>>
>> Is the special condition needed? The values should be 0 because of zheap,
>> right?
>>
>
> I also think so.  Beena/Mithun has worked on this part of the code, so
> it is better if they also confirm once.

Yes pg_stat_get_tuples_hot_updated should return 0 for zheap.


-- 
Thanks and Regards
Mithun C Y
EnterpriseDB: http://www.enterprisedb.com



Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Sergei Kornilov
Hello

> Is there any reason this cannot be changed via a signal?  Is it a general 
> lack of infrastructure, or is there a significant problem we want to ensure 
> never happens?
Just making it possible to reload recovery.conf is not what the pgsql-hackers
community wants; my patch with exactly that feature was rejected for this reason.
There have already been some discussions about moving recovery.conf into the GUC
infrastructure first. I continue to work on the most recent patch here:
https://commitfest.postgresql.org/19/1711/ If that patch is committed, I plan
to send a new patch to allow changing primary_conninfo with SIGHUP.

regards, Sergei



Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Chris Travers
On Mon, Sep 10, 2018 at 1:36 PM Michael Paquier  wrote:

> On Mon, Sep 10, 2018 at 12:08:07PM +0300, Sergei Kornilov wrote:
> >> This parameter is defined in postgresql.conf
> > Huh, i believe it be in future.
> > Currently it is recovery.conf parameter, and yes - it can be set (or
> > changed) only at database start.
>
> Thanks Sergei for the correction.  Indeed you need to read that as
> recovery.conf, not postgresql.conf.
>

Is there any reason this cannot be changed via a signal?  Is it a general
lack of infrastructure, or is there a significant problem we want to ensure
never happens?

> --
> Michael
>


-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Loaded footgun open_datasync on Windows

2018-09-10 Thread Laurenz Albe
Noah Misch wrote:
> On Wed, Jun 27, 2018 at 12:09:24PM +0200, Laurenz Albe wrote:
> > Michael Paquier wrote:
> > > > I have added it to the July commitfest.
> > > 
> > > Have you looked at the possibility of removing the log file constraints
> > > in pg_upgrade with the change you are doing here so as things would be
> > > more consistent with non-Windows platforms, simplifying some code on the
> > > way?
> > 
> > Can you explain what the "log file constraints" are?
> > If it is in any way related, and I can handle it, sure.
> 
> See this comment in pg_upgrade.h:
> 
> /*
>  * WIN32 files do not accept writes from multiple processes
>  *
>  * On Win32, we can't send both pg_upgrade output and command output to the
>  * same file because we get the error: "The process cannot access the file
>  * because it is being used by another process." so send the pg_ctl
>  * command-line output to a new file, rather than into the server log file.
>  * Ideally we could use UTILITY_LOG_FILE for this, but some Windows platforms
>  * keep the pg_ctl output file open by the running postmaster, even after
>  * pg_ctl exits.
>  *
>  * We could use the Windows pgwin32_open() flags to allow shared file
>  * writes but is unclear how all other tools would use those flags, so
>  * we just avoid it and log a little differently on Windows;  we adjust
>  * the error message appropriately.
>  */
> 
> If you grep src/bin/pg_upgrade for WIN32, roughly a third of the hits are
> workarounds for this problem.  I agree with Michael that removing those
> workarounds would be a good test of frontend pgwin32_open() and worth
> including in the initial commit.

Thank you for the explanation.

I didn't get pg_upgrade to work without the log file hacks; I suspect that there
is more than just log file locking going on, but my Windows skills are limited.

How shall I proceed?
I think that it is important to get pg_test_fsync to work correctly on Windows,
and if my patch does not break the buildfarm, that's what it does.

I have attached a new version, the previous one was bit-rotted.

Yours,
Laurenz Albe
diff --git a/src/bin/initdb/initdb.c b/src/bin/initdb/initdb.c
index 3f203c6ca6..9f6a9dc3d9 100644
--- a/src/bin/initdb/initdb.c
+++ b/src/bin/initdb/initdb.c
@@ -490,7 +490,11 @@ readfile(const char *path)
 	char	   *buffer;
 	int			c;
 
+#ifdef WIN32
+	if ((infile = fopen(path, "rt")) == NULL)
+#else
 	if ((infile = fopen(path, "r")) == NULL)
+#endif
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\" for reading: %s\n"),
 progname, path, strerror(errno));
diff --git a/src/bin/pg_basebackup/pg_receivewal.c b/src/bin/pg_basebackup/pg_receivewal.c
index 8be8d48a8a..912aed8d7c 100644
--- a/src/bin/pg_basebackup/pg_receivewal.c
+++ b/src/bin/pg_basebackup/pg_receivewal.c
@@ -288,7 +288,7 @@ FindStreamingStart(uint32 *tli)
 
 			snprintf(fullpath, sizeof(fullpath), "%s/%s", basedir, dirent->d_name);
 
-			fd = open(fullpath, O_RDONLY | PG_BINARY);
+			fd = open(fullpath, O_RDONLY | PG_BINARY, 0);
 			if (fd < 0)
 			{
 fprintf(stderr, _("%s: could not open compressed file \"%s\": %s\n"),
diff --git a/src/bin/pg_verify_checksums/pg_verify_checksums.c b/src/bin/pg_verify_checksums/pg_verify_checksums.c
index 28c975446e..d1a7121a26 100644
--- a/src/bin/pg_verify_checksums/pg_verify_checksums.c
+++ b/src/bin/pg_verify_checksums/pg_verify_checksums.c
@@ -83,7 +83,7 @@ scan_file(char *fn, int segmentno)
 	int			f;
 	int			blockno;
 
-	f = open(fn, 0);
+	f = open(fn, 0, 0);
 	if (f < 0)
 	{
 		fprintf(stderr, _("%s: could not open file \"%s\": %s\n"),
diff --git a/src/common/file_utils.c b/src/common/file_utils.c
index 48876061c3..8a4cdb47ff 100644
--- a/src/common/file_utils.c
+++ b/src/common/file_utils.c
@@ -283,7 +283,7 @@ fsync_fname(const char *fname, bool isdir, const char *progname)
 	 * unsupported operations, e.g. opening a directory under Windows), and
 	 * logging others.
 	 */
-	fd = open(fname, flags);
+	fd = open(fname, flags, 0);
 	if (fd < 0)
 	{
 		if (errno == EACCES || (isdir && errno == EISDIR))
diff --git a/src/include/port.h b/src/include/port.h
index 74a9dc4d4d..2d40045ad5 100644
--- a/src/include/port.h
+++ b/src/include/port.h
@@ -249,11 +249,8 @@ extern bool rmtree(const char *path, bool rmtopdir);
 #define		O_DIRECT	0x8000
 extern int	pgwin32_open(const char *, int,...);
 extern FILE *pgwin32_fopen(const char *, const char *);
-
-#ifndef FRONTEND
 #define		open(a,b,c) pgwin32_open(a,b,c)
 #define		fopen(a,b) pgwin32_fopen(a,b)
-#endif
 
 /*
  * Mingw-w64 headers #define popen and pclose to _popen and _pclose.  We want


Re: cache lookup failed for constraint when alter table referred by partition table

2018-09-10 Thread Alvaro Herrera
On 2018-Sep-10, Alvaro Herrera wrote:

> On 2018-Sep-10, Justin Pryzby wrote:
> 
> > Adding Alvaro 
> > 
> > On Fri, Sep 07, 2018 at 04:02:13PM +0530, Rajkumar Raghuwanshi wrote:
> > > postgres=# CREATE TABLE non_part (a INT,PRIMARY KEY(a));
> > > postgres=# CREATE TABLE part (a INT REFERENCES non_part(a)) PARTITION BY 
> > > RANGE(a);
> > > postgres=# CREATE TABLE part_p1 PARTITION OF part FOR VALUES FROM 
> > > (MINVALUE) TO (MAXVALUE);
> > > postgres=# ALTER TABLE non_part ALTER COLUMN a TYPE bigint;
> > > ERROR:  cache lookup failed for constraint 16398

ATPostAlterTypeCleanup is trying to search the original constraint by
OID in order to drop it, but it's not there -- I suppose it has already
been dropped by recursion in a previous step.  Not sure what the fix is
yet, but I'll look into it later today.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: patch to allow disable of WAL recycling

2018-09-10 Thread Jerry Jelinek
Tomas,

Thank you again for running all of these tests on your various hardware
configurations. I was not aware of the convention that the commented
example in the config file is expected to match the default value, so I was
actually trying to show what to use if you didn't want the default, but I
am happy to update the patch so the comment matches the default. Beyond
that, I am unsure what the next steps are for this proposal.

Thanks again,
Jerry


On Tue, Sep 4, 2018 at 12:41 PM, Tomas Vondra 
wrote:

> Hi,
>
> So here is the last set of benchmark results, this time from ext4 on a
> small SATA-based RAID (3 x 7.2k). As before, I'm only attaching PDFs
> with the simple charts, full results are available in the git repository
> [1]. Overall the numbers are rather boring, with almost no difference
> between the two setups.
>
> That being said, I'm not opposed to introducing the GUC. I'm not going
> to pretend my tests represents all possible HW configs and workloads,
> and I have no trouble believing that it may be quite beneficial in some
> cases.
>
> The one comment about the code is that we usually use the actual default
> value in the config sample. But the patch does this:
>
> +#wal_recycle = off # do not recycle WAL files
>
> while the GUC is defined like this:
>
> {
> {"wal_recycle", PGC_SUSET, WAL_SETTINGS,
> gettext_noop("WAL recycling enabled."),
> NULL
> },
> &wal_recycle,
> true,
> NULL, NULL, NULL
> },
>
> So the default is actually "on" which makes the commented-out config
> sample rather confusing.
>
>
> regards
>
> --
> Tomas Vondra  http://www.2ndQuadrant.com
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>


Re: Pluggable Storage - Andres's take

2018-09-10 Thread Amit Kapila
On Mon, Sep 10, 2018 at 1:12 PM Haribabu Kommi  wrote:
>
> On Wed, Sep 5, 2018 at 2:04 PM Haribabu Kommi  
> wrote:
>>
> pg_stat_get_tuples_hot_updated and others:
> /*
> * Counter tuples_hot_updated stores number of hot updates for heap table
> * and the number of inplace updates for zheap table.
> */
> if ((tabentry = pgstat_fetch_stat_tabentry(relid)) == NULL ||
> RelationStorageIsZHeap(rel))
> result = 0;
> else
> result = (int64) (tabentry->tuples_hot_updated);
>
>
> Is the special condition needed? The values should be 0 because of zheap,
> right?
>

I also think so.  Beena/Mithun has worked on this part of the code, so
it is better if they also confirm once.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Latest HEAD fails to build

2018-09-10 Thread Tom Lane
Rafia Sabih  writes:
>>> I tried maintainer-clean and distclean but it didn't work.

FWIW, my invariable habit is to do "make distclean" *before* git pull,
not after.  That avoids synchronization problems like this one, where
the updated makefiles don't know about some build product that used
to get made.

maintainer-clean or "git clean -dfx" are less desirable options
because they'll force you to redo the flex and bison runs, which
are rather expensive.

regards, tom lane



Re: SerializeParamList vs machines with strict alignment

2018-09-10 Thread Tom Lane
Amit Kapila  writes:
> On Mon, Sep 10, 2018 at 8:58 AM Tom Lane  wrote:
>> In particular, SerializeParamList does this:
>> 
>> /* Write flags. */
>> memcpy(*start_address, >pflags, sizeof(uint16));
>> *start_address += sizeof(uint16);
>> 
>> immediately followed by this:
>> 
>> datumSerialize(prm->value, prm->isnull, typByVal, typLen,
>> start_address);

> datumSerialize does this:

> memcpy(*start_address, , sizeof(int));
> *start_address += sizeof(int);

> before calling EOH_flatten_into, so it seems to me it should be 4-byte 
> aligned.

But that doesn't undo the fact that you're now on an odd halfword
boundary.  In the case I observed, EA_flatten_into was trying to
store an int32 through a pointer whose hex value ended in E, which
is explained by the "+= sizeof(uint16)".

> Yeah, I think as suggested by you, start_address should be maxaligned.

A localized fix would be for datumSerialize to temporarily palloc the
space for EOH_flatten_into to write into, and then it could memcpy
that to whatever random address it'd been passed.  Seems kind of
inefficient though, especially since you'd likely have to do the same
thing on the receiving side.

A larger issue is that even on machines where misalignment is permitted,
memcpy'ing to/from misaligned addresses is rather inefficient.

I think it'd be better to redesign the whole thing so that every field
is maxaligned, and you can e.g. just store and retrieve integer fields
as integers without a memcpy.  The "wasted" padding space doesn't seem
very significant given the short-lived nature of the serialized data.

However, that's not exactly a trivial change.  Probably the best way
forward is to adopt the extra-palloc solution as a back-patchable
fix, and then work on a more performant solution in HEAD only.

regards, tom lane



Re: Standby reads fail when autovacuum take AEL during truncation

2018-09-10 Thread Adrien NAYRAT

On 9/7/18 7:36 PM, Alexander Korotkov wrote:

Please, take a look at the following threads:
1. https://www.postgresql.org/message-id/c9374921e50a5e8fb1ecf04eb8c6ebc3%40postgrespro.ru
2. https://www.postgresql.org/message-id/CAHGQGwE5UqFqSq1%3DkV3QtTUtXphTdyHA-8rAj4A%3DY%2Be4kyp3BQ%40mail.gmail.com


Thanks, I will look at these threads.



Re: cache lookup failed for constraint when alter table referred by partition table

2018-09-10 Thread Alvaro Herrera
On 2018-Sep-10, Justin Pryzby wrote:

> Adding Alvaro 
> 
> On Fri, Sep 07, 2018 at 04:02:13PM +0530, Rajkumar Raghuwanshi wrote:
> > postgres=# CREATE TABLE non_part (a INT,PRIMARY KEY(a));
> > postgres=# CREATE TABLE part (a INT REFERENCES non_part(a)) PARTITION BY 
> > RANGE(a);
> > postgres=# CREATE TABLE part_p1 PARTITION OF part FOR VALUES FROM 
> > (MINVALUE) TO (MAXVALUE);
> > postgres=# ALTER TABLE non_part ALTER COLUMN a TYPE bigint;
> > ERROR:  cache lookup failed for constraint 16398
> 
> I want to suggest adding to open items.
> https://wiki.postgresql.org/index.php?title=PostgreSQL_11_Open_Items

Thanks, looking.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: How to find local logical replication origin?

2018-09-10 Thread Petr Jelinek
Hi,

On 10/09/18 05:30, Jinhua Luo wrote:
> Because I found in the source codes that if not explicitly set (e.g.
> via pg_replication_origin_session_setup), the
> replorigin_session_origin included in the wal is InvalidRepOriginId,
> correct?

This is correct, unless explicitly set, it will be InvalidRepOriginId.

> Jinhua Luo wrote on Sun, Sep 9, 2018 at 10:16 PM:
>>
>> Could I assume all locally originated changes are with InvalidRepOriginId?
>> Jinhua Luo wrote on Sat, Sep 8, 2018 at 5:41 PM:
>>>
>>> Hi All,
>>>
>>> What's the local logical replication origin, which could be used to
>>> filter local changes in the replication slot?
>>>
>>> In other words, I'm curious that what's the default replication
>>> origin? Because normal DML locally does not set any origin explicitly,
>>> correct?
> 

-- 
  Petr Jelinek  http://www.2ndQuadrant.com/
  PostgreSQL Development, 24x7 Support, Training & Services



Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Michael Paquier
On Mon, Sep 10, 2018 at 12:08:07PM +0300, Sergei Kornilov wrote:
>> This parameter is defined in postgresql.conf
> Huh, I believe it will be in the future.
> Currently it is a recovery.conf parameter, and yes - it can be set (or
> changed) only at database start.

Thanks Sergei for the correction.  Indeed you need to read that as
recovery.conf, not postgresql.conf.
--
Michael




Re: [PATCH][PROPOSAL] Add enum releation option type

2018-09-10 Thread Aleksandr Parfenov
Hi Nikolay,

I did a quick look at your patch and have some questions/points to
discuss. I like the idea of the patch and think that enum reloptions
can be useful, especially for some frequently checked values, as was
mentioned before.

There are a few typos in comments, like 'validateing'.

I have two questions about naming of variables/structures:

1) The variable opt_enum in parse_one_reloption is named differently from
other similar variables (which are named without an underscore).

2) The enum
gist_option_buffering_numeric_values/gist_option_buffering_value_numbers.
Firstly, it has two names. Secondly, can we name it
gist_option_buffering, why not?

As you mentioned in a previous mail, you prefer to keep the enum and the
relopt_enum_element_definition array in the same .h file. I'm not sure,
but I think this is done to keep everything related to the enum in one
place, to avoid inconsistency in case of changes to one part (e.g. a
change of the enum without a change of the array). On the other hand,
the array content is defined without the array itself being created in
the .h file. Can we move the actual array creation into the same .h
file? What is the point of separating the array content definition from
the array definition?

-- 
Aleksandr Parfenov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company



Re: [HACKERS] Restricting maximum keep segments by repslots

2018-09-10 Thread Kyotaro HORIGUCHI
Hello.

At Thu, 6 Sep 2018 22:32:21 +0200, Peter Eisentraut 
 wrote in 
<29bbd79d-696b-509e-578a-0fc38a3b9...@2ndquadrant.com>
> This documentation
> 
> +   <para>
> +    Specify the maximum size of WAL files
> +    that <link linkend="streaming-replication-slots">replication
> +    slots</link> are allowed to retain in the
> +    <filename>pg_wal</filename>
> +    directory at checkpoint time.
> +    If <varname>max_slot_wal_keep_size</varname> is zero (the default),
> +    replication slots retain unlimited size of WAL files.
> +   </para>
> 
> doesn't say anything about what happens when the limit is exceeded.
> Does the system halt until the WAL is fetched from the slots?  Do the
> slots get invalidated?

Thanks for pointing that. That's a major cause of confusion. Does
the following make sense?

> Specify the maximum size of WAL files that <link
> linkend="streaming-replication-slots">replication slots</link>
> are allowed to retain in the <filename>pg_wal</filename>
> directory at checkpoint time.  If
> <varname>max_slot_wal_keep_size</varname> is zero (the
> default), replication slots retain unlimited size of WAL files.
+ If the restart_lsn of a replication slot falls behind the current LSN
+ by more than that many bytes, the standby using the slot may not
+ be able to reconnect due to removal of required WAL records.

And the following sentense is wrong now. I'll remove it in the
coming version 9.

> 
>  This parameter is used being rounded down to the multiples of WAL file
>  size.
> 


> Also, I don't think 0 is a good value for the default behavior.  0 would
> mean that a slot is not allowed to retain any more WAL than already
> exists anyway.  Maybe we don't want to support that directly, but it's a
> valid configuration.  So maybe use -1 for infinity.

In relation to the reply just sent to Sawada-san, the remain of a
slot can be at most 16MB in the 0 case with the default segment
size. So you're right in this sense. Will fix in the coming
version. Thanks.

=# show max_slot_wal_keep_size;
 max_slot_wal_keep_size 
------------------------
 0
(1 row)
=# select pg_current_wal_lsn(), restart_lsn, remain, pg_size_pretty(remain) as
remain from pg_replication_slots ;
 pg_current_wal_lsn | restart_lsn |  remain  | remain 
--------------------+-------------+----------+--------
 0/400              | 0/400       | 16777216 | 16 MB
(1 row)

=# select pg_current_wal_lsn(), restart_lsn, remain, pg_size_pretty(remain) as
remain from pg_replication_slots ;
 pg_current_wal_lsn | restart_lsn | remain | remain 
--------------------+-------------+--------+--------
 0/4FF46D8          | 0/4FF46D8   |  47400 | 46 kB
(1 row)

regards.

-- 
Kyotaro Horiguchi
NTT Open Source Software Center




Re: Latest HEAD fails to build

2018-09-10 Thread Rafia Sabih
On Mon, Sep 10, 2018 at 4:10 PM, Christoph Berg  wrote:

> Re: Rafia Sabih 2018-09-10  8hfpxsrr6mxab54ebju...@mail.gmail.com>
> > Thanks for the interest and help you offered. But I didn't quite get it; I
> > tried maintainer-clean and distclean but it didn't work.
>
> git clean -xdf
>

Thanks Christoph, it worked.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


Re: Latest HEAD fails to build

2018-09-10 Thread Christoph Berg
Re: Rafia Sabih 2018-09-10 

> Thanks for the interest and help you offered. But I didn't quite get it, I
> tried maintainer-clean and distclean but it didn't work.

git clean -xdf

Christoph



Re: Adding a note to protocol.sgml regarding CopyData

2018-09-10 Thread Tatsuo Ishii
>> Hello Bradley & Tatsuo-san,
>> 
 ... references to the protocol version lacks homogeneity.
 ... I'd suggest to keep "the vX.0 protocol" for a short version,
 and "the version X.0 protocol" for long ...
>>>
>>> I agree. Change made.
>> 
>> Patch applies cleanly. Doc build ok.
>> 
>> One part talks about "terminating line", the other of "termination
>> line". I wonder whether one is better than the other?
> 
> I am not a native English speaker so I may be wrong... to me, the current
> usage of "terminating line" and "termination line" looks correct. The
> former refers to a concrete example and thus uses "the", while the latter
> refers to a more general case and thus uses "an".
> 
> BTW, I think the patch should apply to master and REL_11_STABLE
> branches at least. But should this be applied to other back branches
> as well?

I have marked this as "ready for committer".

Best regards,
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: [HACKERS] Restricting maximum keep segments by repslots

2018-09-10 Thread Kyotaro HORIGUCHI
Hello.

At Thu, 6 Sep 2018 19:55:39 +0900, Masahiko Sawada  
wrote in 
> On Thu, Sep 6, 2018 at 4:10 PM, Kyotaro HORIGUCHI
>  wrote:
> > Thank you for the comment.
> >
> > At Wed, 5 Sep 2018 14:31:10 +0900, Masahiko Sawada  
> > wrote in 
> > 
> >> On Tue, Sep 4, 2018 at 7:52 PM, Kyotaro HORIGUCHI
> >> Thank you for updating! Here is the review comment for v8 patch.
> >>
> >> +/*
> >> + * This slot still has all required segments. Calculate how 
> >> many
> >> + * LSN bytes the slot has until it loses restart_lsn.
> >> + */
> >> +fragbytes = wal_segment_size - (currLSN % wal_segment_size);
> >> +XLogSegNoOffsetToRecPtr(restartSeg + limitSegs - currSeg,
> >> fragbytes,
> >> +wal_segment_size, *restBytes);
> >>
> >> For the calculation of fragbytes, I think we should calculate the
> >> fragment bytes of restartLSN instead. The the formula "restartSeg +
> >> limitSegs - currSeg" means the # of segment between restartLSN and the
> >> limit by the new parameter. I don't think that the summation of it and
> >> fragment bytes of currenLSN is correct. As the following result
> >> (max_slot_wal_keep_size is 128MB) shows, the remain column shows the
> >> actual remains + 16MB (get_bytes function returns the value of
> >> max_slot_wal_keep_size in bytes).
> >
> > Since a oldest segment is removed after the current LSN moves to
> > the next segmen, current LSN naturally determines the fragment
> > bytes. Maybe you're concerning that the number of segments looks
> > too much by one segment.
> >
> > One arguable point of the feature is how max_slot_wal_keep_size
> > works exactly. I assume that even though the name is named as
> > "max_", we actually expect that "at least that bytes are
> > kept". So, for example, with 16MB of segment size and 50MB of
> > max_s_w_k_s, I designed this so that the size of preserved WAL
> > doesn't go below 50MB, actually (rounding up to multiples of 16MB
> > of 50MB), and loses the oldest segment when it reaches 64MB +
> > 16MB = 80MB as you saw.
> >
> > # I believe that the difference is not so significant since we
> > # have around hunderd or several hundreds of segments in common
> > # cases.
> >
> > Do you mean that we should define the GUC parameter literally as
> > "we won't have exactly that many bytes of WAL segmetns"? That is,
> > we have at most 48MB preserved WAL records for 50MB of
> > max_s_w_k_s setting. This is the same to how max_wal_size is
> > counted but I don't think max_slot_wal_keep_size will be regarded
> > in the same way.
> 
> I might be missing something but what I'm expecting to this feature is
> to restrict the how much WAL we can keep at a maximum for replication
> slots. In other words, the distance between the current LSN and the
> minimum restart_lsn of replication slots doesn't over the value of
> max_slot_wal_keep_size.

Yes, that's one possible design, the same as "we won't have more
than exactly that many bytes of WAL segments" above ("more than"
is added, which is what I meant). But we cannot keep the limit
strictly anyway, since WAL segments are removed only at checkpoint
time. If we defined it that way, we could reach the lost state
before max_slot_wal_keep_size is filled up, while WAL can still
exceed the size during a WAL flood. The most precise definition we
can give is "WAL segments are preserved at most around the value".
So I chose the definition so that we can describe it as "we don't
guarantee more than that many bytes".

# Uuuu. sorry for the possibly hard-to-read sentence..

>  It's similar to wal_keep_segments except for
> that this feature affects only replication slots. And

It defines the *extra* segments to be kept; that is, if we set it
to 2, at least 3 segments are present. If we set
max_slot_wal_keep_size to 32MB (= 2 segs here), we have at most 3
segments, since the 32MB range before the current LSN almost always
spans three segments. Doesn't this behave in a similar way to
wal_keep_segments?

If the current LSN is at the very end of a segment and
restart_lsn is catching up to the current LSN, the "remain" is
equal to max_slot_wal_keep_size as the guaranteed size. If it is at
the very beginning of a segment, it gets an extra 16MB.

> wal_keep_segments cannot restrict WAL that replication slots are
> holding. For example, with 16MB of segment size and 50MB of
> max_slot_wal_keep_size, we can keep at most 50MB WAL for replication
> slots. However, once we consumed more than 50MB WAL while not
> advancing any restart_lsn the required WAL might be lost by the next
> checkpoint, which depends on the min_wal_size.

I don't get the last phrase. With a small min_wal_size, we don't
recycle most of the "removed" segments. If it is large, we recycle
more of them. It doesn't affect how far the checkpoint removes WAL
files. But it is right that an LSN advance of
max_slot_wal_keep_size bytes immediately leads to breaking a
slot, and that is intended behavior.

> 

Re: Query is over 2x slower with jit=on

2018-09-10 Thread Amit Khandekar
On 22 August 2018 at 22:21, Andres Freund  wrote:
> On 2018-08-22 18:39:18 +0200, Andreas Joseph Krogh wrote:
>> Just to be clear; The query really runs slower (wall-clock time), it's not
>> just the timing.
>
> I bet it's not actually running slower, it "just" takes longer to start
> up due to the JITing in each worker. I suspect what we should do is to
> multiply the cost limits by the number of workers, to model that.  But
> without the fixed instrumentation that's harder to see...

Attached is a patch that accumulates the per-worker jit counters into
the leader process.

Similar to SharedExecutorInstrumentation, I added another shared
memory segment SharedJitInstrumentation, which is used to accumulate
each of the workers' jit instrumentation into the leader's.

Earlier I was thinking we can have a common
SharedExecutorInstrumentation structure that will have both
instrumentions, one usual instrumentation, and the other jit
instrumentation. But then thought that fields in
SharedExecutorInstrumentation are mostly not relevant to
JitInstrumentation. So kept a separate segment in the shared memory.

It may happen that the jit context does not get created at the backend
because there was no jit compile done, but at the workers it got
created. In that case, before retrieving the jit data from the
workers, the estate->es_jit needs to be allocated at the backend. For
that, I had to have a new function jit_create_context(), and for that
a new jit provider callback function create_context(). I used the
existing llvm_create_context() for this callback function.

JitContext now has those counters in a separate JitInstrumentation
structure, so that the same structure can be shared by
SharedJitInstrumentation and JitContext.

-

I think we better show per-worker jit info also. The current patch
does not show that. I think it would be easy to continue on the patch
to show per-worker info also. Under the Gather node, we can show
per-worker jit counters. I think this would be useful too, besides the
cumulative figures in the leader process. Comments ?

-- 
Thanks,
-Amit Khandekar
EnterpriseDB Corporation
The Postgres Database Company


jit_instr_account_workers.patch
Description: Binary data


Re: Avoid extra Sort nodes between WindowAggs when sorting can be reused

2018-09-10 Thread Alexander Kuzmenkov
The last version looked OK, so I'm marking this patch as ready for 
committer in the commitfest app.


--
Alexander Kuzmenkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Bug report: Dramatic increase in conflict with recovery after upgrading 10.2->10.5

2018-09-10 Thread Chris Travers
I haven't done significant debugging on this yet but if anyone else has
seen this problem and can point me at anything this would be good.
Otherwise I believe we will expect to debug and try to fix this problem
soon.

After upgrading to PostgreSQL 10.5 from 10.2, we went from one type of
query having about one recovery conflict per month to about three or so
every two days.  The queries in question should usually complete very fast
and should never hit the 30 sec max standby delay.

At present I believe this to likely be a regression.  But if nobody else
knows otherwise, I should know more in a couple days.

-- 
Best Regards,
Chris Travers
Head of Database

Tel: +49 162 9037 210 | Skype: einhverfr | www.adjust.com
Saarbrücker Straße 37a, 10405 Berlin


Re: Latest HEAD fails to build

2018-09-10 Thread Daniel Gustafsson
> On 10 Sep 2018, at 11:35, Rafia Sabih  wrote:
> On Mon, Sep 10, 2018 at 2:56 PM, Daniel Gustafsson  > wrote:
> > On 10 Sep 2018, at 11:22, Rafia Sabih  > > wrote:
> > 
> > Hi all,
> > 
> > After pulling the latest commit --
> > e3d77ea6b4e425093db23be492f236896dd7b501, I am getting the following error on
> > compiling,
> > cp: cannot stat ‘./dynloader.h’: No such file or directory
> > 
> > Things were working fine till ac27c74def5d8544530b13d5901308a342f072ac 
> > atleast.
> > 
> > Anybody having a clue about that…
> 
> I ran into the same thing this morning.  I think you have a leftover build
> artefact in src/include/dynloader.h from a build prior to its removal.  Try
> to
> remove the file and make install again.
> 
> Hi Daniel,
> 
> Thanks for the interest and help you offered. But I didn't quite get it; I
> tried maintainer-clean and distclean but it didn't work.

I believe it is because since 842cb9fa62fc995980 distclean and maintainer-clean
don’t know about that file and thus won't try to remove it either (I can
confirm that distclean didn’t work for me either).  I’ve only skimmed the
change though so I might be missing something.

cheers ./daniel


Re: Latest HEAD fails to build

2018-09-10 Thread Rafia Sabih
On Mon, Sep 10, 2018 at 2:56 PM, Daniel Gustafsson  wrote:

> > On 10 Sep 2018, at 11:22, Rafia Sabih 
> wrote:
> >
> > Hi all,
> >
> > After pulling the latest commit --
> > e3d77ea6b4e425093db23be492f236896dd7b501,
> I am getting the following error on compiling,
> > cp: cannot stat ‘./dynloader.h’: No such file or directory
> >
> > Things were working fine till ac27c74def5d8544530b13d5901308a342f072ac
> atleast.
> >
> > Anybody having a clue about that…
>
> I ran into the same thing this morning.  I think you have a leftover build
> artefact in src/include/dynloader.h from a build prior to its removal.
> Try to
> remove the file and make install again.
>
> Hi Daniel,

Thanks for the interest and help you offered. But I didn't quite get it; I
tried maintainer-clean and distclean but it didn't work.

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


Re: cache lookup failed for constraint when alter table referred by partition table

2018-09-10 Thread Justin Pryzby
Adding Alvaro 

On Fri, Sep 07, 2018 at 04:02:13PM +0530, Rajkumar Raghuwanshi wrote:
> postgres=# CREATE TABLE non_part (a INT,PRIMARY KEY(a));
> postgres=# CREATE TABLE part (a INT REFERENCES non_part(a)) PARTITION BY 
> RANGE(a);
> postgres=# CREATE TABLE part_p1 PARTITION OF part FOR VALUES FROM (MINVALUE) 
> TO (MAXVALUE);
> postgres=# ALTER TABLE non_part ALTER COLUMN a TYPE bigint;
> ERROR:  cache lookup failed for constraint 16398

I want to suggest adding to open items.
https://wiki.postgresql.org/index.php?title=PostgreSQL_11_Open_Items

...since it's documented as a "Major enhancement" in PG11:
https://www.postgresql.org/docs/11/static/release-11.html

Justin



Re: Latest HEAD fails to build

2018-09-10 Thread Daniel Gustafsson
> On 10 Sep 2018, at 11:22, Rafia Sabih  wrote:
> 
> Hi all,
> 
> After pulling the latest commit --
> e3d77ea6b4e425093db23be492f236896dd7b501, I am getting the following error on
> compiling,
> cp: cannot stat ‘./dynloader.h’: No such file or directory
> 
> Things were working fine till ac27c74def5d8544530b13d5901308a342f072ac 
> atleast.
> 
> Anybody having a clue about that…

I ran into the same thing this morning.  I think you have a leftover build
artefact in src/include/dynloader.h from a build prior to it’s removal.  Try to
remove the file and make install again.

cheers ./daniel




Latest HEAD fails to build

2018-09-10 Thread Rafia Sabih
Hi all,

After pulling the latest commit --
e3d77ea6b4e425093db23be492f236896dd7b501, I am getting the following error on
compiling,
cp: cannot stat ‘./dynloader.h’: No such file or directory

Things were working fine till ac27c74def5d8544530b13d5901308a342f072ac
atleast.

Anybody having a clue about that...

-- 
Regards,
Rafia Sabih
EnterpriseDB: http://www.enterprisedb.com/


Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Sergei Kornilov
Hi
> This parameter is defined in postgresql.conf
Huh, I believe it will be in the future.
Currently it is a recovery.conf parameter, and yes, it can be set (or changed) 
only at database start.
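A minimal sketch of the change being described, with placeholder connection values (the host name and user below are assumptions, not values from this thread):

```
# recovery.conf on the standby (PostgreSQL 11 and earlier)
standby_mode = 'on'
primary_conninfo = 'host=new-primary.example.com port=5432 user=replicator'
```

Then restart the standby (e.g. pg_ctl restart -D $PGDATA); a reload (SIGHUP) is not enough for this parameter.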

regards, Sergei



Re: Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Michael Paquier
On Mon, Sep 10, 2018 at 04:32:39PM +0800, Jinhua Luo wrote:
> If I need to change the master address from which the slave
> replicates, must I restart the postgresql? Or just reload is ok?

This parameter is defined in postgresql.conf; you need to restart the
instance.
--
Michael


signature.asc
Description: PGP signature


Can I just reload the slave to change primary_conninfo?

2018-09-10 Thread Jinhua Luo
Hi All,

If I need to change the master address from which the slave
replicates, must I restart PostgreSQL? Or is just a reload OK?



Re: Pluggable Storage - Andres's take

2018-09-10 Thread Haribabu Kommi
On Wed, Sep 5, 2018 at 2:04 PM Haribabu Kommi 
wrote:

>
> On Tue, Sep 4, 2018 at 10:33 AM Andres Freund  wrote:
>
>> Hi,
>>
>> Thanks for the patches!
>>
>> On 2018-09-03 19:06:27 +1000, Haribabu Kommi wrote:
>> > I found couple of places where the zheap is using some extra logic in
>> > verifying
>> > whether it is zheap AM or not, based on that it used to took some extra
>> > decisions.
>> > I am analyzing all the extra code that is done, whether any callbacks
>> can
>> > handle it
>> > or not? and how? I can come back with more details later.
>>
>> Yea, I think some of them will need to stay (particularly around
>> integrating undo) and som other ones we'll need to abstract.
>>
>
> OK. I will list all the areas that I found with my observation of how to
> abstract or leaving it and then implement around it.
>

The following are the changes where the code is specific to checking whether
it is a zheap relation or not.

Overall I found that it needs 3 new APIs at the following locations:
1. RelationSetNewRelfilenode
2. heap_create_init_fork
3. estimate_rel_size
4. A facility to provide handler options (like skipping WAL, etc.)
_hash_vacuum_one_page:
    xlrec.flags = RelationStorageIsZHeap(heapRel) ?
        XLOG_HASH_VACUUM_RELATION_STORAGE_ZHEAP : 0;

_bt_delitems_delete:
    xlrec_delete.flags = RelationStorageIsZHeap(heapRel) ?
        XLOG_BTREE_DELETE_RELATION_STORAGE_ZHEAP : 0;


Storing the type of the handler, and adding a new API for special handling
when checking for these new types, can remove the need for the above code.

RelationAddExtraBlocks:
    if (RelationStorageIsZHeap(relation))
    {
        ZheapInitPage(page, BufferGetPageSize(buffer));
        freespace = PageGetZHeapFreeSpace(page);
    }

Adding new APIs for PageInit and PageGetHeapFreeSpace to redirect the calls
to the specific table AM handlers.
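A hypothetical sketch of what that redirection could look like: a per-AM routine table consulted at the call site instead of an explicit RelationStorageIsZHeap() test. All names below are illustrative, not actual PostgreSQL APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical per-AM page callbacks; a zheap handler would supply
 * ZheapInitPage/PageGetZHeapFreeSpace equivalents here. */
typedef struct PageAmRoutine
{
	void		(*page_init) (char *page, size_t page_size);
	size_t		(*page_get_free_space) (const char *page, size_t page_size);
} PageAmRoutine;

/* A toy "heap" implementation: zero the page and report it all free. */
static void
toy_heap_page_init(char *page, size_t page_size)
{
	memset(page, 0, page_size);
}

static size_t
toy_heap_page_get_free_space(const char *page, size_t page_size)
{
	(void) page;
	return page_size;
}

static const PageAmRoutine toy_heap_routine = {
	toy_heap_page_init,
	toy_heap_page_get_free_space,
};

/* What RelationAddExtraBlocks would do, AM-agnostically: no per-AM
 * branching, just calls through the routine table. */
static size_t
init_extra_block(const PageAmRoutine *am, char *page, size_t page_size)
{
	am->page_init(page, page_size);
	return am->page_get_free_space(page, page_size);
}
```

With this shape, adding another storage AM means supplying a new routine table rather than touching every call site.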

visibilitymap_set:
    if (RelationStorageIsZHeap(rel))
    {
        recptr = log_zheap_visible(rel->rd_node, heapBuf, vmBuf,
                                   cutoff_xid, flags);
        /*
         * We do not have a page-wise visibility flag in zheap,
         * so no need to set LSN on the zheap page.
         */
    }

Handler options may solve the need for the above code.

validate_index_heapscan:
    /* Set up for predicate or expression evaluation */
    /* For zheap relations, the tuple is locally allocated, so free it. */
    ExecStoreHeapTuple(heapTuple, slot, RelationStorageIsZHeap(heapRelation));

This will be solved after changing the validate_index_heapscan function to be
slot-based.

RelationTruncate:
    /* Create the meta page for zheap */
    if (RelationStorageIsZHeap(rel))
        RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence,
                                  InvalidTransactionId,
                                  InvalidMultiXactId);
    if (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED &&
        rel->rd_rel->relkind != 'p')
    {
        heap_create_init_fork(rel);
        if (RelationStorageIsZHeap(rel))
            ZheapInitMetaPage(rel, INIT_FORKNUM);
    }

A new API in RelationSetNewRelfilenode and heap_create_init_fork can solve it.
cluster:
    if (RelationStorageIsZHeap(rel))
        ereport(ERROR,
                (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
                 errmsg("cannot cluster a zheap table")));

No change required.

copyFrom:
    /*
     * In zheap, we don't support the optimization for HEAP_INSERT_SKIP_WAL.
     * See zheap_prepare_insert for details.
     * PBORKED / ZBORKED: abstract
     */
    if (!RelationStorageIsZHeap(cstate->rel) && !XLogIsNeeded())
        hi_options |= HEAP_INSERT_SKIP_WAL;

How about requesting the table AM handler to provide options and using them
here?
ExecuteTruncateGuts:
    // PBORKED: Need to abstract this
    minmulti = GetOldestMultiXactId();

    /*
     * Need the full transaction-safe pushups.
     *
     * Create a new empty storage file for the relation, and assign it
     * as the relfilenode value. The old storage file is scheduled for
     * deletion at commit.
     *
     * PBORKED: needs to be a callback
     */
    if (RelationStorageIsZHeap(rel))
        RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence,
                                  InvalidTransactionId, InvalidMultiXactId);
    else
        RelationSetNewRelfilenode(rel, rel->rd_rel->relpersistence,
                                  RecentXmin, minmulti);
    if (rel->rd_rel->relpersistence == RELPERSISTENCE_UNLOGGED)
    {
        heap_create_init_fork(rel);
        if (RelationStorageIsZHeap(rel))
            ZheapInitMetaPage(rel, INIT_FORKNUM);
    }

A new API inside RelationSetNewRelfilenode can handle it.
ATRewriteCatalogs:
    /* Inherit the storage_engine reloption from the parent table. */
    if (RelationStorageIsZHeap(rel))
    {
        static char *validnsps[] = HEAP_RELOPT_NAMESPACES;
        DefElem    *storage_engine;

        storage_engine = makeDefElemExtended("toast", "storage_engine",
                                             (Node *) makeString("zheap"),
                                             DEFELEM_UNSPEC, -1);
        reloptions = transformRelOptions((Datum) 0,
                                         list_make1(storage_engine),
                                         "toast",
                                         validnsps, true, false);
    }

I don't think anything can be done in the API here.

ATRewriteTable:
    /*
     * In zheap, we don't support the optimization for HEAP_INSERT_SKIP_WAL.
     * See zheap_prepare_insert for details.
     *
     * ZFIXME / PFIXME: We probably need a different abstraction for this.
     */
    if (!RelationStorageIsZHeap(newrel) && !XLogIsNeeded())
        hi_options |= HEAP_INSERT_SKIP_WAL;

Options can solve this also.

estimate_rel_size:
if (curpages < 10 &&
(rel->rd_rel->relpages == 0 ||

Re: CREATE ROUTINE MAPPING

2018-09-10 Thread Hannu Krosing
Hi Corey

Have you looked at pl/proxy?

It does this and then some (sharding).

It actually started out as a set of pl/pythonu functions, but then got
formalized into a full extension language for defining remote (potentially
sharded) function calls.


Best Regards
Hannu Krosing



On Fri, 12 Jan 2018 at 03:38, Corey Huinker  wrote:

> A few months ago, I was researching ways for formalizing calling functions
> on one postgres instance from another. RPC, basically. In doing so, I
> stumbled across an obscure part of the the SQL Standard called ROUTINE
> MAPPING, which is exactly what I'm looking for.
>
> The syntax specified is, roughly:
>
> CREATE ROUTINE MAPPING local_routine_name FOR remote_routine_spec
> SERVER my_server [ OPTIONS( ... ) ]
>
>
> Which isn't too different from CREATE USER MAPPING.
>
> The idea here is that if I had a local query:
>
> SELECT t.x, remote_func1(),  remote_func2(t.y)
>
> FROM remote_table t
>
> WHERE t.active = true;
>
>
> that would become this query on the remote side:
>
> SELECT t.x, local_func1(), local_func2(t.y)
>
> FROM local_table t
>
> WHERE t.active = true;
>
>
>
> That was probably the main intention of this feature, but I see a
> different possibility there. Consider the cases:
>
> SELECT remote_func(1,'a');
>
>
> and
>
> SELECT * FROM remote_srf(10, true);
>
>
> Now we could have written remote_func() and remote_srf() in plpythonu, and
> it could access whatever remote data that we wanted to see, but that
> exposes our local server to the untrusted pl/python module as well as
> python process overhead.
>
> We could create a specialized foreign data wrapper that requires a WHERE
> clause to include all the require parameters as predicates, essentially
> making every function a table, but that's awkward and unclear to an end
> user.
>
> Having the ability to import functions from other servers allows us to
> write foreign servers that expose functions to the local database, and
> those foreign servers handle the bloat and risks associated with accessing
> that remote data.
>
> Moreover, it would allow hosted environments (AWS, etc) that restrict the
> extensions that can be added to the database to still connect to those
> foreign data sources.
>
> I'm hoping to submit a patch for this someday, but it touches on several
> areas of the codebase where I have no familiarity, so I've put forth to
> spark interest in the feature, to see if any similar work is underway, or
> if anyone can offer guidance.
>
> Thanks in advance.
>


Re: CREATE ROUTINE MAPPING

2018-09-10 Thread Kyotaro HORIGUCHI
Hello.

At Tue, 4 Sep 2018 09:34:21 +0900, Masahiko Sawada  
wrote in 
> On Tue, Sep 4, 2018 at 5:48 AM, David Fetter  wrote:
> > On Fri, Aug 31, 2018 at 05:18:26PM +0900, Masahiko Sawada wrote:
> >> On Thu, Jan 25, 2018 at 2:13 PM, David Fetter  wrote:
> >> > On Thu, Jan 18, 2018 at 04:09:13PM -0500, Corey Huinker wrote:
> >> >> >
> >> >> >
> >> >> > >
> >> >> > > But other situations seem un-handle-able to me:
> >> >> > >
> >> >> > > SELECT remote_func1(l.x) FROM local_table l WHERE l.active = true;
> >> >> >
> >> >> > Do we have any way, or any plan to make a way, to push the set (SELECT
> >> >> > x FROM local_table WHERE active = true) to the remote side for
> >> >> > execution there?  Obviously, there are foreign DBs that couldn't
> >> >> > support this, but I'm guessing they wouldn't have much by way of UDFs
> >> >> > either.
> >> >> >
> >> >>
> >> >> No. The remote query has to be generated at planning time, so it can't 
> >> >> make
> >> >> predicates out of anything that can't be resolved into constants by the
> >> >> planner itself. The complexities of doing so would be excessive, far 
> >> >> better
> >> >> to let the application developer split the queries up because they know
> >> >> better which parts have to resolve first.
> >> >
> >> > So Corey and I, with lots of inputs from Andrew Gierth and Matheus
> >> > Oliveira, have come up with a sketch of how to do this, to wit:
> >> >
> >> > - Extend CREATE FUNCTION to take either FOREIGN and SERVER or AS and
> >> >   LANGUAGE as parameters, but not both. This seems simpler, at least
> >> >   in a proof of concept, than creating SQL standard compliant grammar
> >> >   out of whole cloth.  The SQL standard grammar could be layered in
> >> >   later via the rewriter if this turns out to work.
> >>
> >> I'm also interested in this feature. While studying this feature, I
> >> understood that this feature just pair a local function with a remote
> >> function, not means that creates a kind of virtual function that can
> >> be invoked on only foreign servers. For example, if we execute the
> >> following SQL the local_func() is invoked in local because the col1
> >> column of local_table is referenced by it.

Do you mean that ISO/IEC 9075-9:2016 (right?) defines that
(and that we must follow it)?  Or does it come from referring to
something like [1]?  As far as I can see from David's mail upthread,
OPTIONS is not precisely defined.

[1] http://cs.unibo.it/~montesi/CBD/Articoli/SQL-MED.pdf

Unfortunately I don't have access to the document nor concrete
use cases. With a rough idea of "remote mapping", I can guess the
following four use cases.  Each example syntax is just a guess
without any consideration on implementability or other
restrictions. The patch looks currently covering B.

A. Just notify a function can be just pushed down.

  ex. SELECT foo(1, 'bar');  Remote: SELECT foo(1, 'bar');

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem;
   (or same as B)

B. Replace function name with the remote equivalent.

  ex. SELECT foo(1, 'bar');  Remote: SELECT hoge(1, 'bar');

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
   OPTIONS (remote_func_name 'hoge'));

C. Adjust function specification with remote.

  ex. SELECT foo(1, 'bar');  Remote: SELECT hoge('bar', 1, true);

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
   OPTIONS (remote_expression 'hoge($2,$1,true)');
  
D. Replace with an equivalent remote expression.

  ex. SELECT foo(1, 'bar');  Remote: SELECT ('bar' || to_char(1 % 10));

   CREATE ROUTINE MAPPING foomap FOR foo(int, text) SERVER rem
  OPTIONS (remote_expression '$2 || to_char($1 % 10)');


I haven't looked at the patch in depth, but the core side looks
generic and the FDW side is extensible to A, C and D. I think B
is enough as a starter. I don't mean that we should implement all
of them; they are just possibilities.


I have some comments on the patch.

It doesn't seem to work.  Am I missing something?

create server sv1 foreign data wrapper postgres_fdw options (host '/tmp', port '5432', dbname 'postgres');
create table lt (a int);
create foreign table ft (a int) server sv1 options (table_name 'lt');
create function lhoge(int) returns int as 'begin return $1 * 2; end;' language plpgsql;
create routine mapping rm1 for function lhoge(int) server sv1 options (remote_func_name 'rhoge');
explain verbose select * from ft where a = lhoge(3);
QUERY PLAN
--
 Foreign Scan on public.ft  (cost=100.00..936.31 rows=15 width=4)
   Output: a
   Filter: (ft.a = lhoge(3))
   Remote SQL: SELECT a FROM public.lt
(4 rows)


Perhaps it cannot push down simple SQL local functions. (I'm not
sure we should do that.)

Can't we specify remote schema in remote_func_name just as
(remote_func_name "fooschema.funcname")?

Can't we provide the syntax without making MAPPING reserved?

 Chainging the 

Re: SerializeParamList vs machines with strict alignment

2018-09-10 Thread Amit Kapila
On Mon, Sep 10, 2018 at 8:58 AM Tom Lane  wrote:
>
> I wondered why buildfarm member chipmunk has been failing hard
> for the last little while.  Fortunately, it's supplying us with
> a handy backtrace:
>
> Program terminated with signal 7, Bus error.
> #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> 329             aresult->dataoffset = dataoffset;
> #0  EA_flatten_into (allocated_size=<optimized out>, result=0xb55ff30e, eohptr=0x188f440) at array_expanded.c:329
> #1  EA_flatten_into (eohptr=0x188f440, result=0xb55ff30e, allocated_size=<optimized out>) at array_expanded.c:293
> #2  0x003c3dfc in EOH_flatten_into (eohptr=<optimized out>, result=<optimized out>, allocated_size=<optimized out>) at expandeddatum.c:84
> #3  0x003c076c in datumSerialize (value=3934060, isnull=<optimized out>, typByVal=<optimized out>, typLen=<optimized out>, start_address=0xbea3bd54) at datum.c:341
> #4  0x002a8510 in SerializeParamList (paramLI=0x1889f18, start_address=0xbea3bd54) at params.c:195
> #5  0x002342cc in ExecInitParallelPlan (planstate=0x, estate=0x18863e0, sendParams=0x46e, nworkers=1, tuples_needed=-1) at execParallel.c:700
> #6  0x002461dc in ExecGather (pstate=0x18864f0) at nodeGather.c:151
> #7  0x00236b20 in ExecProcNodeFirst (node=0x18864f0) at execProcnode.c:445
> #8  0x0022fc2c in ExecProcNode (node=0x18864f0) at ../../../src/include/executor/executor.h:237
> #9  ExecutePlan (execute_once=<optimized out>, dest=0x188a108, direction=<optimized out>, numberTuples=0, sendTuples=<optimized out>, operation=CMD_SELECT, use_parallel_mode=<optimized out>, planstate=0x18864f0, estate=0x18863e0) at execMain.c:1721
> #10 standard_ExecutorRun (queryDesc=0x188a138, direction=<optimized out>, count=0, execute_once=true) at execMain.c:362
> #11 0x0023d630 in postquel_getnext (fcache=0x1888408, es=0x1889d68) at functions.c:867
> #12 fmgr_sql (fcinfo=0x701c7c) at functions.c:1164
>
> This is remarkably hard to replicate on other machines, but I eventually
> managed to duplicate it on gaur's host, after which it became really
> obvious that the parallel-query data transfer logic has never been
> stressed very hard on machines with strict data alignment rules.
>
> In particular, SerializeParamList does this:
>
> /* Write flags. */
> memcpy(*start_address, &prm->pflags, sizeof(uint16));
> *start_address += sizeof(uint16);
>
> immediately followed by this:
>
> datumSerialize(prm->value, prm->isnull, typByVal, typLen,
>start_address);
>
> and datumSerialize might do this:
>
> EOH_flatten_into(eoh, (void *) *start_address, header);
>
> Now, I will plead mea culpa that the expanded-object API doesn't
> say in large red letters that the target address for EOH_flatten_into
> is supposed to be maxaligned.  It only says
>
>  * The flattened representation must be a valid in-line, non-compressed,
>  * 4-byte-header varlena object.
>
> Still, one might reasonably suspect from that that *at least* 4-byte
> alignment is expected.
>

datumSerialize does this:

memcpy(*start_address, &header, sizeof(int));
*start_address += sizeof(int);

before calling EOH_flatten_into, so it seems to me it should be 4-byte aligned.

>  This code path isn't providing such alignment,
> and machines that require it will crash.

Yeah, I think as suggested by you, start_address should be maxaligned.
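As a minimal illustration of the fix being discussed, here is a sketch of maxaligning a serialization cursor before flattening an expanded object into it. The MAXALIGN stand-in below hard-codes 8-byte alignment for the example; in the server the value comes from MAXIMUM_ALIGNOF in pg_config.h, and the function name is illustrative.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Illustrative stand-in for PostgreSQL's MAXALIGN; the real alignment
 * value is platform-dependent (MAXIMUM_ALIGNOF from pg_config.h). */
#define EXAMPLE_MAXIMUM_ALIGNOF 8
#define EXAMPLE_MAXALIGN(LEN) \
	(((uintptr_t) (LEN) + (EXAMPLE_MAXIMUM_ALIGNOF - 1)) & \
	 ~((uintptr_t) (EXAMPLE_MAXIMUM_ALIGNOF - 1)))

/*
 * Advance a serialization cursor to the next maxaligned offset from the
 * start of the buffer, zero-filling the padding bytes so the serialized
 * stream stays deterministic.  The idea is that SerializeParamList would
 * do something like this after writing the uint16 flags and before
 * letting datumSerialize flatten an expanded object at *start_address.
 */
static char *
align_cursor(char *base, char *cursor)
{
	size_t		offset = (size_t) (cursor - base);
	size_t		aligned = EXAMPLE_MAXALIGN(offset);

	memset(cursor, 0, aligned - offset);
	return base + aligned;
}
```

Since the shared-memory area handed out for the serialized parameter list is itself maxaligned, aligning the offset from the base is equivalent to aligning the absolute address.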

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: stat() on Windows might cause error if target file is larger than 4GB

2018-09-10 Thread Michael Paquier
On Mon, Sep 10, 2018 at 02:01:53AM +, Higuchi, Daisuke wrote:
> This mail is about following bug report post: 
> https://www.postgresql.org/message-id/flat/153442391458.1505.9181095584291689853%40wrigleys.postgresql.org

Thanks for following up on this thread.  I had marked it as one of my
TODOs but could not get around to looking at it seriously.

> This is because _stat64i32() is used for stat() on Windows, I think.
> Seeing following URL, _stat64i32() could use 32 bit, it means 4GB is max
> size.
> https://msdn.microsoft.com/en-us/library/14h5k7ff.aspx
>
> When I create the simple application to use stat() on VS2013, 
> stat() could not deal with 4GB file and failed with errno=132.

I don't quite follow the logic here, we map stat() to pgwin32_safestat()
when compiling with MSVC.

> I think pg_dump hits EOVERFLOW, but Windows' errno.h does not have 
> this errno, so 'Unknown error' is output.

The error mapping happens in win32error.c, so we'd want to update this
table to make sure that EOVERFLOW is the errno reported.

> So, pgwin32_safestat() should be changed to solve this problem. 
> Do you have any idea or comments?

Yes, the fix needs to happen there.  It seems to me that what we are
looking for here is to complete the calculation of st_size with
nFileSizeHigh, so that we are able to compute a correct 64-bit integer
size, allowing support for larger files on Win32.  This is something
that fsync() support for pg_dump has actually exposed, by opening
stat() up to being called for files with a size larger than 4GB.  (I
badly need to set up a Windows machine...)
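A sketch of the kind of fix described above, under the assumption that pgwin32_safestat() has access to a nFileSizeHigh/nFileSizeLow pair (as the Win32 file-information structures expose them) and that the stat result's size field is widened to 64 bits. The function name is illustrative, not an actual PostgreSQL API.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Combine the two 32-bit halves of a Win32 file size into a single
 * 64-bit value.  Keeping only the low half, which is all a 32-bit
 * st_size can hold, is exactly what breaks stat() on files larger
 * than 4GB.
 */
static int64_t
combine_win32_file_size(uint32_t size_high, uint32_t size_low)
{
	return ((int64_t) size_high << 32) | (int64_t) size_low;
}
```

For a 5GB file, size_high is 1 and size_low is 705032704, and the combined value is the true size.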
--
Michael


signature.asc
Description: PGP signature


Re: How to find local logical replication origin?

2018-09-10 Thread Michael Paquier
On Sat, Sep 08, 2018 at 05:41:06PM +0800, Jinhua Luo wrote:
> What's the local logical replication origin, which could be used to
> filter local changes in the replication slot?

You are looking for that perhaps?
https://www.postgresql.org/docs/devel/static/replication-origins.html

That said, there is no default origin set, hence you would need
to set it explicitly.
--
Michael


signature.asc
Description: PGP signature


Re: Problem while setting the fpw with SIGHUP

2018-09-10 Thread Amit Kapila
On Tue, Aug 28, 2018 at 4:05 PM Kyotaro HORIGUCHI
 wrote:
>
> Hello.
>
> At Sat, 25 Aug 2018 14:50:53 +0530, Amit Kapila  
> wrote in 
> > On Wed, Aug 1, 2018 at 12:56 PM Kyotaro HORIGUCHI
> >  wrote:
> > >
> > > Thank you, Amit, Michael.
> > >
> >
> > Can you verify the first patch that I have posted above [1]?  We can
> > commit it separately.
>
> Thanks for prompting. The difference is in a comment and I'm fine
> with the change.
>

Thanks, but what I wanted you to verify is the commit that broke it in
9.5.  On again looking at it, I think it is below code in commit
2076db2aea that caused this problem.  If possible, can you once test
it before and at this commit or at least do the manual review of same
to cross-verify?

+   doPageWrites = (Insert->fullPageWrites || Insert->forcePageWrites);
-   /*
-    * Also check to see if fullPageWrites or forcePageWrites was just turned
-    * on; if we weren't already doing full-page writes then go back and
-    * recompute. (If it was just turned off, we could recompute the record
-    * without full pages, but we choose not to bother.)
-    */
-   if ((Insert->fullPageWrites || Insert->forcePageWrites) && !doPageWrites)
+   if (fpw_lsn != InvalidXLogRecPtr && fpw_lsn <= RedoRecPtr && doPageWrites)
    {
-       /* Oops, must redo it with full-page data. */
+       /*
+        * Oops, some buffer now needs to be backed up that the caller
+        * didn't back up.  Start over.
+        */
        WALInsertLockRelease();
        END_CRIT_SECTION();
-       rdt_lastnormal->next = NULL;
-       info = info_orig;
-       goto begin;
+       return InvalidXLogRecPtr;
    }


-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com



Re: Add extension options to control TAP and isolation tests

2018-09-10 Thread Michael Paquier
On Wed, Sep 05, 2018 at 06:48:49PM -0700, Michael Paquier wrote:
> On a recent thread of pgsql-committers has been discussed the fact that
> we lacked a bit of infrastructure to allow extensions to control
> isolation and TAP tests:
> https://www.postgresql.org/message-id/20180905174527.ga2...@paquier.xyz
> 
> Attached is a patch which is the result of the previous thread, where a
> couple of variables are added for extension authors:
> - ISOLATION, similar to REGRESS for pg_regress, which lists isolation
> tests.
> - ISOLATION_OPTS, which can be used to pass an option set to
> pg_isolation_regress.
> - TAP_TESTS, a switch to enable running TAP tests.

Tom, Alvaro, any thoughts on the proposed patch?  Please note that one
thing which is missing, and that I left on purpose so as the buildfarm
client does not need any extra tweaks, is support for those options in
src/tools/msvc.  It is already possible to run easily any TAP test suite
using vcregress taptest $dir, and test_decoding has some special
handling in the buildfarm code to run isolation tests.  It seems to me
that the amount of cleanup done by the initial patch in all the
Makefiles justifies its existence, and I could always follow up with a
second patch for MSVC if needed.
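As an illustration, a sketch of how an extension Makefile might use the proposed variables (the extension and test names are placeholders, and this assumes a standard PGXS build):

```
# Hypothetical extension Makefile using the proposed switches
MODULES = my_extension
REGRESS = basic                        # existing pg_regress tests
ISOLATION = concurrent_ops             # isolation specs under specs/
ISOLATION_OPTS = --load-extension=my_extension
TAP_TESTS = 1                          # run the TAP tests under t/

PG_CONFIG = pg_config
PGXS := $(shell $(PG_CONFIG) --pgxs)
include $(PGXS)
```

The point of the patch is that the boilerplate for invoking pg_isolation_regress and prove then lives once in pgxs.mk rather than being duplicated across every extension's Makefile.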
--
Michael


signature.asc
Description: PGP signature