Re: errbacktrace

2019-06-28 Thread Jaime Casanova
On Tue, 25 Jun 2019 at 06:08, Peter Eisentraut <
peter.eisentr...@2ndquadrant.com> wrote:

> New thread continuing from
> <
> https://www.postgresql.org/message-id/d4903af2-e7b7-b551-71f8-3e4a6bdc2...@2ndquadrant.com
> >.
>
> Here is an extended version of Álvaro's patch that adds an errbacktrace()
> function.  You can do two things with this:
>
> - Manually attach it to an ereport() call site that you want to debug.
>
> - Set a configuration parameter like backtrace_function = 'int8in' to
> debug ereport()/elog() calls in a specific function.
>
> There was also mention of settings that would automatically produce
> backtraces for PANICs etc.  Those could surely be added if there is
> enough interest.
>
> For the implementation, I support both backtrace() provided by the OS as
> well as using libunwind.  The former seems to be supported by a number
> of platforms, including glibc, macOS, and FreeBSD, so maybe we don't
> need the libunwind support.  I haven't found any difference in quality in
> the backtraces between the two approaches, but surely that is highly
> dependent on the exact configuration.
>
> I would welcome testing in all direction with this, to see how well it
> works in different circumstances.
>
>
Hi Peter,

This is certainly a very useful thing. Sadly, it doesn't seem to compile
when trying to use libunwind.
I tried it on a Debian 9 machine with gcc 6.3.0; Debian says I have
libunwind8 (1.1) installed.

./configure --prefix=/home/jcasanov/Documentos/pgdg/pgbuild/pg13
--enable-debug --enable-profiling --enable-cassert --enable-depend
--with-libunwind

at make i get these errors:
"""
utils/error/elog.o: In function `set_backtrace':
/home/jcasanov/Documentos/pgdg/projects/postgresql/src/backend/utils/error/elog.c:847:
undefined reference to `_Ux86_64_getcontext'
/home/jcasanov/Documentos/pgdg/projects/postgresql/src/backend/utils/error/elog.c:848:
undefined reference to `_Ux86_64_init_local'
/home/jcasanov/Documentos/pgdg/projects/postgresql/src/backend/utils/error/elog.c:850:
undefined reference to `_Ux86_64_step'
/home/jcasanov/Documentos/pgdg/projects/postgresql/src/backend/utils/error/elog.c:861:
undefined reference to `_Ux86_64_get_reg'
/home/jcasanov/Documentos/pgdg/projects/postgresql/src/backend/utils/error/elog.c:862:
undefined reference to `_Ux86_64_get_proc_name'
collect2: error: ld returned 1 exit status
make[2]: *** [postgres] Error 1
make[1]: *** [all-backend-recurse] Error 2
make: *** [all-src-recurse] Error 2
"""
-- 
Jaime Casanova  www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Re: SQL/JSON: JSON_TABLE

2019-06-28 Thread Pavel Stehule
Hi

On Sat, 29 Jun 2019 at 7:26, Nikita Glukhov  wrote:

> Attached 36th version of patches rebased onto jsonpath v36.
>

I cannot apply these patches on master. Can you please check these
patches?

Regards

Pavel

>
> --
> Nikita Glukhov
> Postgres Professional: http://www.postgrespro.com
> The Russian Postgres Company
>
>


Re: C testing for Postgres

2019-06-28 Thread Jesse Zhang
On Fri, Jun 28, 2019 at 10:37 AM Adam Berlin  wrote:
>
> If we were to use this tool, would the community want to vendor the
> framework in the Postgres repository, or keep it in a separate
> repository that produces a versioned shared library?
>

If the library is going to actively evolve, we should bring it into the
tree. For a project like this, a "versioned shared library" is a massive
pain in the rear for both the consumer of such libraries and for their
maintainers.

Cheers,
Jesse




Re: cleanup & refactoring on reindexdb.c

2019-06-28 Thread Michael Paquier
On Fri, Jun 28, 2019 at 09:25:00AM +0200, Julien Rouhaud wrote:
> The patch does not apply anymore, so here's a rebased version.

Thanks for the rebase (and the reminder..).  I'll look at that once
v13 opens for business.
--
Michael




Re: Superfluous libpq-be.h include in GSSAPI code

2019-06-28 Thread Michael Paquier
On Fri, Jun 28, 2019 at 08:47:33PM +0200, Julien Rouhaud wrote:
> On Fri, Jun 28, 2019 at 4:37 PM Daniel Gustafsson  wrote:
>> backend/libpq/be-secure-gssapi.c is including both libpq-be.h and libpq.h,
>> which makes libpq-be.h superfluous as it gets included via libpq.h.  The
>> attached patch removes the inclusion of libpq-be.h to make be-secure-gssapi.c
>> behave like other files which need both headers.
> 
> LGTM.

Thanks, committed.  I looked at the surrounding area just in case, but did
not notice anything else strange.

(We have a kludge in hba.h with hbaPort to avoid including libpq-be.h;
I've got to wonder if we could do something about that..)
--
Michael




Re: [bug fix] Produce a crash dump before main() on Windows

2019-06-28 Thread Amit Kapila
On Tue, Nov 6, 2018 at 10:24 AM Haribabu Kommi  wrote:
>
> On Thu, Jul 26, 2018 at 3:52 PM Tsunakawa, Takayuki  wrote:
>
>
> Thanks for the confirmation that PostgreSQL runs as a service.
>
> Based on the following details, we can decide whether this fix is required
> or not:
> 1. Is starting the Postgres server using pg_ctl, without a service, a
> production use case?
> 2. Without this fix, how difficult is it to find out that something is
> preventing the server from starting? If it is easy to find out, it may be
> better to provide a troubleshooting guide for Windows users.
>
> I am in favor of a doc fix if the problem is easy to find, instead of
> making assumptions about how users run the server.
>

Tsunakawa/Haribabu - From briefly reading this thread, it seems we need
some more input from other developers on whether to fix this or not,
so ideally the status of this patch should be 'Needs Review'.  Why is it
in 'Waiting on Author' state?  If something is required from the
author, do we expect to see an updated patch in the next few
days?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [HACKERS] WAL logging problem in 9.4.3?

2019-06-28 Thread Amit Kapila
On Tue, May 28, 2019 at 4:33 AM Noah Misch  wrote:
>
> On Mon, May 27, 2019 at 02:08:26PM +0900, Kyotaro HORIGUCHI wrote:
> > At Fri, 24 May 2019 19:33:32 -0700, Noah Misch  wrote in 
> > <20190525023332.ge1624...@rfd.leadboat.com>
> > > On Mon, May 20, 2019 at 03:54:30PM +0900, Kyotaro HORIGUCHI wrote:
> > > > Following this direction, the attached PoC works *at least for*
> > > > the wal_optimization TAP tests, but doing pending flush not in
> > > > smgr but in relcache.
> > >
> > > This task, syncing files created in the current transaction, is not
> > > the kind of task normally assigned to a cache.  We already have a
> > > module, storage.c, that maintains state about files created in the
> > > current transaction.  Why did you use relcache instead of storage.c?
> >
> > The reason was that at-commit sync needs a buffer flush beforehand.  But
> > FlushRelationBufferWithoutRelCache() in v11 can do
> > that.  storage.c is reasonable as the place.
>
> Okay.  I do want this to work in 9.5 and later, but I'm not aware of a reason
> relcache.c would be a better code location in older branches.  Unless you
> think of a reason to prefer relcache.c, please use storage.c.
>
> > > On Tue, May 21, 2019 at 09:29:48PM +0900, Kyotaro HORIGUCHI wrote:
> > > > This is a tidier version of the patch.
> > >
> > > > - Move the substantial work to table/index AMs.
> > > >
> > > >   Each AM can decide whether to support WAL skip or not.
> > > >   Currently heap and nbtree support it.
> > >
> > > Why would an AM find it important to disable WAL skip?
> >
> > The reason is that currently it's the AM's responsibility to decide
> > whether to skip WAL or not.
>
> I see.  Skipping the sync would be a mere optimization; no AM would require it
> for correctness.  An AM might want RelationNeedsWAL() to keep returning true
> despite the sync happening, perhaps because it persists data somewhere other
> than the forks of pg_class.relfilenode.  Since the index and table APIs
> already assume one relfilenode captures all persistent data, I'm not seeing a
> use case for an AM overriding this behavior.  Let's take away the AM's
> responsibility for this decision, making the system simpler.  A future patch
> could let AM code decide, if someone finds a real-world use case for
> AM-specific logic around when to skip WAL.
>

It seems there is some feedback for this patch, and the CF is going to
start in 2 days.  Are you planning to work on this patch for the next CF?
If not, it is better to bump it.  It is not a good idea for a patch to sit
in "Waiting on Author" at the beginning of a CF unless the author is
actively working on it and is going to produce a new version in the next
few days.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Avoid full GIN index scan when possible

2019-06-28 Thread Tomas Vondra

On Fri, Jun 28, 2019 at 04:16:23PM -0400, Tom Lane wrote:

Tomas Vondra  writes:

On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:

I not only don't want that function in indxpath.c, I don't even want
it to be known/called from there.  If we need the ability for the index
AM to editorialize on the list of indexable quals (which I'm not very
convinced of yet), let's make an AM interface function to do it.



Wouldn't it be better to have a function that inspects a single qual and
says whether it's "optimizable" or not? That could be part of the AM
implementation, and we'd call it and it'd be us messing with the list.


Uh ... we already determined that the qual is indexable (ie is a member
of the index's opclass), or allowed the index AM to derive an indexable
clause from it, so I'm not sure what you envision would happen
additionally there.  If I understand what Julien is concerned about
--- and I may not --- it's that the set of indexable clauses *as a whole*
may have or lack properties of interest.  So I'm thinking the answer
involves some callback that can do something to the whole list, not
qual-at-a-time.  We've already got facilities for the latter case.



I'm not sure I understand the problem either.

I don't think "indexable" is the thing we care about here - in Julien's
original example the qual with '%a%' is indexable. And we probably want
to keep it that way.

The problem is that evaluating some of the quals may be inefficient with
a given index - but only if there are other quals. In Julien's example
it makes sense to just drop the '%a%' qual, but only when there are some
quals that work with the trigram index. But if there are no such 'good'
quals, it may be better to keep at least the bad ones.
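
For concreteness, a hypothetical sketch of that situation (table and
patterns invented here):

-- assuming a pg_trgm GIN index on t.col: the first pattern yields useful
-- trigrams, while '%a%' extracts no usable trigram and alone degenerates
-- into a full index scan; it could be dropped from the index condition
-- and left to the bitmap heap scan's recheck instead
SELECT * FROM t
WHERE col LIKE '%abcdef%'  -- 'good' qual, drives the index scan
  AND col LIKE '%a%';      -- 'bad' qual, recheck-only candidate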

So I think you're right we need to look at the list as a whole.


But that kinda resembles stuff we already have - selectivity/cost. So
why shouldn't this be considered as part of costing?


Yeah, I'm not entirely convinced that we need anything new here.
The cost estimate function can detect such situations, and so can
the index AM at scan start --- for example, btree checks for
contradictory quals at scan start.  There's a certain amount of
duplicative effort involved there perhaps, but you also have to
keep in mind that we don't know the values of run-time-determined
comparison values until scan start.  So if you want certainty rather
than just a cost estimate, you may have to do these sorts of checks
at scan start.



Right, that's why I suggested doing this as part of costing, but you're
right that scan start would be another option. I assume it should affect
cost estimates in some way, so the cost function would be my first choice.

But does the cost function really have enough info to make such a decision?
For example, ignoring quals is valid only if we recheck them later. For
GIN that's not an issue thanks to the bitmap index scan.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Avoid full GIN index scan when possible

2019-06-28 Thread Nikita Glukhov

Hi!

On 29.06.2019 1:23, Julien Rouhaud wrote:

But that kinda resembles stuff we already have - selectivity/cost. So
why shouldn't this be considered as part of costing?

Yeah, I'm not entirely convinced that we need anything new here.
The cost estimate function can detect such situations, and so can
the index AM at scan start --- for example, btree checks for
contradictory quals at scan start.  There's a certain amount of
duplicative effort involved there perhaps, but you also have to
keep in mind that we don't know the values of run-time-determined
comparison values until scan start.  So if you want certainty rather
than just a cost estimate, you may have to do these sorts of checks
at scan start.

Ah, I didn't know about _bt_preprocess_keys().  I'm not familiar with
this code, so please bear with me.  IIUC the idea would be to add
additional logic in gingetbitmap() / ginNewScanKey() to drop some
quals at runtime.  But that would mean that additional logic would
also be required in BitmapHeapScan, or that the whole returned bitmap
should be artificially marked as lossy to enforce a recheck?



We have a similar solution for this problem.  The idea is to avoid a full
index scan inside GIN itself when we have some GIN entries, and to forcibly
recheck all tuples if triconsistent() returns GIN_MAYBE for the keys that
emitted no GIN entries.

The attached patch in its current shape contains at least two ugly places:

1. We still need to initialize an empty scan key to call triconsistent(), but
   then we have to remove it from the list of scan keys.  Simple refactoring
   of ginFillScanKey() can be helpful here.

2. We need to replace GIN_SEARCH_MODE_EVERYTHING with GIN_SEARCH_MODE_ALL
   if there are no GIN entries and some key requested GIN_SEARCH_MODE_ALL,
   because we need to skip NULLs in GIN_SEARCH_MODE_ALL.  The simplest example
   here is "array @> '{}'": triconsistent() returns GIN_TRUE, recheck is not
   forced, and GIN_SEARCH_MODE_EVERYTHING returns NULLs that are not rechecked.
   Maybe it would be better to introduce a new
   GIN_SEARCH_MODE_EVERYTHING_NON_NULL.



Example:

CREATE TABLE test AS SELECT i::text AS t FROM generate_series(0, 999999) i;

CREATE INDEX ON test USING gin (t gin_trgm_ops);

-- master
EXPLAIN ANALYZE SELECT * FROM test WHERE t LIKE '%1234%' AND t LIKE '%1%';
  QUERY PLAN
--
 Bitmap Heap Scan on test  (cost=11777.99..16421.73 rows=7999 width=32) (actual 
time=65.431..65.857 rows=300 loops=1)
   Recheck Cond: ((t ~~ '%1234%'::text) AND (t ~~ '%1%'::text))
   Rows Removed by Index Recheck: 2
   Heap Blocks: exact=114
   ->  Bitmap Index Scan on test_t_idx  (cost=0.00..11775.99 rows=7999 width=0) 
(actual time=65.380..65.380 rows=302 loops=1)
 Index Cond: ((t ~~ '%1234%'::text) AND (t ~~ '%1%'::text))
 Planning Time: 0.151 ms
 Execution Time: 65.900 ms
(8 rows)


-- patched
EXPLAIN ANALYZE SELECT * FROM test WHERE t LIKE '%1234%' AND t LIKE '%1%';
  QUERY PLAN
---
 Bitmap Heap Scan on test  (cost=20.43..176.79 rows=42 width=6) (actual 
time=0.287..0.424 rows=300 loops=1)
   Recheck Cond: ((t ~~ '%1234%'::text) AND (t ~~ '%1%'::text))
   Rows Removed by Index Recheck: 2
   Heap Blocks: exact=114
   ->  Bitmap Index Scan on test_t_idx  (cost=0.00..20.42 rows=42 width=0) 
(actual time=0.271..0.271 rows=302 loops=1)
 Index Cond: ((t ~~ '%1234%'::text) AND (t ~~ '%1%'::text))
 Planning Time: 0.080 ms
 Execution Time: 0.450 ms
(8 rows)

--
Nikita Glukhov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

From 6e54ed98501e98ee0608702ce2e4adc56fcf354a Mon Sep 17 00:00:00 2001
From: Nikita Glukhov 
Date: Fri, 28 Jun 2019 23:18:39 +0300
Subject: [PATCH] Avoid full GIN index scan

---
 src/backend/access/gin/ginget.c  |  2 ++
 src/backend/access/gin/ginscan.c | 24 +++-
 src/backend/utils/adt/selfuncs.c | 12 +++-
 src/include/access/gin_private.h |  1 +
 4 files changed, 37 insertions(+), 2 deletions(-)

diff --git a/src/backend/access/gin/ginget.c b/src/backend/access/gin/ginget.c
index b18ae2b..317fa1f 100644
--- a/src/backend/access/gin/ginget.c
+++ b/src/backend/access/gin/ginget.c
@@ -1891,6 +1891,8 @@ gingetbitmap(IndexScanDesc scan, TIDBitmap *tbm)
 		if (!scanGetItem(scan, iptr, , ))
 			break;
 
+		recheck |= so->forcedRecheck;
+
 		if (ItemPointerIsLossyPage())
 			tbm_add_page(tbm, ItemPointerGetBlockNumber());
 		else
diff --git a/src/backend/access/gin/ginscan.c b/src/backend/access/gin/ginscan.c
index 74d9821..a32a6ff 100644
--- a/src/backend/access/gin/ginscan.c
+++ b/src/backend/access/gin/ginscan.c
@@ -141,7 +141,8 @@ ginFillScanKey(GinScanOpaque so, OffsetNumber attnum,
 	

Re: [PATCH] Implement uuid_version()

2019-06-28 Thread Peter Geoghegan
On Mon, Apr 8, 2019 at 11:04 PM Peter Eisentraut  wrote:
> Yeah, I think implementing a v4 generator in core would be trivial and
> address almost everyone's requirements.

FWIW, I think that we could do better with nbtree page splits given
sequential UUIDs of one form or another [1]. We could teach
nbtsplitloc.c to pack leaf pages full of UUIDs in the event of the
user using sequential UUIDs. With a circular UUID prefix, I think
you'll run into an issue similar to the issue that was addressed by
the "split after new tuple" optimization -- most leaf pages end up 50%
full. I've not verified this, but I can't see why it would be any
different to other multimodal key space with sequential insertions
that are grouped. Detecting this in UUIDs may or may not require
opclass infrastructure. Either way, I'm not likely to work on it until
there is a clear target, such as a core or contrib sequential UUID
generator. Though I am looking at various ways to improve
nbtsplitloc.c for Postgres 13 -- I suspect that additional wins are
possible.

Any sequential UUID scheme will already have far fewer problems with
indexing today, since random UUIDs are *dreadful*, but I can imagine
doing quite a lot better still. Application developers love UUIDs. We
should try to meet them where they are.

[1] https://www.2ndquadrant.com/en/blog/sequential-uuid-generators/
-- 
Peter Geoghegan




Re: Avoid full GIN index scan when possible

2019-06-28 Thread Julien Rouhaud
On Fri, Jun 28, 2019 at 10:16 PM Tom Lane  wrote:
>
> Tomas Vondra  writes:
> > On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
> >> I not only don't want that function in indxpath.c, I don't even want
> >> it to be known/called from there.  If we need the ability for the index
> >> AM to editorialize on the list of indexable quals (which I'm not very
> >> convinced of yet), let's make an AM interface function to do it.
>
> > Wouldn't it be better to have a function that inspects a single qual and
> > says whether it's "optimizable" or not? That could be part of the AM
> > implementation, and we'd call it and it'd be us messing with the list.
>
> Uh ... we already determined that the qual is indexable (ie is a member
> of the index's opclass), or allowed the index AM to derive an indexable
> clause from it, so I'm not sure what you envision would happen
> additionally there.  If I understand what Julien is concerned about
> --- and I may not --- it's that the set of indexable clauses *as a whole*
> may have or lack properties of interest.  So I'm thinking the answer
> involves some callback that can do something to the whole list, not
> qual-at-a-time.  We've already got facilities for the latter case.

Yes, the root issue here is that with gin it's entirely possible that
"WHERE sometable.col op value1" is way more efficient than "WHERE
sometable.col op value1 AND sometable.col op value2", where both quals
are determined indexable by the opclass.  The only way to avoid that
is indeed to inspect the whole list, as done in this poor POC.

This is a problem actually hit in production, and as far as I know
there's no easy way from the application POV to prevent unexpected
slowdown.  Maybe Marc will have more details about the actual problem
and how expensive such a case was compared to the normal ones.

> > But that kinda resembles stuff we already have - selectivity/cost. So
> > why shouldn't this be considered as part of costing?
>
> Yeah, I'm not entirely convinced that we need anything new here.
> The cost estimate function can detect such situations, and so can
> the index AM at scan start --- for example, btree checks for
> contradictory quals at scan start.  There's a certain amount of
> duplicative effort involved there perhaps, but you also have to
> keep in mind that we don't know the values of run-time-determined
> comparison values until scan start.  So if you want certainty rather
> than just a cost estimate, you may have to do these sorts of checks
> at scan start.

Ah, I didn't know about _bt_preprocess_keys().  I'm not familiar with
this code, so please bear with me.  IIUC the idea would be to add
additional logic in gingetbitmap() / ginNewScanKey() to drop some
quals at runtime.  But that would mean that additional logic would
also be required in BitmapHeapScan, or that the whole returned bitmap
should be artificially marked as lossy to enforce a recheck?




Re: C testing for Postgres

2019-06-28 Thread David Fetter
On Fri, Jun 28, 2019 at 09:42:54AM -0400, Adam Berlin wrote:
> Here are my takeaways from the previous discussions:
> 
> * there *is* interest in testing

Yep.

> * we shouldn't take it too far
> * there are already tests being written under `src/test/modules`, but
> without a consistent way of describing expectations and displaying results

This is a giant problem.

> * no tool was chosen

If there's a way to get this in the tree, assuming people agree it
should be there, that'd be fantastic.

Our current system has been creaking for years.

Best,
David.
-- 
David Fetter  http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate




Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread David Steele

On 6/28/19 1:15 PM, Tom Lane wrote:

Stephen Frost  writes:

sh, don't look now, but there might be a "Resend email" button in
the archives now that you can click to have an email sent to you...


Oooh, lovely.


(thank you Magnus)


+many


Thank you, Magnus, this is really helpful!

--
-David
da...@pgmasters.net




Re: BUG #15293: Stored Procedure Triggered by Logical Replication is Unable to use Notification Events

2019-06-28 Thread Alvaro Herrera
On 2019-Feb-22, Robert Welin wrote:

> I have reproduced this bug on PostgreSQL version 10.7 and 11.2 using the
> steps described in Michael Powers' original report. The issue also still
> seems to be present even with the patch provided by Sergei Kornilov.
> 
> Are there plans to address this issue any time soon or is there some way
> I can assist in fixing it? It would be great to have notifications from
> logical replication.

This issue seems largely forgotten about :-(

-- 
Álvaro Herrera         https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Comment typo in tableam.h

2019-06-28 Thread Amit Kapila
On Mon, Jun 24, 2019 at 11:26 PM Ashwin Agrawal  wrote:
>
> On Mon, Jun 3, 2019 at 5:24 PM Ashwin Agrawal  wrote:
>>
>> There were a few more minor typos I had collected for table am; passing
>> them along here.
>>
>> Some of the required callback functions are missing Assert checking
>> (minor thing); adding them in a separate patch.
>
>
> Curious to know if we need to register such small typo-fixing and
> assertion-adding patches in the commitfest as well?
>

Normally, such things are handled outside the CF, but to avoid forgetting,
we can register it.  However, this particular item is better suited to 'Open
Items'[1].  You can remove the objectionable part of the comment,
other things in your patch look good to me.  If nobody else picks it
up, I will take care of it.

[1] - https://wiki.postgresql.org/wiki/PostgreSQL_12_Open_Items

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Avoid full GIN index scan when possible

2019-06-28 Thread Tom Lane
Tomas Vondra  writes:
> On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:
>> I not only don't want that function in indxpath.c, I don't even want
>> it to be known/called from there.  If we need the ability for the index
>> AM to editorialize on the list of indexable quals (which I'm not very
>> convinced of yet), let's make an AM interface function to do it.

> Wouldn't it be better to have a function that inspects a single qual and
> says whether it's "optimizable" or not? That could be part of the AM
> implementation, and we'd call it and it'd be us messing with the list.

Uh ... we already determined that the qual is indexable (ie is a member
of the index's opclass), or allowed the index AM to derive an indexable
clause from it, so I'm not sure what you envision would happen
additionally there.  If I understand what Julien is concerned about
--- and I may not --- it's that the set of indexable clauses *as a whole*
may have or lack properties of interest.  So I'm thinking the answer
involves some callback that can do something to the whole list, not
qual-at-a-time.  We've already got facilities for the latter case.

> But that kinda resembles stuff we already have - selectivity/cost. So
> why shouldn't this be considered as part of costing?

Yeah, I'm not entirely convinced that we need anything new here.
The cost estimate function can detect such situations, and so can
the index AM at scan start --- for example, btree checks for
contradictory quals at scan start.  There's a certain amount of
duplicative effort involved there perhaps, but you also have to
keep in mind that we don't know the values of run-time-determined
comparison values until scan start.  So if you want certainty rather
than just a cost estimate, you may have to do these sorts of checks
at scan start.

regards, tom lane




Re: COPY FREEZE and setting PD_ALL_VISIBLE/visibility map bits

2019-06-28 Thread Amit Kapila
On Thu, Jun 27, 2019 at 11:02 AM Pavan Deolasee  wrote:
>
>>> On 2019-04-07 18:04:27 -0700, Andres Freund wrote:
>>> > Here's a *prototype* patch for this.  It only implements what I
>>> > described for heap_multi_insert, not for plain inserts. I wanted to see
>>> > what others think before investing additional time.
>>>
>>> Pavan, are you planning to work on this for v13 CF1? Or have you lost
>>> interest on the topic?
>>
>>
>> Yes, I plan to work on it.
>>
>
> I am sorry, but I am not able to find time to get back to this because of 
> other high priority items. If it still remains unaddressed in the next few 
> weeks, I will pick it up again. But for now, I am happy if someone wants to 
> pick and finish the work.
>

Fair enough, I have marked the entry [1] in the coming CF as "Returned
with Feedback".  I hope that is okay with you.

[1] - https://commitfest.postgresql.org/23/2009/

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Avoid full GIN index scan when possible

2019-06-28 Thread Tomas Vondra

On Fri, Jun 28, 2019 at 03:03:19PM -0400, Tom Lane wrote:

Julien Rouhaud  writes:

On Fri, Jun 28, 2019 at 6:10 PM Tomas Vondra  wrote:

2) I'm not sure it's a good idea to add dependency on a specific AM type
into indxpath.c. At the moment there are only two places, both referring
to BTREE_AM_OID; do we really want to hard-code another OID here?

I wonder if this could be generalized to another support proc in the
index AM API, with just GIN implementing it.



Yes, this patch was more a POC than anything, to discuss the approach
before spending too much time on infrastructure.  I considered another
support function, but I'm still unclear of how useful it'd be for
custom AM (as I don't see any use for that for the vanilla one I
think), or whether this should be opclass-specific or not.


I just spent a lot of sweat to get rid of (most of) indxpath.c's knowledge
about specific AMs' capabilities; I'd be very sad if we started to put any
back.  Aside from being a modularity violation, it's going to fall foul
of the principle that if index AM X wants something, some index AM Y is
going to want it too, eventually.

Also, I'm quite unhappy about including index_selfuncs.h into indxpath.c
at all, never mind whether you got the alphabetical ordering right.
I have doubts still about how we ought to refactor the mess that is
*selfuncs.c, but this isn't going in the right direction.



Right.


3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
as it's only for functions computing selectivity estimates (and funcs
directly related to that). And the new function is not related to that
at all, so why not define it in indxpath.c directly?


I not only don't want that function in indxpath.c, I don't even want
it to be known/called from there.  If we need the ability for the index
AM to editorialize on the list of indexable quals (which I'm not very
convinced of yet), let's make an AM interface function to do it.



Wouldn't it be better to have a function that inspects a single qual and
says whether it's "optimizable" or not? That could be part of the AM
implementation, and we'd call it and it'd be us messing with the list.

That being said, is this really a binary thing - if you have a value
that matches 99% of rows, that probably is not much better than a full
scan.  It may be more difficult to decide (compared to the 'short
trigram' case), but perhaps we should allow that too? Essentially,
instead of 'optimizable' returning true/false, it might return a value
between 0.0 and 1.0, as a measure of 'optimizability'.
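
As a sketch, such a hook might look like this (entirely hypothetical; the
name and signature are invented here, not taken from any patch):

/* hypothetical IndexAmRoutine member: per-clause "optimizability" score,
 * where 0.0 means no better than a full index scan and 1.0 fully useful */
typedef double (*amclausescore_function) (PlannerInfo *root,
                                          IndexOptInfo *index,
                                          IndexClause *iclause);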

But that kinda resembles stuff we already have - selectivity/cost. So
why shouldn't this be considered as part of costing? That is, could
gincostestimate look at the index quals and decide what will be used for
scanning the index? Of course, this would make the logic GIN-specific,
and other index AMs would have to implement their own version ...


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Avoid full GIN index scan when possible

2019-06-28 Thread Tom Lane
Julien Rouhaud  writes:
> On Fri, Jun 28, 2019 at 6:10 PM Tomas Vondra  wrote:
>> 2) I'm not sure it's a good idea to add dependency on a specific AM type
>> into indxpath.c. At the moment there are only two places, both referring
>> to BTREE_AM_OID; do we really want to hard-code another OID here?
>> 
>> I wonder if this could be generalized to another support proc in the
>> index AM API, with just GIN implementing it.

> Yes, this patch was more a POC than anything, to discuss the approach
> before spending too much time on infrastructure.  I considered another
> support function, but I'm still unclear of how useful it'd be for
> custom AM (as I don't see any use for that for the vanilla one I
> think), or whether this should be opclass-specific or not.

I just spent a lot of sweat to get rid of (most of) indxpath.c's knowledge
about specific AMs' capabilities; I'd be very sad if we started to put any
back.  Aside from being a modularity violation, it's going to fall foul
of the principle that if index AM X wants something, some index AM Y is
going to want it too, eventually.

Also, I'm quite unhappy about including index_selfuncs.h into indxpath.c
at all, never mind whether you got the alphabetical ordering right.
I have doubts still about how we ought to refactor the mess that is
*selfuncs.c, but this isn't going in the right direction.

>> 3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
>> as it's only for functions computing selectivity estimates (and funcs
>> directly related to that). And the new function is not related to that
>> at all, so why not define it in indxpath.c directly?

I not only don't want that function in indxpath.c, I don't even want
it to be known/called from there.  If we need the ability for the index
AM to editorialize on the list of indexable quals (which I'm not very
convinced of yet), let's make an AM interface function to do it.

BTW, I have no idea what you think you're doing here by messing with
outer_relids, but it's almost certainly wrong, and if it isn't wrong
then it needs a comment explaining itself.

regards, tom lane




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Peter Geoghegan
On Tue, Mar 19, 2019 at 12:38 PM legrand legrand  wrote:
> Would it make sense to add it in auto explain ?
> I don't know for explain itself, but maybe ...

I think that it should appear in EXPLAIN. pg_stat_statements already
cannot have a query hash of zero, so it might be okay to display it
only when its value is non-zero.

-- 
Peter Geoghegan




Re: Superfluous libpq-be.h include in GSSAPI code

2019-06-28 Thread Julien Rouhaud
On Fri, Jun 28, 2019 at 4:37 PM Daniel Gustafsson  wrote:
>
> backend/libpq/be-secure-gssapi.c is including both libpq-be.h and libpq.h,
> which makes libpq-be.h superfluous as it gets included via libpq.h.  The
> attached patch removes the inclusion of libpq-be.h to make be-secure-gssapi.c
> behave like other files which need both headers.

LGTM.




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Peter Geoghegan
On Tue, Mar 19, 2019 at 12:00 PM Robert Haas  wrote:
> On the other hand, it also appears that a lot of people would be very,
> very happy to just be able to see the query ID field that already
> exists, both in pg_stat_statements and in pg_stat_activity, and we
> shouldn't throw up unnecessary impediments in the way of making that
> happen, at least IMHO.

+1.

pg_stat_statements will already lose all the statistics that it
aggregated in the event of a hard crash. The trade-off that the query
jumbling logic makes is not a bad one, all things considered.

-- 
Peter Geoghegan




Re: [HACKERS] Regression tests vs existing users in an installation

2019-06-28 Thread Tom Lane
OK, here's a completed patch to add checking for naming-rule violations.

I updated regress.sgml to clarify the naming rules (and failed to
resist the temptation to update a lot of other somewhat-obsolete
statements there, too).

Also worth noting is that I added an IsReservedName check to
pg_replication_origin_create(), ie it will no longer allow
user-selected origin names starting with "pg_".  This seems
like a good idea considering that we generate internal origin
names that look like that.  I have not done anything equivalent
for subscription names, but should we consider doing so?
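
The check is presumably along these lines (a sketch using the existing
IsReservedName() helper; the variable and the message wording here are
illustrative and may differ from the actual patch):

/* in pg_replication_origin_create(): disallow "pg_"-prefixed names */
if (IsReservedName(name))
	ereport(ERROR,
			(errcode(ERRCODE_RESERVED_NAME),
			 errmsg("replication origin name \"%s\" is reserved", name),
			 errdetail("Origin names starting with \"pg_\" are reserved.")));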

I touched the TAP tests only to the extent necessary to make them pass
cleanly --- mostly I had to fool with pg_dump/t/010_dump_connstr.pl
because it whines if the restore step emits any warnings.  Perhaps
there's another way to address that with less invasive changes to that
test script; but I couldn't think of one other than ignoring warnings,
which didn't seem like a great idea.

This patch doesn't address the issue that we think rolenames.sql is
unsafe to run in "make installcheck" mode.  That seems like a separable
problem, and we'd have to adjust that script as shown here anyway.

Barring objections, I'll push this shortly (to HEAD only) and turn on
the enabling switch on one or two of my buildfarm critters.

regards, tom lane

diff --git a/doc/src/sgml/regress.sgml b/doc/src/sgml/regress.sgml
index 448e2d1..7b68213 100644
--- a/doc/src/sgml/regress.sgml
+++ b/doc/src/sgml/regress.sgml
@@ -47,7 +47,7 @@ make check
 
 
 ===
- All 115 tests passed.
+ All 193 tests passed.
 ===
 
 
@@ -98,7 +98,7 @@ make MAX_CONNECTIONS=10 check
 
   
To run the tests after installation (see ),
-   initialize a data area and start the
+   initialize a data directory and start the
server as explained in , then type:
 
 make installcheck
@@ -116,10 +116,10 @@ make installcheck-parallel
 
   
The tests will also transiently create some cluster-wide objects, such as
-   roles and tablespaces.  These objects will have names beginning with
-   regress_.  Beware of using installcheck
-   mode in installations that have any actual users or tablespaces named
-   that way.
+   roles, tablespaces, and subscriptions.  These objects will have names
+   beginning with regress_.  Beware of
+   using installcheck mode with an installation that has
+   any actual global objects named that way.
   
   
 
@@ -130,7 +130,7 @@ make installcheck-parallel
The make check and make installcheck commands
run only the core regression tests, which test built-in
functionality of the PostgreSQL server.  The source
-   distribution also contains additional test suites, most of them having
+   distribution contains many additional test suites, most of them having
to do with add-on functionality such as optional procedural languages.
   
 
@@ -146,9 +146,24 @@ make installcheck-world
already-installed server, respectively, just as previously explained
for make check and make installcheck.  Other
considerations are the same as previously explained for each method.
-   Note that make check-world builds a separate temporary
-   installation tree for each tested module, so it requires a great deal
-   more time and disk space than make installcheck-world.
+   Note that make check-world builds a separate instance
+   (temporary data directory) for each tested module, so it requires more
+   time and disk space than make installcheck-world.
+  
+
+  
+   On a modern machine with multiple CPU cores and no tight operating-system
+   limits, you can make things go substantially faster with parallelism.
+   The recipe that most PostgreSQL developers actually use for running all
+   tests is something like
+
+make check-world -j8 >/dev/null
+
+   with a -j limit near to or a bit more than the number
+   of available cores.  Discarding stdout
+   eliminates chatter that's not interesting when you just want to verify
+   success.  (In case of failure, the stderr
+   messages are usually enough to determine where to look closer.)
   
 
   
@@ -166,8 +181,7 @@ make installcheck-world
   

 
- Regression tests for optional procedural languages (other than
- PL/pgSQL, which is tested by the core tests).
+ Regression tests for optional procedural languages.
  These are located under src/pl.
 

@@ -186,27 +200,49 @@ make installcheck-world


 
+ Tests for core-supported authentication methods,
+ located in src/test/authentication.
+ (See below for additional authentication-related tests.)
+
+   
+   
+
  Tests stressing behavior of concurrent sessions,
  located in src/test/isolation.
 


 
- Tests of client programs under src/bin.  See
- also .
+ Tests for crash recovery and physical replication,
+ located in src/test/recovery.
+
+   
+   
+
+ Tests for logical 

Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Luis Carril
> Restoring the content of an FDW table via pg_restore or psql can be
> dangerous - there I see a risk, and it would be nice to allow it only with
> some form of safeguard.
>
> I think the important question is the motivation for dumping FDW data:
> a) cloning (makes sense to me, and is safe), b) real backup (requires a
> writable FDW) - makes sense too, but I see a possibility of unwanted
> problems.

What about providing a list of FDW servers instead of an all-or-nothing
option?  That way the user really has to make a conscious decision to dump
the content of the foreign tables for a specific server, which would allow
distinguishing between multiple FDWs used in the same DB.  Also, I think it
is the user's responsibility to know whether the FDWs being used are
read-only or not.
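
For example, a dump invocation could then look like this (the option name
is hypothetical, invented here for illustration):

pg_dump --include-foreign-data=my_fdw_server mydb > dump.sql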

Cheers
Luis M Carril




Re: C testing for Postgres

2019-06-28 Thread Adam Berlin
Here are my takeaways from the previous discussions:

* there *is* interest in testing
* we shouldn't take it too far
* there are already tests being written under `src/test/modules`, but
without a consistent way of describing expectations and displaying results
* no tool was chosen

If we were to use this tool, would the community want to vendor the
framework in the Postgres repository, or keep it in a separate repository
that produces a versioned shared library?

On Fri, Jun 28, 2019 at 5:57 AM Dmitry Dolgov <9erthali...@gmail.com> wrote:

> > On Fri, Jun 28, 2019 at 11:38 AM Adam Berlin  wrote:
> >
> > During the unconference at PGCon in Ottawa, I asked about writing C-based
> > tests for Postgres. There was interest in trying a tool and also some
> > hesitation to depend on a third-party library. So, I wrote a tool that
> > I'd like to contribute to Postgres. I’ve been calling it cexpect [1].
>
> Cool, thanks!
>
> > Rather than posting a patch, I'd like to start a conversation first. I'm
> > guessing there are some improvements that we'd want to make (for example:
> > the Makefile) before committing a patch. Let's iterate on improvements
> > before creating a formal patch.
>
> Just to mention, there were similar discussions already in the past
> ([1], [2]), with some concerns being raised, but apparently without any
> visible results.
>
> [1]:
> https://www.postgresql.org/message-id/flat/CAEepm%3D2heu%2B5zwB65jWap3XY-UP6PpJZiKLQRSV2UQH9BmVRXQ%40mail.gmail.com
> [2]:
> https://www.postgresql.org/message-id/flat/Pine.LNX.4.58.0410111044030.14840%40linuxworld.com.au
>


Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread Tom Lane
Stephen Frost  writes:
> sh, don't look now, but there might be a "Resend email" button in
> the archives now that you can click to have an email sent to you...

Oooh, lovely.

> (thank you Magnus)

+many

regards, tom lane




Re: SimpleLruTruncate() mutual exclusion

2019-06-28 Thread Noah Misch
On Sun, Feb 17, 2019 at 11:31:03PM -0800, Noah Misch wrote:
> I'm forking this thread from
> https://postgr.es/m/flat/20190202083822.gc32...@gust.leadboat.com, which
> reported a race condition involving the "apparent wraparound" safety check in
> SimpleLruTruncate():
> 
> On Wed, Feb 13, 2019 at 11:26:23PM -0800, Noah Misch wrote:
> > 1. The result of the test is valid only until we release the SLRU 
> > ControlLock,
> >which we do before SlruScanDirCbDeleteCutoff() uses the cutoff to 
> > evaluate
> >segments for deletion.  Once we release that lock, latest_page_number can
> >advance.  This creates a TOCTOU race condition, allowing excess deletion:
> > 
> >[local] test=# table trunc_clog_concurrency ;
> >ERROR:  could not access status of transaction 2149484247
> >DETAIL:  Could not open file "pg_xact/0801": No such file or directory.
> 
> > Fixes are available:
> 
> > b. Arrange so only one backend runs vac_truncate_clog() at a time.  Other 
> > than
> >AsyncCtl, every SLRU truncation appears in vac_truncate_clog(), in a
> >checkpoint, or in the startup process.  Hence, also arrange for only one
> >backend to call SimpleLruTruncate(AsyncCtl) at a time.
> 
> More specifically, restrict vac_update_datfrozenxid() to one backend per
> database, and restrict vac_truncate_clog() and asyncQueueAdvanceTail() to one
> backend per cluster.  This, attached, was rather straightforward.

Rebased.  The conflicts were limited to comments and documentation.
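
In outline, the serialization this adds looks like the following (a sketch
based on the lock names in the diff below, not the literal patch hunks):

	/* vac_truncate_clog(): at most one backend per cluster past here */
	LWLockAcquire(WrapLimitsVacuumLock, LW_EXCLUSIVE);
	/* ... compute the cutoff and call SimpleLruTruncate() while holding
	 * the lock, so concurrent truncators cannot race past the cutoff ... */
	LWLockRelease(WrapLimitsVacuumLock);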

> I wonder about performance in a database with millions of small relations,
> particularly considering my intent to back-patch this.  In such databases,
> vac_update_datfrozenxid() can be a major part of the VACUUM's cost.  Two
> things work in our favor.  First, vac_update_datfrozenxid() runs once per
> VACUUM command, not once per relation.  Second, Autovacuum has this logic:
> 
>* ... we skip
>* this if (1) we found no work to do and (2) we skipped at least one
>* table due to concurrent autovacuum activity.  In that case, the other
>* worker has already done it, or will do so when it finishes.
>*/
>   if (did_vacuum || !found_concurrent_worker)
>   vac_update_datfrozenxid();
> 
> That makes me relatively unworried.  I did consider some alternatives:
> 
> - Run vac_update_datfrozenxid()'s pg_class scan before taking a lock.  If we
>   find the need for pg_database updates, take the lock and scan pg_class again
>   to get final numbers.  This doubles the work in cases that end up taking the
>   lock, so I'm not betting it being a net win.
> 
> - Use LockWaiterCount() to skip vac_update_datfrozenxid() if some other
>   backend is already waiting.  This is similar to Autovacuum's
>   found_concurrent_worker test.  It is tempting.  I'm not proposing it,
>   because it changes the states possible when manual VACUUM completes.  Today,
>   you can predict post-VACUUM datfrozenxid from post-VACUUM relfrozenxid
>   values.  If manual VACUUM could skip vac_update_datfrozenxid() this way,
>   datfrozenxid could lag until some concurrent VACUUM finishes.
commit 647b5e3 (HEAD)
Author: Noah Misch 
AuthorDate: Fri Jun 28 10:00:53 2019 -0700
Commit: Noah Misch 
CommitDate: Fri Jun 28 10:00:53 2019 -0700

Prevent concurrent SimpleLruTruncate() for any given SLRU.

The SimpleLruTruncate() header comment states the new coding rule.  This
closes race conditions that presented rare opportunities for data loss.
Symptoms and recommendations to users mirror the previous commit[1].
Back-patch to 9.4 (all supported versions).

Reviewed by TBD.

Discussion: https://postgr.es/m/TBD

[1] slru-truncate-modulo-v2.patch of 
https://postgr.es/m/20190217040913.ga1305...@rfd.leadboat.com

diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index bf72d0c..dc82008 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -885,7 +885,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34 
  0:00 postgres: ser
 
   

-LWLock
+LWLock
 ShmemIndexLock
 Waiting to find or allocate space in shared memory.

@@ -1080,6 +1080,16 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  the oldest transaction id available to it.
 
 
+ WrapLimitsVacuumLock
+ Waiting to update limits on transaction id and multixact
+ consumption.
+
+
+ AsyncQueueTailLock
+ Waiting to update limit on notification message
+ storage.
+
+
  clog
  Waiting for I/O on a clog (transaction status) buffer.
 
@@ -1169,7 +1179,7 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   
11:34   0:00 postgres: ser
  counters during Parallel Hash plan execution.
 
 
- Lock
+ Lock
  relation

Re: Avoid full GIN index scan when possible

2019-06-28 Thread Julien Rouhaud
On Fri, Jun 28, 2019 at 6:10 PM Tomas Vondra  wrote:
>
> I've briefly looked at the patch today. I think the idea is worthwhile,

Thanks!

> 2) I'm not sure it's a good idea to add dependency on a specific AM type
> into indxpath.c. At the moment there are only two places, both referring
> to BTREE_AM_OID; do we really want to hard-code another OID here?
>
> I wonder if this could be generalized to another support proc in the
> index AM API, with just GIN implementing it.

Yes, this patch was more a POC than anything, to discuss the approach
before spending too much time on infrastructure.  I considered another
support function, but I'm still unclear of how useful it'd be for
custom AM (as I don't see any use for that for the vanilla one I
> think), or whether this should be opclass-specific or not.

> 3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
> as it's only for functions computing selectivity estimates (and funcs
> directly related to that). And the new function is not related to that
> at all, so why not define it in indxpath.c directly?

I kept this function in selfuncs.c as it's using some private
functions (gincost_opexpr and gincost_scalararrayopexpr) used by
gincostestimate.  That seemed the simplest approach at this stage.
BTW there's also an ongoing discussion about moving the (am)estimate
functions into AM-specific files [1], so that'll directly impact this
too.

> 4) The name gin_get_optimizable_quals is quite misleading. Firstly, it's not
> very obvious what "optimizable" means in this context, but that's a
> minor issue. The bigger issue is that it's a lie - when there are no
> "optimizable" clauses (so all clauses would require full scan) the
> function returns the original list, which is by definition completely
> non-optimizable.

The comment is hopefully clearer about what this function does, but
definitely this name is terrible.  I'll try to come up with a better
one.

[1] https://www.postgresql.org/message-id/4079.1561661677%40sss.pgh.pa.us




Re: Avoid full GIN index scan when possible

2019-06-28 Thread Tomas Vondra

Hi,

I've briefly looked at the patch today. I think the idea is worthwhile,
but I found a couple of issues with the patch:


1) The index_selfuncs.h header is included in the wrong place; it should
be included before lsyscache.h (because 'i' < 'l').


2) I'm not sure it's a good idea to add dependency on a specific AM type
into indxpath.c. At the moment there are only two places, both referring
to BTREE_AM_OID; do we really want to hard-code another OID here?

I wonder if this could be generalized to another support proc in the
index AM API, with just GIN implementing it.


3) selfuncs.c is hardly the right place for gin_get_optimizable_quals,
as it's only for functions computing selectivity estimates (and funcs
directly related to that). And the new function is not related to that
at all, so why not define it in indxpath.c directly?

Of course, if it gets into the index AM API then this would disappear.


4) The name gin_get_optimizable_quals is quite misleading. Firstly, it's not
very obvious what "optimizable" means in this context, but that's a
minor issue. The bigger issue is that it's a lie - when there are no
"optimizable" clauses (so all clauses would require full scan) the
function returns the original list, which is by definition completely
non-optimizable.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 





Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Pavel Stehule
On Fri, 28 Jun 2019 at 17:30, Tom Lane  wrote:

> Daniel Gustafsson  writes:
> >> On 28 Jun 2019, at 16:49, Luis Carril  wrote:
> >> pg_dump ignores the dumping of data in foreign tables
> >> on purpose; this patch makes it optional, as the user may want
> >> to manage the data in the foreign servers directly from
> >> Postgres. Opinions?
>
> > Wouldn’t that have the potential to make restores awkward for FDWs that
> > aren’t writeable?
>
> Yeah, I think the feature as-proposed is a shotgun that's much more likely
> to cause problems than solve them.  Almost certainly, what people would
> really need is the ability to dump individual foreign tables' data not
> everything.  (I also note that the main reason for "dump everything",
> namely to get a guaranteed-consistent snapshot, isn't really valid for
> foreign tables anyhow.)
>

I agree the major usage is dumping data. But some transformation from a
foreign table to a classic table could also be interesting (when the schema
was created by IMPORT FOREIGN SCHEMA).


> I'm tempted to suggest that the way to approach this is to say that if you
> explicitly select some foreign table(s) with "-t", then we'll dump their
> data, unless you suppress that with "-s".  No new switch needed.
>
> Another way of looking at it, which responds more directly to Daniel's
> point about non-writable FDWs, could be to have a switch that says "dump
> foreign tables' data if their FDW is one of these".
>

Restoring the content of an FDW table via pg_restore or psql can be
dangerous - there I see a risk, and it would be nice to allow it only with
some form of safeguard.

I think the important question is the motivation for dumping FDW data:
a) cloning (makes sense to me, and is safe), b) real backup (requires a
writable FDW) - makes sense too, but I see a possibility of unwanted
problems.

Regards

Pavel

>
> regards, tom lane
>
>
>


Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread Stephen Frost
Greetings,

* Thomas Munro (thomas.mu...@gmail.com) wrote:
> If the thread for a CF entry began
> before you were subscribed, you might be able to download the whole
> thread as a mailbox file and import it into your email client so that
> you can reply to the thread; if you can't do that (it can be
> tricky/impossible on some email clients), ping me and I'll CC you so
> you can reply.

sh, don't look now, but there might be a "Resend email" button in
the archives now that you can click to have an email sent to you...

Note that you have to be logged in, and the email will go to the email
address that you're logging into the community auth system with.

(thank you Magnus)

Thanks!

Stephen




Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Julien Rouhaud
On Fri, Jun 28, 2019 at 4:39 PM Julien Rouhaud  wrote:
>
> On Tue, Mar 19, 2019 at 3:51 PM Julien Rouhaud  wrote:
> >
> > On Mon, Mar 18, 2019 at 7:33 PM Julien Rouhaud  wrote:
> > >
> > > On Mon, Mar 18, 2019 at 6:23 PM Yun Li  wrote:
> > > >
> > > > Let's take one step back. Since queryId is stored in core, as Julien
> > > > pointed out, can we just add that global to pg_stat_get_activity and
> > > > ultimately expose it in the pg_stat_activity view? Then no matter
> > > > whether PGSS is on or off, or however the customer extensions are
> > > > updating that field, we expose that field in that view and enable
> > > > users to leverage that id to join with pgss or their extension.
> > > > Does this sound like a good idea?
> > >
> > > I'd greatly welcome expose queryid exposure in pg_stat_activity, and
> > > also in log_line_prefix.  I'm afraid that it's too late for pg12
> > > inclusion, but I'll be happy to provide a patch for that for pg13.
> >
> > Here's a prototype patch for queryid exposure in pg_stat_activity and
> > log_line_prefix.
>
> Patch doesn't apply anymore, PFA rebased v2.

Sorry, I missed the new pg_stat_gssapi view.
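
For reference, with the patch below applied the new column and escape could
be used roughly like this (illustrative usage, not part of the patch):

SELECT pid, queryid, state, query FROM pg_stat_activity;

-- and in postgresql.conf, using the new %Q escape:
-- log_line_prefix = '%m [%p] %Q '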
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..d68b492c25 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6353,6 +6353,11 @@ local0.*/var/log/postgresql
  session processes
  no
 
+
+ %Q
+ queryid: identifier of session's current query, if any
+ yes
+
 
  %%
  Literal %
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index bf72d0c303..d4e3d70933 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -824,6 +824,19 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  xid
  The current backend's xmin horizon.
 
+
+ queryid
+ bigint
+ Identifier of this backend's most recent query. If
+  state is active this field
+  shows the identifier of the currently executing query. In all other
+  states, it shows the identifier of the last query that was executed, unless
+  an error occurred, which resets this field to 0.  By default, query
+  identifiers are not computed, so this field will always display 0, unless
+  an additional module that computes query identifiers, such as pg_stat_statements, is configured.
+ 
+
 
  query
  text
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ea4c85e395..f30098c2cd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -749,6 +749,7 @@ CREATE VIEW pg_stat_activity AS
 S.state,
 S.backend_xid,
 s.backend_xmin,
+S.queryid,
 S.query,
 S.backend_type
 FROM pg_stat_get_activity(NULL) AS S
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27f0345515..44c9525a59 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -143,6 +143,8 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
 void
 ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorStart_hook)
 		(*ExecutorStart_hook) (queryDesc, eflags);
 	else
@@ -303,6 +305,8 @@ ExecutorRun(QueryDesc *queryDesc,
 			ScanDirection direction, uint64 count,
 			bool execute_once)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorRun_hook)
 		(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
 	else
@@ -402,6 +406,8 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 void
 ExecutorFinish(QueryDesc *queryDesc)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorFinish_hook)
 		(*ExecutorFinish_hook) (queryDesc);
 	else
@@ -462,6 +468,8 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 void
 ExecutorEnd(QueryDesc *queryDesc)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorEnd_hook)
 		(*ExecutorEnd_hook) (queryDesc);
 	else
@@ -541,6 +549,8 @@ ExecutorRewind(QueryDesc *queryDesc)
 	/* It's probably not sensible to rescan updating queries */
 	Assert(queryDesc->operation == CMD_SELECT);
 
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	/*
 	 * Switch into per-query memory context
 	 */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 8eedb613a1..1d8c859a88 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -24,6 +24,7 @@
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
+#include "storage/proc.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
@@ -1940,6 +1941,7 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan)

Re: [HACKERS] Regression tests vs existing users in an installation

2019-06-28 Thread Stephen Frost
Greetings,

* Tom Lane (t...@sss.pgh.pa.us) wrote:
> Furthermore, while you can do "make install" and "make installcheck"
> in this directory or its children, it is HIGHLY NOT ADVISED to do so
> with a server containing valuable data.  Some of these tests may have
> undesirable side-effects on roles or other global objects within the
> tested server.
> 
> Defining things this way also makes it a non-problem that
> src/test/modules/test_pg_dump creates global objects and doesn't drop
> them.

Sounds like a good approach to me and I'm happy that it'd address the
test_pg_dump case too.

> Now, this doesn't in itself fix the problem that my proposed patch will
> emit warnings about the rolenames test script creating "Public" and so on.
> We could fix that by maintaining a variant expected-file that includes
> those warnings, but probably a less painful answer is just to jack
> client_min_messages up to ERROR for that short segment of the test script.

Seems alright.

> We could make the new subdirectory be something specific like
> "src/test/modules/test_rolenames", but I think very likely we'll be
> wanting some additional test scripts that we likewise deem unsafe to
> run during "installcheck".  So I'd rather choose a more generic module
> name, but I'm not sure what ... "unsafe_tests"?

Agreed but haven't got any particularly good suggestions on names..

Thanks,

Stephen




Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Tom Lane
Daniel Gustafsson  writes:
>> On 28 Jun 2019, at 16:49, Luis Carril  wrote:
>> pg_dump ignores the dumping of data in foreign tables
>> on purpose; this patch makes it optional, as the user may
>> want to manage the data in the foreign servers directly from
>> Postgres. Opinions?

> Wouldn’t that have the potential to make restores awkward for FDWs that aren’t
> writeable?

Yeah, I think the feature as-proposed is a shotgun that's much more likely
to cause problems than solve them.  Almost certainly, what people would
really need is the ability to dump individual foreign tables' data not
everything.  (I also note that the main reason for "dump everything",
namely to get a guaranteed-consistent snapshot, isn't really valid for
foreign tables anyhow.)

I'm tempted to suggest that the way to approach this is to say that if you
explicitly select some foreign table(s) with "-t", then we'll dump their
data, unless you suppress that with "-s".  No new switch needed.
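
For illustration, under that rule the invocations would behave like this
(hypothetical behavior, not what pg_dump does today; the table name is made up):

    pg_dump -t public.remote_orders mydb      # would dump the foreign table's schema and data
    pg_dump -s -t public.remote_orders mydb   # schema only, as usual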

Another way of looking at it, which responds more directly to Daniel's
point about non-writable FDWs, could be to have a switch that says "dump
foreign tables' data if their FDW is one of these".

regards, tom lane




Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Pavel Stehule
pá 28. 6. 2019 v 17:17 odesílatel Daniel Gustafsson 
napsal:

> > On 28 Jun 2019, at 16:49, Luis Carril  wrote:
>
> >   pg_dump ignores the dumping of data in foreign tables
> >   on purpose; this patch makes it optional, as the user may
> >   want to manage the data in the foreign servers directly from
> >   Postgres. Opinions?
>
> Wouldn’t that have the potential to make restores awkward for FDWs that
> aren’t
> writeable?  Basically, how can the risk of foot-gunning be minimized to
> avoid
> users ending up with dumps that are hard to restore?
>

It can be used for migrations, porting, or testing (where FDW sources are not
accessible).

pg_dump doesn't have any safeguards against bad usage. But this feature makes
sense only if foreign tables are dumped as classic tables - so some special
option is necessary

Pavel

>
> cheers ./daniel
>
>


Re: subscriptionCheck failures on nightjar

2019-06-28 Thread Tom Lane
I wrote:
> My animal dromedary just reproduced this failure, which we've previously
> only seen on nightjar.
> https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2019-06-26%2023%3A57%3A45

Twice:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=dromedary&dt=2019-06-28%2006%3A50%3A41

Since this is a live (if old) system, not some weird qemu emulation,
we can no longer pretend that it might not be reachable in the field.
I've added an open item.

regards, tom lane




Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Pavel Stehule
Hi

pá 28. 6. 2019 v 16:50 odesílatel Luis Carril 
napsal:

> Hello,
> >   pg_dump ignores the dumping of data in foreign tables
> >   on purpose; this patch makes it optional, as the user may
> >   want to manage the data in the foreign servers directly from
> >   Postgres. Opinions?
>

It makes sense to me

Pavel

>
> Cheers
> Luis M Carril
>


Re: Option to dump foreign data in pg_dump

2019-06-28 Thread Daniel Gustafsson
> On 28 Jun 2019, at 16:49, Luis Carril  wrote:

>   pg_dump ignores the dumping of data in foreign tables
>   on purpose; this patch makes it optional, as the user may
>   want to manage the data in the foreign servers directly from
>   Postgres. Opinions?

Wouldn’t that have the potential to make restores awkward for FDWs that aren’t
writeable?  Basically, how can the risk of foot-gunning be minimized to avoid
users ending up with dumps that are hard to restore?

cheers ./daniel



proposal - patch: psql - sort_by_size

2019-06-28 Thread Pavel Stehule
Hi

I have returned to the possibility of sorting the output of \d* and \l by
size. There were more experiments in this area before, but without success.
The last patch was an example of over-engineering, so now I try to implement
this feature as simply as possible. I don't think we need an overly complex
solution - if somebody needs a specific report, it is not hard to run psql
with the "-E" option, grab and modify the query used (and use the power of
SQL). But displaying database objects sorted by size is a very common case.

This proposal is based on a new psql variable "SORT_BY_SIZE". This variable
is off by default. The value of this variable is used only in verbose mode
(when the size is displayed - I don't see any benefit in sorting by size
without showing the size). Usage is very simple, and so is the implementation:

\dt -- sorted by schema, name
\dt+ -- still sorted  by schema, name

\set SORT_BY_SIZE on
\dt -- sorted by schema, name (size is not calculated and is not visible)
\dt+ -- sorted by size

\dt+ public.* -- sorted by size from schema public

Comments, notes?

Regards

Pavel
diff --git a/doc/src/sgml/ref/psql-ref.sgml b/doc/src/sgml/ref/psql-ref.sgml
index c6c20de243..14fae1abd3 100644
--- a/doc/src/sgml/ref/psql-ref.sgml
+++ b/doc/src/sgml/ref/psql-ref.sgml
@@ -3959,6 +3959,17 @@ bar
 
   
 
+  
+SORT_BY_SIZE
+
+
+Setting this variable to on causes the results of
+\d* commands to be sorted by size, when the size
+is displayed.
+
+
+  
+
   
SQLSTATE

diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index 97167d2c4b..be149391a1 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -223,6 +223,7 @@ describeTablespaces(const char *pattern, bool verbose)
 	PQExpBufferData buf;
 	PGresult   *res;
 	printQueryOpt myopt = pset.popt;
+	const char *sizefunc = NULL;
 
 	if (pset.sversion < 80000)
 	{
@@ -265,9 +266,12 @@ describeTablespaces(const char *pattern, bool verbose)
 		  gettext_noop("Options"));
 
 	if (verbose && pset.sversion >= 90200)
+	{
 		appendPQExpBuffer(&buf,
 		  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_tablespace_size(oid)) AS \"%s\"",
 		  gettext_noop("Size"));
+		sizefunc = "pg_catalog.pg_tablespace_size(oid)";
+	}
 
 	if (verbose && pset.sversion >= 80200)
 		appendPQExpBuffer(,
@@ -281,7 +285,10 @@ describeTablespaces(const char *pattern, bool verbose)
 		  NULL, "spcname", NULL,
 		  NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (pset.sort_by_size && sizefunc)
+		appendPQExpBuffer(&buf, "ORDER BY %s DESC;", sizefunc);
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
@@ -863,6 +870,7 @@ listAllDbs(const char *pattern, bool verbose)
 	PGresult   *res;
 	PQExpBufferData buf;
 	printQueryOpt myopt = pset.popt;
+	const char *sizefunc = NULL;
 
 	initPQExpBuffer(&buf);
 
@@ -882,12 +890,15 @@ listAllDbs(const char *pattern, bool verbose)
 	appendPQExpBufferStr(&buf, "   ");
 	printACLColumn(&buf, "d.datacl");
 	if (verbose && pset.sversion >= 80200)
+	{
 		appendPQExpBuffer(&buf,
 		  ",\n   CASE WHEN pg_catalog.has_database_privilege(d.datname, 'CONNECT')\n"
 		  "THEN pg_catalog.pg_size_pretty(pg_catalog.pg_database_size(d.datname))\n"
 		  "ELSE 'No Access'\n"
 		  "   END as \"%s\"",
 		  gettext_noop("Size"));
+		sizefunc = "pg_catalog.pg_database_size(d.datname)";
+	}
 	if (verbose && pset.sversion >= 80000)
 		appendPQExpBuffer(&buf,
 		  ",\n   t.spcname as \"%s\"",
@@ -906,7 +917,10 @@ listAllDbs(const char *pattern, bool verbose)
 		processSQLNamePattern(pset.db, &buf, pattern, false, false,
 			  NULL, "d.datname", NULL, NULL);
 
-	appendPQExpBufferStr(&buf, "ORDER BY 1;");
+	if (pset.sort_by_size && sizefunc)
+		appendPQExpBuffer(&buf, "ORDER BY %s DESC;", sizefunc);
+	else
+		appendPQExpBufferStr(&buf, "ORDER BY 1;");
 	res = PSQLexec(buf.data);
 	termPQExpBuffer(&buf);
 	if (!res)
@@ -3628,6 +3642,7 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSystem)
 	bool		showMatViews = strchr(tabtypes, 'm') != NULL;
 	bool		showSeq = strchr(tabtypes, 's') != NULL;
 	bool		showForeign = strchr(tabtypes, 'E') != NULL;
+	const char *sizefunc = NULL;
 
 	PQExpBufferData buf;
 	PGresult   *res;
@@ -3685,13 +3700,19 @@ listTables(const char *tabtypes, const char *pattern, bool verbose, bool showSystem)
 		 * size of a table, including FSM, VM and TOAST tables.
 		 */
 		if (pset.sversion >= 90000)
+		{
 			appendPQExpBuffer(&buf,
 			  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_table_size(c.oid)) as \"%s\"",
 			  gettext_noop("Size"));
+			sizefunc = "pg_catalog.pg_table_size(c.oid)";
+		}
 		else if (pset.sversion >= 80100)
+		{
 			appendPQExpBuffer(&buf,
 			  ",\n  pg_catalog.pg_size_pretty(pg_catalog.pg_relation_size(c.oid)) as \"%s\"",
 			  gettext_noop("Size"));
+			sizefunc = "pg_catalog.pg_relation_size(c.oid)";
+		}
 
 		appendPQExpBuffer(&buf,
 	

Option to dump foreign data in pg_dump

2019-06-28 Thread Luis Carril
Hello,
  pg_dump ignores the dumping of data in foreign tables
  on purpose; this patch makes it optional, as the user may
  want to manage the data in the foreign servers directly from
  Postgres. Opinions?
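
For example, with the patch applied one could run (database name made up):

    pg_dump --include-foreign-data mydb > mydb.sql

and the dump would then contain COPY data for the foreign tables as well.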

Cheers
Luis M Carril
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index db30b54a92..a21027a61d 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -159,6 +159,7 @@ typedef struct _dumpOptions
 	int			use_setsessauth;
 	int			enable_row_security;
 	int			load_via_partition_root;
+	int			include_foreign_data;
 
 	/* default, if no "inclusion" switches appear, is to dump everything */
 	bool		include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8909a45d61..8854f70305 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -391,6 +391,7 @@ main(int argc, char **argv)
 		{"no-sync", no_argument, NULL, 7},
 		{"on-conflict-do-nothing", no_argument, _nothing, 1},
 		{"rows-per-insert", required_argument, NULL, 10},
+		{"include-foreign-data", no_argument, _foreign_data, 1},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -1038,6 +1039,7 @@ help(const char *progname)
 	printf(_("  --use-set-session-authorization\n"
 			 "   use SET SESSION AUTHORIZATION commands instead of\n"
 			 "   ALTER OWNER commands to set ownership\n"));
+	printf(_("  --include-foreign-data   include data of foreign tables in dump\n"));
 
 	printf(_("\nConnection options:\n"));
 	printf(_("  -d, --dbname=DBNAME  database to dump\n"));
@@ -1812,7 +1814,7 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 	 */
 	column_list = fmtCopyColumnList(tbinfo, clistBuf);
 
-	if (tdinfo->filtercond)
+	if (tdinfo->filtercond || tbinfo->relkind == RELKIND_FOREIGN_TABLE)
 	{
 		/* Note: this syntax is only supported in 8.2 and up */
 		appendPQExpBufferStr(q, "COPY (SELECT ");
@@ -1824,9 +1826,11 @@ dumpTableData_copy(Archive *fout, void *dcontext)
 		}
 		else
 			appendPQExpBufferStr(q, "* ");
-		appendPQExpBuffer(q, "FROM %s %s) TO stdout;",
-		  fmtQualifiedDumpable(tbinfo),
-		  tdinfo->filtercond);
+
+		appendPQExpBuffer(q, "FROM %s", fmtQualifiedDumpable(tbinfo));
+		if (tdinfo->filtercond)
+			appendPQExpBuffer(q, " %s", tdinfo->filtercond);
+		appendPQExpBuffer(q, ") TO stdout;");
 	}
 	else
 	{
@@ -2343,7 +2347,7 @@ makeTableDataInfo(DumpOptions *dopt, TableInfo *tbinfo)
 	if (tbinfo->relkind == RELKIND_VIEW)
 		return;
 	/* Skip FOREIGN TABLEs (no data to dump) */
-	if (tbinfo->relkind == RELKIND_FOREIGN_TABLE)
+	if (tbinfo->relkind == RELKIND_FOREIGN_TABLE && !dopt->include_foreign_data)
 		return;
 	/* Skip partitioned tables (data in partitions) */
 	if (tbinfo->relkind == RELKIND_PARTITIONED_TABLE)
diff --git a/src/bin/pg_dump/pg_dumpall.c b/src/bin/pg_dump/pg_dumpall.c
index 24719cefe2..14d62e9781 100644
--- a/src/bin/pg_dump/pg_dumpall.c
+++ b/src/bin/pg_dump/pg_dumpall.c
@@ -82,6 +82,7 @@ static int	no_role_passwords = 0;
 static int	server_version;
 static int	load_via_partition_root = 0;
 static int	on_conflict_do_nothing = 0;
+static int	include_foreign_data = 0;
 
 static char role_catalog[10];
 #define PG_AUTHID "pg_authid"
@@ -147,6 +148,7 @@ main(int argc, char *argv[])
 		{"no-unlogged-table-data", no_argument, _unlogged_table_data, 1},
 		{"on-conflict-do-nothing", no_argument, _conflict_do_nothing, 1},
 		{"rows-per-insert", required_argument, NULL, 7},
+		{"include-foreign-data", no_argument, _foreign_data, 1},
 
 		{NULL, 0, NULL, 0}
 	};
@@ -432,6 +434,8 @@ main(int argc, char *argv[])
 		appendPQExpBufferStr(pgdumpopts, " --no-unlogged-table-data");
 	if (on_conflict_do_nothing)
 		appendPQExpBufferStr(pgdumpopts, " --on-conflict-do-nothing");
+	if (include_foreign_data)
+		appendPQExpBufferStr(pgdumpopts, " --include-foreign-data");
 
 	/*
 	 * If there was a database specified on the command line, use that,
@@ -658,6 +662,7 @@ help(void)
 	printf(_("  --use-set-session-authorization\n"
 			 "   use SET SESSION AUTHORIZATION commands instead of\n"
 			 "   ALTER OWNER commands to set ownership\n"));
+	printf(_("  --include-foreign-data   include data of foreign tables in dump\n"));
 
 	printf(_("\nConnection options:\n"));
 	printf(_("  -d, --dbname=CONNSTR connect using connection string\n"));


Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?

2019-06-28 Thread Julien Rouhaud
On Tue, Mar 19, 2019 at 3:51 PM Julien Rouhaud  wrote:
>
> On Mon, Mar 18, 2019 at 7:33 PM Julien Rouhaud  wrote:
> >
> > On Mon, Mar 18, 2019 at 6:23 PM Yun Li  wrote:
> > >
> > > Let's take one step back. Since queryId is stored in core as Julien 
> > > pointed out, can we just add that global to the pg_stat_get_activity and 
> > > ultimately exposed in pg_stat_activity view? Then no matter whether PGSS  
> > > is on or off, or however the customer extensions are updating that filed, 
> > > we expose that field in that view then enable user to leverage that id to 
> > > join with pgss or their extension. Will this sounds a good idea?
> >
> > I'd greatly welcome queryid exposure in pg_stat_activity, and
> > also in log_line_prefix.  I'm afraid that it's too late for pg12
> > inclusion, but I'll be happy to provide a patch for that for pg13.
>
> Here's a prototype patch for queryid exposure in pg_stat_activity and
> log_line prefix.

Patch doesn't apply anymore, PFA rebased v2.
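
With the patch applied, the new column can be joined against
pg_stat_statements when both are configured, e.g. (just a sketch):

    SELECT a.pid, a.state, s.calls, s.total_time
    FROM pg_stat_activity a
    JOIN pg_stat_statements s ON s.queryid = a.queryid
    WHERE a.queryid <> 0;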
diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..d68b492c25 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -6353,6 +6353,11 @@ local0.*/var/log/postgresql
  session processes
  no
 
+
+ %Q
+ queryid: identifier of session's current query, if any
+ yes
+
 
  %%
  Literal %
diff --git a/doc/src/sgml/monitoring.sgml b/doc/src/sgml/monitoring.sgml
index bf72d0c303..d4e3d70933 100644
--- a/doc/src/sgml/monitoring.sgml
+++ b/doc/src/sgml/monitoring.sgml
@@ -824,6 +824,19 @@ postgres   27093  0.0  0.0  30096  2752 ?Ss   11:34   0:00 postgres: ser
  xid
  The current backend's xmin horizon.
 
+
+ queryid
+ bigint
+ Identifier of this backend's most recent query. If
+  state is active this field
+  shows the identifier of the currently executing query. In all other
+  states, it shows the identifier of the last query that was executed, unless
+  an error occurred, which will reset this field to 0.  By default, query
+  identifiers are not computed, so this field will always display 0, unless
+  an additional module that computes query identifiers, such as pg_stat_statements, is configured.
+ 
+
 
  query
  text
diff --git a/src/backend/catalog/system_views.sql b/src/backend/catalog/system_views.sql
index ea4c85e395..f30098c2cd 100644
--- a/src/backend/catalog/system_views.sql
+++ b/src/backend/catalog/system_views.sql
@@ -749,6 +749,7 @@ CREATE VIEW pg_stat_activity AS
 S.state,
 S.backend_xid,
 s.backend_xmin,
+S.queryid,
 S.query,
 S.backend_type
 FROM pg_stat_get_activity(NULL) AS S
diff --git a/src/backend/executor/execMain.c b/src/backend/executor/execMain.c
index 27f0345515..44c9525a59 100644
--- a/src/backend/executor/execMain.c
+++ b/src/backend/executor/execMain.c
@@ -143,6 +143,8 @@ static void EvalPlanQualStart(EPQState *epqstate, EState *parentestate,
 void
 ExecutorStart(QueryDesc *queryDesc, int eflags)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorStart_hook)
 		(*ExecutorStart_hook) (queryDesc, eflags);
 	else
@@ -303,6 +305,8 @@ ExecutorRun(QueryDesc *queryDesc,
 			ScanDirection direction, uint64 count,
 			bool execute_once)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorRun_hook)
 		(*ExecutorRun_hook) (queryDesc, direction, count, execute_once);
 	else
@@ -402,6 +406,8 @@ standard_ExecutorRun(QueryDesc *queryDesc,
 void
 ExecutorFinish(QueryDesc *queryDesc)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorFinish_hook)
 		(*ExecutorFinish_hook) (queryDesc);
 	else
@@ -462,6 +468,8 @@ standard_ExecutorFinish(QueryDesc *queryDesc)
 void
 ExecutorEnd(QueryDesc *queryDesc)
 {
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	if (ExecutorEnd_hook)
 		(*ExecutorEnd_hook) (queryDesc);
 	else
@@ -541,6 +549,8 @@ ExecutorRewind(QueryDesc *queryDesc)
 	/* It's probably not sensible to rescan updating queries */
 	Assert(queryDesc->operation == CMD_SELECT);
 
+	pg_atomic_write_u64(&MyProc->queryId, queryDesc->plannedstmt->queryId);
+
 	/*
 	 * Switch into per-query memory context
 	 */
diff --git a/src/backend/executor/spi.c b/src/backend/executor/spi.c
index 8eedb613a1..1d8c859a88 100644
--- a/src/backend/executor/spi.c
+++ b/src/backend/executor/spi.c
@@ -24,6 +24,7 @@
 #include "executor/executor.h"
 #include "executor/spi_priv.h"
 #include "miscadmin.h"
+#include "storage/proc.h"
 #include "tcop/pquery.h"
 #include "tcop/utility.h"
 #include "utils/builtins.h"
@@ -1940,6 +1941,7 @@ _SPI_prepare_plan(const char *src, SPIPlanPtr plan)
 	List	   *plancache_list;
 	ListCell   *list_item;
 	ErrorContextCallback spierrcontext;
+	uint64		old_queryId = pg_atomic_read_u64(&MyProc->queryId);
 
 	/*
 

Superfluous libpq-be.h include in GSSAPI code

2019-06-28 Thread Daniel Gustafsson
backend/libpq/be-secure-gssapi.c is including both libpq-be.h and libpq.h,
which makes libpq-be.h superfluous as it gets included via libpq.h.  The
attached patch removes the inclusion of libpq-be.h to make be-secure-gssapi.c
behave like other files which need both headers.

cheers ./daniel



gssapi_header.patch
Description: Binary data


Re: Avoid full GIN index scan when possible

2019-06-28 Thread Julien Rouhaud
On Sun, Mar 24, 2019 at 11:52 AM Julien Rouhaud  wrote:
>
> Marc (in Cc) reported to me a problematic query using a GIN index hit in
> production.  The issue is that even if a GIN opclass says that the
> index can be used for an operator, it's still possible that some
> values aren't really compatible and require a full index scan.
>
> One simple example is with a GIN pg_trgm index (but other opclasses
> have similar restrictions) , doing a LIKE with wildcard on both side,
> where the pattern is shorter than a trigram, e.g. col LIKE '%a%'.  So,
> a where clause of the form:
>
> WHERE col LIKE '%verylongpattern%' AND col LIKE '%a%'
>
> is much more expensive than
>
> WHERE col LKE '%verylongpattern%'
>
> While there's nothing to do if the unhandled const is the only
> predicate, if there are multiple AND-ed predicates and at least one of
> them doesn't require a full index scan, we can avoid it.
>
> Attached patch tries to fix the issue by detecting such cases and
> dropping the unhandled quals in the BitmapIndexScan, letting the
> recheck in BitmapHeapScan do the proper filtering.  I'm not happy to
> call the extractQuery support functions an additional time, but I
> didn't find a cleaner way.  This is of course intended for pg13.

Patch doesn't apply anymore (thanks cfbot).  Rebased patch attached.
diff --git a/contrib/pg_trgm/expected/pg_trgm.out b/contrib/pg_trgm/expected/pg_trgm.out
index b3e709f496..6d96fca65f 100644
--- a/contrib/pg_trgm/expected/pg_trgm.out
+++ b/contrib/pg_trgm/expected/pg_trgm.out
@@ -3498,6 +3498,37 @@ select count(*) from test_trgm where t ~ '[qwerty]{2}-?[qwerty]{2}';
   1000
 (1 row)
 
+explain (costs off)
+  select * from test_trgm where t like '%0%' and t like '%azerty%';
+ QUERY PLAN  
+---------------------------------------------
+ Bitmap Heap Scan on test_trgm
+   Recheck Cond: (t ~~ '%azerty%'::text)
+   Filter: (t ~~ '%0%'::text)
+   ->  Bitmap Index Scan on trgm_idx
+ Index Cond: (t ~~ '%azerty%'::text)
+(5 rows)
+
+select * from test_trgm where t like '%0%' and t like '%azerty%';
+ t 
+---
+(0 rows)
+
+explain (costs off)
+  select * from test_trgm where t like '%0%' and t like '%az%';
+QUERY PLAN
+--------------------------------------------------------------
+ Bitmap Heap Scan on test_trgm
+   Recheck Cond: ((t ~~ '%0%'::text) AND (t ~~ '%az%'::text))
+   ->  Bitmap Index Scan on trgm_idx
+ Index Cond: ((t ~~ '%0%'::text) AND (t ~~ '%az%'::text))
+(4 rows)
+
+select * from test_trgm where t like '%0%' and t like '%az%';
+ t 
+---
+(0 rows)
+
 create table test2(t text COLLATE "C");
 insert into test2 values ('abcdef');
 insert into test2 values ('quark');
diff --git a/contrib/pg_trgm/sql/pg_trgm.sql b/contrib/pg_trgm/sql/pg_trgm.sql
index 08459e64c3..9cdb6fda14 100644
--- a/contrib/pg_trgm/sql/pg_trgm.sql
+++ b/contrib/pg_trgm/sql/pg_trgm.sql
@@ -55,6 +55,13 @@ select t,similarity(t,'gwertyu0988') as sml from test_trgm where t % 'gwertyu098
 select t,similarity(t,'gwertyu1988') as sml from test_trgm where t % 'gwertyu1988' order by sml desc, t;
 select count(*) from test_trgm where t ~ '[qwerty]{2}-?[qwerty]{2}';
 
+explain (costs off)
+  select * from test_trgm where t like '%0%' and t like '%azerty%';
+select * from test_trgm where t like '%0%' and t like '%azerty%';
+explain (costs off)
+  select * from test_trgm where t like '%0%' and t like '%az%';
+select * from test_trgm where t like '%0%' and t like '%az%';
+
 create table test2(t text COLLATE "C");
 insert into test2 values ('abcdef');
 insert into test2 values ('quark');
diff --git a/src/backend/optimizer/path/indxpath.c b/src/backend/optimizer/path/indxpath.c
index c208e9bfb0..82fdf44905 100644
--- a/src/backend/optimizer/path/indxpath.c
+++ b/src/backend/optimizer/path/indxpath.c
@@ -33,6 +33,7 @@
 #include "optimizer/prep.h"
 #include "optimizer/restrictinfo.h"
 #include "utils/lsyscache.h"
+#include "utils/index_selfuncs.h"
 #include "utils/selfuncs.h"
 
 
@@ -973,6 +974,24 @@ build_index_paths(PlannerInfo *root, RelOptInfo *rel,
 			return NIL;
 	}
 
+	/*
+	 * GIN access method can require a full index scan.  However, if there are
+	 * multiple AND-ed quals and at least one of them doesn't require a full
+	 * index scan, we can avoid the full scan by dropping all the quals
+	 * requiring it and let the recheck do the proper filtering.
+	 */
+	if (index_clauses != NIL && list_length(index_clauses) > 1 &&
+			index->relam == GIN_AM_OID)
+	{
+		Relids old_outer_relids = bms_copy(outer_relids);
+
+		bms_free(outer_relids);
+		outer_relids = bms_copy(rel->lateral_relids);
+
+		index_clauses = gin_get_optimizable_quals(root, index, index_clauses,
+												  &outer_relids, old_outer_relids);
+	}
+
 	/* We do not want the index's rel itself listed in outer_relids */
 	outer_relids = bms_del_member(outer_relids, rel->relid);
 	/* Enforce convention that outer_relids is exactly NULL if empty */
diff --git 

Re: Conflict handling for COPY FROM

2019-06-28 Thread Alvaro Herrera
On 2019-Jun-28, Surafel Temesgen wrote:

> On Wed, Feb 20, 2019 at 7:04 PM Andres Freund  wrote:

> > On February 20, 2019 6:05:53 AM PST, Andrew Dunstan <
> > andrew.duns...@2ndquadrant.com> wrote:

> > >Why log to a file at all? We do have, you know, a database handy, where
> > >we might more usefully log errors. You could usefully log the offending
> > >row as an array of text, possibly.
> >
> > Or even just return it as a row. CopyBoth is relatively widely supported
> > these days.
>
> I think generating a warning about it also sufficiently meets its purpose of
> notifying the user about skipped records with the existing logging facility,
> and we use it for a similar purpose in other places too. The difference
> I see is the number of warnings that can be generated

Warnings seem useless for this purpose.  I'm with Andres: returning rows
would make this a fine feature.  If the user wants the rows in a table
as Andrew suggests, she can wrap the whole thing in an insert.

That would make the feature much more usable because you can do further
processing with the rows that conflict, if any is necessary (or throw
them away if not).  Putting them in warnings will just make the screen
scroll fast.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SQL/JSON path issues/questions

2019-06-28 Thread Alexander Korotkov
On Fri, Jun 28, 2019 at 9:01 AM Oleg Bartunov  wrote:
> On Fri, Jun 28, 2019 at 8:10 AM Alvaro Herrera  
> wrote:
> >
> > On 2019-Jun-28, Alexander Korotkov wrote:
> >
> > > On Tue, Jun 25, 2019 at 6:38 PM Liudmila Mantrova
> > >  wrote:
> > > > Thank you for the catch! Please see the modified version of patch 0004
> > > > attached.
> > >
> > > I tried to review and revise the part related to filters, but I failed
> > > because I don't understand the notions used in the documentation.
> > >
> > > What is the difference between filter expression and filter condition?
> > >  I can guess that filter expression contains question mark,
> > > parentheses and filter condition inside.  But this sentence is in
> > > contradiction with my guess: "A filter expression must be enclosed in
> > > parentheses and preceded by a question mark".  So, filter expression
> > > is inside the parentheses.  Then what is filter condition?  The same?
> >
> > The SQL standard defines "JSON filter expressions" (in 9.39 of the 2016
> > edition).  It does not use either term "filter condition" nor bare
> > "filter"; it uses "JSON path predicate" which is the part of the JSON
> > filter expression that is preceded by the question mark and enclosed by
> > parens.
>
> Yes, this is what I used in my talk
> http://www.sai.msu.su/~megera/postgres/talks/jsonpath-ibiza-2019.pdf
>
> >
> > Maybe we should stick with the standard terminology ...
>
> Sure.

+1

> As for the jsonpath documentation, I think we should remember that
> jsonpath is a part of SQL/JSON, and in the following releases we will
> expand the documentation to include SQL/JSON functions, so I suggest
> having one chapter on SQL/JSON with the following structure:
> 1.  Introduction
> 1.1 SQL/JSON data model
> 1.2 SQL/JSON path language
> 1.3  -- to be added
> 2. PostgreSQL implementation
> 2.1  jsonpath data type   -- link from json data types
> 2.2 jsonpath  functions and operators -- link from functions
> 2.3 Indexing
>
> I plan to work on a separate chapter "JSON handling in PostgreSQL" for
> PG13,  which includes
> JSON(b) data types, functions, indexing and SQL/JSON.

It would be great if you manage to do this.

--
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company




Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions

2019-06-28 Thread Amit Kapila
On Tue, Jun 25, 2019 at 12:42 AM Stephen Frost  wrote:
>
> Greetings,
>
> * Tom Lane (t...@sss.pgh.pa.us) wrote:
> > Robert Haas  writes:
> > > On Fri, Jun 21, 2019 at 12:55 PM Tom Lane  wrote:
> > >> Ah, got it.  So it seems like the correct behavior might be for
> > >> ALTER SYSTEM to
> > >> (a) run through the whole file and remove any conflicting lines;
> > >> (b) append new setting at the end.
> >
> > > That is exactly the behavior for which I am arguing.  Stephen also
> > > wants a warning, but I disagree, because the warning is totally
> > > non-actionable.  It tells you that some tool, at some point in the
> > > past, did something bad. You can't do anything about that, and you
> > > wouldn't need to except for the arbitrary decision to label duplicate
> > > lines as bad in the first place.
> >
> > Agreed; there's no particular reason to consider the situation as wrong.
> > guc.c has always had the policy that dups are fine and the last one wins.
> > The very design of ALTER SYSTEM owes its workability to that policy, so
> > we can hardly say that A.S. should have a different policy internally.
> >

Both are similar, but I am not sure if they are the same, because in
A.S. we are planning to remove the duplicate entries from the file,
whereas I think in other places that rule is used to just ignore the
duplicates and allow the last one to win.  Now, I think there is merit
in giving a WARNING in this case, as we are intentionally removing
something which the user has added.  However, it is not clear what the
user is going to do with that WARNING unless we have a system where we
detect such a situation, give the WARNING, and then allow the user to
proceed with some option like FORCE.
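
For reference, the fix being discussed (remove every conflicting line,
then append the new setting) amounts to something like the sketch below.
This is not the actual guc.c code, and line_sets_guc() is a made-up helper:

static List *
replace_auto_conf_line(List *lines, const char *name, const char *value)
{
	List	   *result = NIL;
	ListCell   *lc;

	/* (a) keep everything except lines that set "name" */
	foreach(lc, lines)
	{
		char	   *line = (char *) lfirst(lc);

		if (!line_sets_guc(line, name))	/* made-up helper */
			result = lappend(result, line);
	}

	/* (b) append the new setting at the end, so last-one-wins applies */
	result = lappend(result, psprintf("%s = '%s'", name, value));

	return result;
}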

> > The problem here is simply that ALTER SYSTEM is failing to consider the
> > possibility that there are dups in postgresql.auto.conf, and that seems
> > like little more than an oversight to be fixed.
> >
> > There's more than one way we could implement a fix, perhaps, but I don't
> > really see a reason to work harder than is sketched above.
>
> Why bother removing the duplicate lines?
>
> If ALTER SYSTEM should remove them, why shouldn't other tools?
>
> > (BTW, has anyone checked whether ALTER SYSTEM RESET is prepared to remove
> > multiple lines for the same var?)
>
> No, it doesn't handle that today either, as discussed earlier in this
> thread.
>

Right, it doesn't handle that today, but I think we can deal with it
along with Alter System Set ...

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Stop ALTER SYSTEM from making bad assumptions

2019-06-28 Thread Amit Kapila
On Tue, Jun 25, 2019 at 7:31 AM Ian Barwick  wrote:
>
>  > In particular, in order to consider it unexpected, you have to suppose
>  >> that the content rules for postgresql.auto.conf are different from those
>  >> for postgresql.conf (wherein we clearly allow last-one-wins).  Can you
>  >> point to any user-facing documentation that says that?
>  >
>  > The backend and frontend tools don't modify postgresql.conf, and we
>  > don't document how to modify postgresql.auto.conf at *all*, even though
>  > we clearly now expect tool authors to go modifying it so that they can
>  > provide the same capabilities that pg_basebackup does and which they
>  > used to through recovery.conf, so I don't really see that as being
>  > comparable.
>  >
>  > The only thing we used to have to go on was what ALTER SYSTEM did, and
>  > then pg_basebackup went and did something different, and enough so that
>  > they ended up conflicting with each other, leading to this discussion.
>
> Or looking at it from another perspective - previously there was no
> particular use-case for appending to .auto.conf, until it (implicitly)
> became the only way of doing what recovery.conf used to do, and happened to
> expose the issue at hand.
>
> Leaving aside pg_basebackup and the whole issue of writing replication
> configuration, .auto.conf remains a text file which could potentially
> include duplicate entries, no matter how much we stipulate it shouldn't.
> As-is, ALTER SYSTEM fails to deal with this case, which in my opinion
> is a bug and a potential footgun which needs fixing.
>

I think there is an agreement that we should change it to remove
duplicates and add the new entry at the end.  However, we have not
reached an agreement on whether we should throw WARNING after removing
duplicates.

I think it is arguable whether it was a bug in Alter System in the
first place, as there is no way duplicate lines could appear in the
postgresql.auto.conf file before this feature, unless someone ignored
the warning at the top of that file.  Having said that, I am in favor
of this change for HEAD, but not sure if we should backpatch it as
well by considering it a bug-fix.

> (Though we'll still need to define/provide a way of writing configuration
> while the server is not running, which will be guaranteed to be read in last
> when it starts up).
>

Can you verify whether the current way of writing to
postgresql.auto.conf is safe in pg_basebackup?  It should ensure that
if there are any failures or partial-write problems while writing,
the old file remains intact.  It is not clear to me if that is the
case with the current code of pg_basebackup; however, the same is
ensured in the Alter System code.  If we haven't ensured it, then
it is a problem for which we definitely need a fix.
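
For reference, the usual way to get that guarantee is to write a temporary
file, fsync it, and rename() it into place, roughly like the sketch below
(error handling elided; not the actual pg_basebackup code):

#include <stdio.h>
#include <string.h>
#include <unistd.h>

static void
write_file_atomically(const char *path, const char *contents)
{
	char		tmppath[1024];
	FILE	   *fp;

	snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);

	fp = fopen(tmppath, "w");
	fwrite(contents, 1, strlen(contents), fp);
	fflush(fp);
	fsync(fileno(fp));			/* make sure the new contents reach disk */
	fclose(fp);

	/* rename() is atomic, so a crash leaves either the old or the new file */
	/* (a full version would also fsync the containing directory) */
	rename(tmppath, path);
}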

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: Implementing Incremental View Maintenance

2019-06-28 Thread Yugo Nagata
Hi Greg,

On Wed, 3 Apr 2019 17:41:36 -0400
Greg Stark  wrote:

> On Sun, 31 Mar 2019 at 23:22, Yugo Nagata  wrote:
> >
> > Firstly, this will handle simple definition views which includes only
> > selection, projection, and join.  Standard aggregations (count, sum, avg,
> > min, max) are not planned to be implemented in the first patch, but these
> > are commonly used in materialized views, so I'll implement them later on.
> 
> It's fine to not have all the features from day 1 of course. But I
> just picked up this comment and the followup talking about splitting
> AVG into SUM and COUNT and I had a comment. When you do look at
> tackling aggregates I don't think you should restrict yourself to
> these specific standard aggregations. We have all the necessary
> abstractions to handle all aggregations that are feasible, see
> https://www.postgresql.org/docs/devel/xaggr.html#XAGGR-MOVING-AGGREGATES
> 
> What you need to do -- I think -- is store the "moving aggregate
> state" before the final function. Then whenever a row is inserted or
> deleted or updated (or whenever another column is updated which causes
> the row to enter or leave the aggregation) apply either
> aggtransfn or aggminvtransfn to the state. I'm not sure if you want to
> apply the final function on every update or only lazily; either may be
> better in some usage.

Thank you for your suggestion! I submitted the latest patch just now supporting
some aggregate functions, but it supports only sum and count, and lacks
generalization. However, I would like to refine this to support more
general aggregate functions. I think your suggestions are helpful for me to do
this. Thank you!

Best regards,
Yugo Nagata


-- 
Yugo Nagata 




Re: Conflict handling for COPY FROM

2019-06-28 Thread Surafel Temesgen
On Wed, Feb 20, 2019 at 7:04 PM Andres Freund  wrote:

>
>
> On February 20, 2019 6:05:53 AM PST, Andrew Dunstan <
> andrew.duns...@2ndquadrant.com> wrote:
> >
> >On 2/20/19 8:01 AM, Surafel Temesgen wrote:
> >>
> >>
> >> On Tue, Feb 19, 2019 at 3:47 PM Andres Freund  >> > wrote:
> >>
> >>
> >>
> >> Err, what? Again, that requires super user permissions (in
> >> contrast to copy from/to stdin/out). Backends run as the user
> >> postgres runs under
> >>
> >>
> >>
> >> okay i see it now and modified the patch similarly
> >>
> >>
> >
> >
> >Why log to a file at all? We do have, you know, a database handy, where
> >we might more usefully log errors. You could usefully log the offending
> >row as an array of text, possibly.
>
> Or even just return it as a row. CopyBoth is relatively widely supported
> these days.
>
>
Hello,
I think generating a warning about it also sufficiently meets its purpose of
notifying the user about skipped records with the existing logging facility,
and we use it for a similar purpose in other places too. The difference
I see is the number of warnings that can be generated

In addition to the above change, in the attached patch I also changed
the syntax to ERROR LIMIT because it no longer only skips
unique and exclusion constraint violations.
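
For example, with the patch the option is used like this (file path made up):

    COPY pgbench_accounts FROM '/tmp/accounts.csv'
        WITH (FORMAT csv, ERROR_LIMIT 5);

Up to 5 records that hit a unique or exclusion constraint violation, or a
formatting error, are then skipped, each reported with a WARNING.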
regards
Surafel
diff --git a/doc/src/sgml/ref/copy.sgml b/doc/src/sgml/ref/copy.sgml
index 5e2992ddac..dc3b943279 100644
--- a/doc/src/sgml/ref/copy.sgml
+++ b/doc/src/sgml/ref/copy.sgml
@@ -44,6 +44,7 @@ COPY { table_name [ ( column_name [, ...] )
 FORCE_NULL ( column_name [, ...] )
 ENCODING 'encoding_name'
+ERROR_LIMIT 'limit_number'
 
  
 
@@ -355,6 +356,21 @@ COPY { table_name [ ( 

 
+   
+ERROR_LIMIT
+
+ 
+  Specifies to ignore erroneous records, up to limit_number of them.
+ 
+
+   
+
+   
+Currently, only unique or exclusion constraint violations
+and record formatting errors are ignored.
+   
+

 WHERE
 
diff --git a/src/backend/commands/copy.c b/src/backend/commands/copy.c
index f1161f0fee..05a5f29d4c 100644
--- a/src/backend/commands/copy.c
+++ b/src/backend/commands/copy.c
@@ -48,6 +48,7 @@
 #include "port/pg_bswap.h"
 #include "rewrite/rewriteHandler.h"
 #include "storage/fd.h"
+#include "storage/lmgr.h"
 #include "tcop/tcopprot.h"
 #include "utils/builtins.h"
 #include "utils/lsyscache.h"
@@ -154,6 +155,7 @@ typedef struct CopyStateData
 	List	   *convert_select; /* list of column names (can be NIL) */
 	bool	   *convert_select_flags;	/* per-column CSV/TEXT CS flags */
 	Node	   *whereClause;	/* WHERE condition (or NULL) */
+	int			error_limit;	/* total number of error to ignore */
 
 	/* these are just for error messages, see CopyFromErrorCallback */
 	const char *cur_relname;	/* table name for error messages */
@@ -1291,6 +1293,21 @@ ProcessCopyOptions(ParseState *pstate,
 defel->defname),
 		 parser_errposition(pstate, defel->location)));
 		}
+		else if (strcmp(defel->defname, "error_limit") == 0)
+		{
+			if (cstate->error_limit > 0)
+ereport(ERROR,
+		(errcode(ERRCODE_SYNTAX_ERROR),
+		 errmsg("conflicting or redundant options"),
+		 parser_errposition(pstate, defel->location)));
+			cstate->error_limit = defGetInt64(defel);
+			if (cstate->error_limit <= 0)
+ereport(ERROR,
+		(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+		 errmsg("argument to option \"%s\" must be positive integer",
+defel->defname),
+		 parser_errposition(pstate, defel->location)));
+		}
 		else
 			ereport(ERROR,
 	(errcode(ERRCODE_SYNTAX_ERROR),
@@ -1441,6 +1458,10 @@ ProcessCopyOptions(ParseState *pstate,
 		ereport(ERROR,
 (errcode(ERRCODE_FEATURE_NOT_SUPPORTED),
  errmsg("CSV quote character must not appear in the NULL specification")));
+	if (cstate->error_limit && !cstate->is_copy_from)
+		ereport(ERROR,
+(errcode(ERRCODE_SYNTAX_ERROR),
+ errmsg("ERROR LIMIT only available using COPY FROM")));
 }
 
 /*
@@ -2837,7 +2858,10 @@ CopyFrom(CopyState cstate)
 	/* Verify the named relation is a valid target for INSERT */
 	CheckValidResultRel(resultRelInfo, CMD_INSERT);
 
-	ExecOpenIndices(resultRelInfo, false);
+	if (cstate->error_limit)
+		ExecOpenIndices(resultRelInfo, true);
+	else
+		ExecOpenIndices(resultRelInfo, false);
 
 	estate->es_result_relations = resultRelInfo;
 	estate->es_num_result_relations = 1;
@@ -2942,6 +2966,13 @@ CopyFrom(CopyState cstate)
 		 */
 		insertMethod = CIM_SINGLE;
 	}
+	else if (cstate->error_limit)
+	{
+		/*
+		 * Can't support speculative insertion in multi-inserts.
+		 */
+		insertMethod = CIM_SINGLE;
+	}
 	else
 	{
 		/*
@@ -3285,13 +3316,79 @@ CopyFrom(CopyState cstate)
 		 */
 		myslot->tts_tableOid = RelationGetRelid(resultRelInfo->ri_RelationDesc);
 	}
+	else if (cstate->error_limit && resultRelInfo->ri_NumIndices > 0)
+	{
+		/* Perform a speculative insertion. */
+		uint32		specToken;
+		ItemPointerData conflictTid;
+		bool		specConflict;

Re: Implementing Incremental View Maintenance

2019-06-28 Thread Yugo Nagata
Hi,

Attached is a WIP patch of IVM which supports some aggregate functions.

Currently, only count and sum are supported. Avg, min, and max are not supported,
although I think supporting them would not be so hard.

As a restriction, expressions specified in GROUP BY must appear in the target
list of views because tuples to be updated in the MV are identified by using
these group keys.


In the case of views without aggregate functions, only the number of tuple
duplicates (__ivm_count__) is updated at incremental maintenance. On the other
hand, in the case of views with aggregations, the aggregated values are also
updated. The way of updating depends on the kind of aggregate function.

In the case of sum (and aggregate functions other than count), NULL input
values are ignored, and the function returns a null value when no rows are
selected. To support this specification, the number of non-NULL input values
is counted and stored in the MV as a hidden column whose name is like
__ivm_count_sum__, for example.

In the case of count, it returns zero when no rows are selected, and count(*)
doesn't ignore NULL input. These specifications are also supported.

Tuples to be updated in the MV are identified by using the keys specified by
the GROUP BY clause. However, in the case of aggregation without GROUP BY,
there is only one tuple in the view, so keys are not used to identify tuples.
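
To illustrate, for a view like the one in example (2) below (SELECT i, SUM(j),
COUNT(i) FROM mv_base_a GROUP BY i), applying an insert delta conceptually
boils down to something like this simplified sketch ("new_delta" stands for
the transition table of newly inserted rows; the real generated SQL also has
to handle group keys that are not yet present in the view):

UPDATE mv_ivm_agg v
SET sum   = v.sum + d.sum,
    count = v.count + d.count,
    __ivm_count__ = v.__ivm_count__ + d.cnt
FROM (SELECT i, SUM(j) AS sum, COUNT(i) AS count, COUNT(*) AS cnt
      FROM new_delta GROUP BY i) d
WHERE v.i = d.i;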


In addition, a race condition which occurred in the previous version is
prevented in this patch. In the previous version, when two transactions
changed a base table concurrently, an anomalous update of the MV was possible
because a change in one transaction was not visible to another transaction
even in READ COMMITTED level.

To prevent this, I fixed this to take a lock in an early stage of view
maintenance to wait for concurrent transactions which are updating the same MV
to end. Also, we have to get the latest snapshot before computing delta tables
because any changes which occur in other transactions during the lock wait are
not visible even in READ COMMITTED level.

In REPEATABLE READ or SERIALIZABLE level, we don't wait for a lock, and raise
an error immediately to prevent an anomalous update. These solutions might be
ugly, but something to prevent anomalous updates is necessary anyway. There
may be a better way.


Moreover, some regression tests are added for aggregate function support.
This is Hoshiai-san's work.

Although the code is not refined yet and will need a good deal of refactoring
and reorganizing, I submitted this to share the current status.


* Example (from regression test)

===
(1) creating tables

CREATE TABLE mv_base_a (i int, j int);
INSERT INTO mv_base_a VALUES
  (1,10),
  (2,20),
  (3,30),
  (4,40),
  (5,50);


(2) Views with SUM() and COUNT() aggregation functions

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_agg AS SELECT i, SUM(j), COUNT(i)  
FROM mv_base_a GROUP BY i;
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 |  20 | 1
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

INSERT INTO mv_base_a VALUES(2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 | 120 | 2
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

UPDATE mv_base_a SET j = 200 WHERE (i,j) = (2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 | 220 | 2
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

DELETE FROM mv_base_a WHERE (i,j) = (2,200);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 |  20 | 1
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

ROLLBACK;


(3) Views with COUNT(*) aggregation function

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_agg AS SELECT i, SUM(j),COUNT(*)  
FROM mv_base_a GROUP BY i;
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 |  20 | 1
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

INSERT INTO mv_base_a VALUES(2,100);
SELECT * FROM mv_ivm_agg ORDER BY 1,2,3;
 i | sum | count
---+-+---
 1 |  10 | 1
 2 | 120 | 2
 3 |  30 | 1
 4 |  40 | 1
 5 |  50 | 1
(5 rows)

ROLLBACK;

(4) Views with aggregation function without GROUP clause

BEGIN;
CREATE INCREMENTAL MATERIALIZED VIEW mv_ivm_group AS SELECT SUM(j) FROM
mv_base_a;
SELECT * FROM mv_ivm_group ORDER BY 1;
 sum
-
 150
(1 row)

INSERT INTO mv_base_a VALUES(6,20);
SELECT * FROM mv_ivm_group ORDER BY 1;
 sum
-
 170
(1 row)
===



On Thu, 20 Jun 2019 16:44:10 +0900
Yugo Nagata  wrote:

> Hi hackers,
> 
> Thank you for your many questions and feedbacks at PGCon 2019.
> Attached is the patch rebased for the current master branch.
> 
> Regards,
> Yugo Nagata
> 
> On Tue, 14 May 2019 15:46:48 +0900
> Yugo 

Re: postgres_fdw: Minor improvement to postgresAcquireSampleRowsFunc

2019-06-28 Thread Etsuro Fujita
Hi Julien,

On Fri, Jun 28, 2019 at 6:54 PM Julien Rouhaud  wrote:
> On Fri, Jun 28, 2019 at 11:39 AM Etsuro Fujita  
> wrote:
> > In postgresAcquireSampleRowsFunc, we 1) determine the fetch size and
> > then 2) construct the fetch command in each iteration of fetching some
> > rows from the remote, but that would be totally redundant.
>
> Indeed.
>
> > Attached
> > is a patch for removing that redundancy.
>
> It all looks good to me!  I marked it as ready for committer.

Cool!  I'll commit the patch if there are no objections.  Thanks for reviewing!

Best regards,
Etsuro Fujita




Re: Implementing Incremental View Maintenance

2019-06-28 Thread Yugo Nagata
Hi Jim,

On Fri, 21 Jun 2019 08:41:11 -0700 (MST)
Jim Finnerty  wrote:

> Hi Yugo,
> 
> I'd like to compare the performance of your MV refresh algorithm versus
> an approach that logs changes into an mv log table, and can then apply the
> changes at some later point in time.  I'd like to handle the materialized
> join view (mjv) case first, specifically a 2-way left outer join, with a UDF
> in the SELECT list of the mjv.

Do you mean you have your own implementation of IVM that uses log tables?
I'm very interested in this, and I would appreciate it if you could explain
the details.
 
> Does your refresh algorithm handle mjv's with connected join graphs that
> consist entirely of inner and left outer joins?

> If so, I'd like to measure the overhead of your refresh algorithm on
> pgbench, modified to include an mjv, versus a (hand coded) incremental
> maintenance algorithm that uses mv log tables populated by ordinary
> triggers.  We may also want to look at capturing the deltas using logical
> replication, which ought to be faster than a trigger-based solution. 

In our current implementation, outer joins are not yet supported, though
we plan to handle this in the future. So, we would not be able to compare
these directly on the same workload in the current status.

However, our current implementation only supports updating materialized views
in a trigger, and the performance of modifying base tables will be lower than
with the approach which uses log tables. This is because queries to update
materialized views are issued in the trigger. This is not only an overhead
itself, but also takes a lock on the materialized view, which has an
impact on concurrent execution performance.

In our previous PoC, we implemented IVM using log tables, in which logs are
captured by triggers and materialized views are updated incrementally by a
user command [1]. However, to implement the log table approach, we need an
infrastructure to maintain these logs: for example, which logs are necessary,
which logs can be discarded, etc. We thought this is not trivial work, so we
decided to start from the current approach which doesn't use log tables. We
are now preparing to implement this in the next step because this is also
needed to support deferred maintenance of views.

[1] 
https://www.postgresql.eu/events/pgconfeu2018/schedule/session/2195-implementing-incremental-view-maintenance-on-postgresql/

I agree that capturing the deltas using logical decoding will be faster than
using a trigger, although we haven't considered this well yet.

Best regards,
Yugo Nagata


-- 
Yugo Nagata 




Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread Peter Eisentraut
On 2019-06-28 09:58, Sergei Kornilov wrote:
> Is this commitfest for small patches and bugfixes, similar to 2018-07 one in 
> last year?

There are no restrictions about what can be submitted to this commit
fest.  Review early and review often!

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: C testing for Postgres

2019-06-28 Thread Dmitry Dolgov
> On Fri, Jun 28, 2019 at 11:38 AM Adam Berlin  wrote:
>
> During the unconference at PGCon in Ottawa, I asked about writing C-based
> tests for Postgres. There was interest in trying a tool and also some
> hesitation to depend on a third-party library. So, I wrote a tool that I'd
> like to contribute to Postgres. I’ve been calling it cexpect [1].

Cool, thanks!

> Rather than post a patch, I'd rather start a conversation first. I'm guessing
> there are some improvements that we'd want to make (for example: the
> Makefile) before committing a patch. Let's iterate on improvements before
> creating a formal patch.

Just to mention, there were similar discussions already in the past ([1], [2]),
with some concerns being raised, but looks like without any visible results.

[1]: 
https://www.postgresql.org/message-id/flat/CAEepm%3D2heu%2B5zwB65jWap3XY-UP6PpJZiKLQRSV2UQH9BmVRXQ%40mail.gmail.com
[2]: 
https://www.postgresql.org/message-id/flat/Pine.LNX.4.58.0410111044030.14840%40linuxworld.com.au




Re: POC: Cleaning up orphaned files using undo logs

2019-06-28 Thread Amit Kapila
On Sat, Jun 22, 2019 at 2:51 AM Robert Haas  wrote:
>
> On Thu, Jun 20, 2019 at 4:54 AM Dilip Kumar  wrote:
> > The idea behind having the loop inside the undo machinery was that
> > while traversing the blkprev chain, we can read all the undo records
> > on the same undo page under one buffer lock.
>
> That's not a bad goal, although invoking a user-supplied callback
> while holding a buffer lock is a little scary.  If we stick with that,
> it had better be clearly documented.  Perhaps worth noting:
> ReadBuffer() is noticeably more expensive in terms of CPU consumption
> than LockBuffer().  So it may work out that we keep a pin to avoid
> redoing that and worry less about retaking the buffer locks.  But I'm
> not sure: avoiding buffer locks is clearly good, too.
>
> I have a couple of observations after poking at this some more.  One
> is that it's not necessarily adequate to have an interface that
> iterates backward through the undo chain until the callback returns
> true.  Here's an example: suppose you want to walk backward through
> the undo chain until you find the first undo record that corresponds
> to a change you can see, and then return the undo record immediately
> prior to that.  zheap doesn't really need this, because it (mostly?)
> stores the XID of the next record we're going to look up in the
> current record, and the XID of the first record we're going to look up
> in the chain -- so it can tell from the record it's found so far
> whether it should bother looking up the next record. That, however,
> would not necessarily be true for some other AM.
>
> Suppose you just store an undo pointer in the tuple header, as Heikki
> proposed to do for zedstore.  Suppose further that each record has the
> undo pointer for the previous record that modified that TID but not
> necessarily the XID.  Then imagine a TID where we do an insert and a
> bunch of in-place updates.  Then, a scan comes along with an old
> snapshot.  It seems to me that what you need to do is walk backward
> through the chain of undo records until you see one that has an XID
> that is visible to your snapshot, and then the version of the tuple
> that you want is in the payload of the next-newer undo record. So what
> you want to do is something like this:
>
> look up the undo pointer in the tuple.  call that the current undo record.
> loop:
> - look up the undo pointer in the current undo record.  call that the
> previous undo record.
> - if the XID from the previous undo record is visible, then stop; use
> the tuple version from the current undo record.
> - release the current undo record and let the new current undo record
> be the previous undo record.
>
> I'm not sure if this is actually a reasonable design from a
> performance point of view, but it doesn't sound totally crazy, and
> it's just to illustrate the point that there might be cases that are
> too complicated for a loop-until-true model.  In this kind of loop, at
> any given time you are holding onto two undo records, working your way
> back through the undo log, and you just can't make this work with the
> UndoFetchRecord callback interface. Possibly you could have a context
> object that could hold onto one or a few buffer pins:
>
> BeginUndoFetch(&context);
> uur = UndoFetchRecord(&context, urp); // maybe do this a bunch of times
> FinishUndoFetch(&context);
>
> ...but I'm not sure if that's exactly what we want either. Still, if
> there is a significant savings from avoiding repinning and relocking
> the buffer, we want to make it easy for people to get that advantage
> as often as possible.
>

IIUC, you are proposing to retain the pin and then lock/unlock the
buffer each time in this API.  I think there is no harm in trying out
something on these lines to see if there is any impact on some of the
common scenarios.
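
To make that concrete, the visibility walk sketched above might look like the
following with such a context object.  All names follow the proposed,
non-final API, and XidVisibleInSnapshot() and tuple_undo_ptr() are stand-ins:

	UndoFetchContext ctx;
	UnpackedUndoRecord *cur;
	UnpackedUndoRecord *prev;
	UndoRecPtr	urp = tuple_undo_ptr(tuple);	/* AM-specific stand-in */

	BeginUndoFetch(&ctx);
	cur = UndoFetchRecord(&ctx, urp);	/* pin is kept inside ctx */
	for (;;)
	{
		prev = UndoFetchRecord(&ctx, cur->uur_blkprev);
		if (prev == NULL ||
			XidVisibleInSnapshot(prev->uur_xid, snapshot))
			break;				/* use the version from "cur" */
		UndoRecordRelease(cur);
		cur = prev;
	}
	FinishUndoFetch(&ctx);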

> Another point that has occurred to me is that it is probably
> impossible to avoid a fairly large number of duplicate undo fetches.
> For instance, suppose somebody runs an UPDATE on a tuple that has been
> recently updated.  The tuple_update method just gets a TID + snapshot,
> so the AM basically has to go look up the tuple all over again,
> including checking whether the latest version of the tuple is the one
> visible to our snapshot.  So that means repinning and relocking the
> same buffers and decoding the same undo record all over again.  I'm
> not exactly sure what to do about this, but it seems like a potential
> performance problem. I wonder if it's feasible to cache undo lookups
> so that in common cases we can just reuse the result of a previous
> lookup instead of doing a new one, and I wonder whether it's possible
> to make that fast enough that it actually helps...
>

I think it will be helpful if we can have such a cache, but OTOH, we
can try out such optimizations after the first version as well by
analyzing their benefit.  For zheap, in many cases, the version in the heap
itself is all-visible and/or visible as per the current snapshot, and
the same can be 

Re: postgres_fdw: Minor improvement to postgresAcquireSampleRowsFunc

2019-06-28 Thread Julien Rouhaud
On Fri, Jun 28, 2019 at 11:39 AM Etsuro Fujita  wrote:
>
> In postgresAcquireSampleRowsFunc, we 1) determine the fetch size and
> then 2) construct the fetch command in each iteration of fetching some
> rows from the remote, but that would be totally redundant.

Indeed.

> Attached
> is a patch for removing that redundancy.

It all looks good to me!  I marked it as ready for committer.




Re: allow_system_table_mods stuff

2019-06-28 Thread Peter Eisentraut
Here is a new patch after the discussion.

- Rename allow_system_table_mods to allow_system_table_ddl.

(This makes room for a new allow_system_table_dml, but it's not
implemented here.)

- Make allow_system_table_ddl SUSET.

- Add regression test.

- Remove the behavior that allow_system_table_mods allowed
non-superusers to do DML on catalog tables without further access checking.

I think there was general agreement on all these points.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From b44c7c7b7177a87e777b3b2a7fe10ea739bfaa72 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Fri, 28 Jun 2019 11:36:04 +0200
Subject: [PATCH v2] New and improved allow_system_table_ddl setting

Rename allow_system_table_mods to allow_system_table_ddl to clarify
its meaning.  There is also some discussion about introducing a
parallel allow_system_table_dml setting, so this makes room for that.

Make allow_system_table_ddl settable at run time by superusers.  It
was previously postmaster start only.  We don't necessarily want to
make system catalog DDL wide-open, but there are occasionally useful
things to do like setting reloptions or statistics on a busy system
table, and blocking those doesn't help anyone.  Also, this enables
writing a test suite.

Add a regression test file that exercises the kinds of commands that
allow_system_table_ddl allows.

Previously, allow_system_table_mods allowed a non-superuser to do DML
on a system table without further permission checks.  This has been
removed, as it was quite inconsistent with the rest of the meaning of
this setting.  (Since allow_system_table_mods was previously only
accessible with a server restart, it is unlikely that anyone was using
this possibility.)

Discussion: 
https://www.postgresql.org/message-id/flat/8b00ea5e-28a7-88ba-e848-21528b632354%402ndquadrant.com
---
 doc/src/sgml/config.sgml  |  16 +-
 doc/src/sgml/ref/show.sgml|  10 +-
 src/backend/catalog/aclchk.c  |   5 +-
 src/backend/catalog/catalog.c |   2 +-
 src/backend/catalog/heap.c|  16 +-
 src/backend/catalog/index.c   |  16 +-
 src/backend/commands/indexcmds.c  |   2 +-
 src/backend/commands/policy.c |   6 +-
 src/backend/commands/schemacmds.c |   4 +-
 src/backend/commands/tablecmds.c  |  20 +-
 src/backend/commands/tablespace.c |   4 +-
 src/backend/commands/trigger.c|   6 +-
 src/backend/postmaster/postmaster.c   |   2 +-
 src/backend/rewrite/rewriteDefine.c   |   4 +-
 src/backend/tcop/postgres.c   |   2 +-
 src/backend/utils/cache/relmapper.c   |   2 +-
 src/backend/utils/init/globals.c  |   2 +-
 src/backend/utils/misc/guc.c  |   6 +-
 src/include/catalog/heap.h|   4 +-
 src/include/catalog/index.h   |   4 +-
 src/include/miscadmin.h   |   2 +-
 .../regress/expected/alter_system_table.out   | 170 
 src/test/regress/expected/alter_table.out |  10 +-
 src/test/regress/parallel_schedule|   2 +-
 src/test/regress/serial_schedule  |   1 +
 src/test/regress/sql/alter_system_table.sql   | 187 ++
 src/test/regress/sql/alter_table.sql  |   6 +-
 27 files changed, 431 insertions(+), 80 deletions(-)
 create mode 100644 src/test/regress/expected/alter_system_table.out
 create mode 100644 src/test/regress/sql/alter_system_table.sql

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 84341a30e5..7a4e141a2d 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -9288,17 +9288,19 @@ <title>Developer Options</title>
       </listitem>
      </varlistentry>
 
-     <varlistentry id="guc-allow-system-table-mods" xreflabel="allow_system_table_mods">
-      <term><varname>allow_system_table_mods</varname> (<type>boolean</type>)
+     <varlistentry id="guc-allow-system-table-ddl" xreflabel="allow_system_table_ddl">
+      <term><varname>allow_system_table_ddl</varname> (<type>boolean</type>)
       <indexterm>
-       <primary><varname>allow_system_table_mods</varname> configuration parameter</primary>
+       <primary><varname>allow_system_table_ddl</varname> configuration parameter</primary>
       </indexterm>
       </term>
       <listitem>
        <para>
-        Allows modification of the structure of system tables.
-        This is used by <command>initdb</command>.
-        This parameter can only be set at server start.
+        Allows DDL commands on system tables.  This is otherwise not allowed
+        even for superusers.  This is used by <command>initdb</command>.
+        Inconsiderate use of this setting can cause irretrievable data loss or
+        seriously corrupt the database system.  Only superusers can change
+        this setting.
        </para>
       </listitem>
      </varlistentry>
@@ -9831,7 +9833,7 @@ <title>Short Option Key</title>
     <row>
      <entry><option>-O</option></entry>
-     <entry><literal>allow_system_table_mods = on</literal></entry>
+     <entry><literal>allow_system_table_ddl = on</literal></entry>
     </row>
     <row>
      <entry><option>-p <replaceable>x</replaceable></option></entry>
diff --git a/doc/src/sgml/ref/show.sgml b/doc/src/sgml/ref/show.sgml
index 945b0491b1..c98dc02444 100644
--- a/doc/src/sgml/ref/show.sgml
+++ b/doc/src/sgml/ref/show.sgml
@@ -167,14 +167,14 @@ Examples

postgres_fdw: Minor improvement to postgresAcquireSampleRowsFunc

2019-06-28 Thread Etsuro Fujita
Hi,

In postgresAcquireSampleRowsFunc, we 1) determine the fetch size and
then 2) construct the fetch command in each iteration of fetching some
rows from the remote, but that would be totally redundant.  Attached
is a patch for removing that redundancy.
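
In outline, the change is loop-invariant hoisting.  A sketch, with helper
and variable names assumed rather than taken from the actual postgres_fdw
code:

```c
/* Before: both recomputed on every iteration, with the same result. */
for (;;)
{
	fetch_size = get_remote_fetch_size(server, table);
	snprintf(fetch_sql, sizeof(fetch_sql),
			 "FETCH %d FROM c%u", fetch_size, cursor_number);
	if (!fetch_more_rows(conn, fetch_sql))
		break;
}

/* After: computed exactly once, before the loop. */
fetch_size = get_remote_fetch_size(server, table);
snprintf(fetch_sql, sizeof(fetch_sql),
		 "FETCH %d FROM c%u", fetch_size, cursor_number);
for (;;)
{
	if (!fetch_more_rows(conn, fetch_sql))
		break;
}
```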

I'll add this to the upcoming commitfest.

Best regards,
Etsuro Fujita


postgresAcquireSampleRowsFunc.patch
Description: Binary data


C testing for Postgres

2019-06-28 Thread Adam Berlin
During the unconference at PGCon in Ottawa, I asked about writing C-based
tests for Postgres. There was interest in trying a tool and also some
hesitation to depend on a third-party library. So, I wrote a tool that I'd
like to contribute to Postgres. I’ve been calling it cexpect [1].

cexpect is a general-use library for creating test suites in C. It includes:

- a core library for creating and running suites of tests.

- a standard set of test expectations for C

- a default formatter (dots)

- an extensible matcher framework

- an extensible formatter framework


Why add a testing framework for C to Postgres?

An xUnit-style test framework [2] is not currently an option for people
hacking on Postgres.


   - C-based tests could help increase code coverage in parts of the codebase
     that are difficult to reach with a regress-style test (for example: gin
     posting list compression [3]).

   - Writing tests for internal components helps developers become more
     familiar with a codebase.

   - C-based tests go beyond providing regression value by giving feedback
     on design decisions like modularity and dependencies.

   - Test suites already existing in `src/test/modules` could benefit from
     having a consistent way to declare expectations, run suites, and display
     results.

   - Forks and extensions that write tests could benefit from a testing
     framework provided by the upstream project. An extensible framework will
     allow forks and extensions to create their own matchers and formatters
     without changing the core framework.



The extensible matcher framework has benefits for decoupled, isolated unit
tests as well as for high-level system tests. The complexity of a common
assertion can be packaged into a matcher, abstracting that complexity away
and making the assertion easier to reuse.

For example, there could be expectation matchers specifically crafted for
the domain of Postgres:

`expect(xlog, to_have_xlog_record(XLOG_XACT_PREPARE))`

`expect("postgresadmin", to_be_an_admin_user())`

The matchers that come with cexpect out of the box are for C datatypes:

`expect(1, is_int_equal_to(1))`

`expect(1 == 2, is_false())`

`expect("some random string", is_string_containing("and"))`

… and more, with the goal of having a matcher for all standard data types.
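
As an illustration of the general shape of a matcher (this is not
cexpect's actual extension API; the types and hooks here are invented for
the sketch), a matcher bundles a predicate with a failure description:

```c
#include <stdbool.h>
#include <string.h>

typedef struct Matcher
{
	bool		(*matches)(const void *actual, const void *expected);
	const char *description;	/* used when formatting a failure */
	const void *expected;
} Matcher;

static bool
match_string_containing(const void *actual, const void *expected)
{
	return strstr((const char *) actual, (const char *) expected) != NULL;
}

static Matcher
sketch_is_string_containing(const char *fragment)
{
	Matcher		m = {match_string_containing, "a string containing", fragment};

	return m;
}
```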


The extensible formatter framework could be used to create a test-anything
protocol (TAP) output, familiar to Postgres hackers. It could also be used
to insert test results into a database table for later analysis, or create
a consistent way of reporting results from a user-defined function - the
pattern often used for creating tests under `src/test/modules`.


How does it work?

Create an executable that links to the core shared library, include the
core library headers, create a suite, add some tests, and run it.

test.c:


```c
#include "cexpect.h"
#include "cexpect_cmatchers.h"

void some_passing_test(Test *test) {
    expect(test, 1, is_int_equal_to(1));
}

int main(int argc, char *args[]) {
    Suite *suite = create_suite("Example test");

    add_test(suite, some_passing_test);

    start_cexpect(suite);
}
```

Running test.c:


```bash
export DYLD_LIBRARY_PATH=$path_to_cexpect_library
export compile_flags="-Wno-int-conversion -Wno-pointer-to-int-cast test.c -L $path_to_cexpect_library -I $path_to_cexpect_headers"

gcc $compile_flags -l cexpect -o test.o
./test.o

Running suite: Example test
.

Summary:
Ran 1 test(s).
1 passed, 0 failed, 0 pending
```

Rather than post a patch right away, I'd like to start a conversation
first. I'm guessing there are some improvements we'd want to make (for
example: the Makefile) before committing a patch. Let's iterate on those
improvements before creating a formal patch.


Thoughts?


Thanks,

Adam Berlin

Software Engineer at Pivotal Greenplum

[1] https://github.com/berlin-ab/cexpect

[2] https://en.wikipedia.org/wiki/XUnit

[3]
https://coverage.postgresql.org/src/backend/access/gin/ginpostinglist.c.gcov.html


Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread Sergei Kornilov
Hello

Is this commitfest for small patches and bugfixes, similar to the 2018-07
one last year?

regards, Sergei




Re: Commitfest 2019-07, the first of five* for PostgreSQL 13

2019-06-28 Thread Fabien COELHO



Hello Thomas,

> The first Commitfest[1] for the next major release of PostgreSQL
> begins in a few days, and runs for the month of July.
>
> There are 218 patches registered[2] right now,

ISTM that there are a couple of duplicates: 2084 & 2150, 2119 & 2180?

> I volunteered to be the CF manager for this one, and Jonathan Katz
> kindly offered to help me[3].

Thanks for volunteering, and good luck with this task!

--
Fabien.




Re: cleanup & refactoring on reindexdb.c

2019-06-28 Thread Julien Rouhaud
On Mon, May 13, 2019 at 5:09 AM Michael Paquier  wrote:
>
> On Sun, May 12, 2019 at 11:16:28AM +0200, Julien Rouhaud wrote:
> > I attach two patches to fix both (it could be squashed in a single
> > commit as both are straightforward), for upcoming v13.
>
> Squashing both patches together makes the most sense in my opinion as
> the same areas are reworked.  I can notice that you have applied
> pgindent, but the indentation got a bit messed up because the new enum
> ReindexType is missing from typedefs.list.
>
> I have reworked a bit your patch as per the attached, tweaking a
> couple of places like reordering the elements in ReindexType,
> reviewing the indentation, etc.  At the end I can see more reasons to
> use multiple switch/case points as if we add more options in the
> future then we have more code paths to take care of.  These would
> unlikely get forgotten, but there is no point to take this risk
> either, and that would simplify future patches.  It is also possible
> to group some types together when assigning the object name similarly
> to what's on HEAD.

Thanks!  I'm fine with the changes.

The patch does not apply anymore, so here's a rebased version.
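
(For orientation, the ReindexType enum being discussed groups the targets
roughly as below; this is reconstructed from the discussion, and the
attached patch is authoritative:)

```c
typedef enum ReindexType
{
	REINDEX_DATABASE,
	REINDEX_INDEX,
	REINDEX_SCHEMA,
	REINDEX_SYSTEM,
	REINDEX_TABLE
} ReindexType;
```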


reindex-refactor-v3.patch
Description: Binary data


Re: catalog files simplification

2019-06-28 Thread Peter Eisentraut
On 2019-06-12 15:34, Tom Lane wrote:
> A bigger objection might be that this would leave us with no obvious-
> to-the-untrained-eye declaration point for either the struct name or
> the two typedef names.  That might make tools like ctags sad.  Perhaps
> it's not really any worse than today, but it bears investigation.

At least with GNU Global, it finds FormData_pg_foo but not Form_pg_foo.
But you can find the latter using grep.  This patch would hide both of
those even from grep, so maybe it isn't a good idea then.
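
For reference, the pattern in question looks like this in the catalog
headers today (abridged from pg_proc.h); both names appear literally, so
a plain grep can at least find the typedef line:

```c
CATALOG(pg_proc,1255,ProcedureRelationId)
{
	NameData	proname;		/* procedure name */
	Oid			pronamespace;	/* OID of namespace containing this proc */
	/* ... many more columns ... */
} FormData_pg_proc;

typedef FormData_pg_proc *Form_pg_proc;
```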

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: SQL/JSON path issues/questions

2019-06-28 Thread Oleg Bartunov
On Fri, Jun 28, 2019 at 8:10 AM Alvaro Herrera  wrote:
>
> On 2019-Jun-28, Alexander Korotkov wrote:
>
> > On Tue, Jun 25, 2019 at 6:38 PM Liudmila Mantrova
> >  wrote:
> > > Thank you for the catch! Please see the modified version of patch 0004
> > > attached.
> >
> > I tried to review and revise the part related to filters, but I failed
> > because I don't understand the notions used in the documentation.
> >
> > What is the difference between filter expression and filter condition?
> >  I can guess that filter expression contains question mark,
> > parentheses and filter condition inside.  But this sentence is in
> > contradiction with my guess: "A filter expression must be enclosed in
> > parentheses and preceded by a question mark".  So, filter expression
> > is inside the parentheses.  Then what is filter condition?  The same?
>
> The SQL standard defines "JSON filter expressions" (in 9.39 of the 2016
> edition).  It does not use either term "filter condition" nor bare
> "filter"; it uses "JSON path predicate" which is the part of the JSON
> filter expression that is preceded by the question mark and enclosed by
> parens.

Yes, this is what I used in my talk
http://www.sai.msu.su/~megera/postgres/talks/jsonpath-ibiza-2019.pdf

>
> Maybe we should stick with the standard terminology ...

Sure.
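
To keep the standard terms straight, a concrete example using
jsonb_path_query():

```sql
-- '? (@.price > 10)' is the JSON filter expression; the part inside the
-- parentheses, '@.price > 10', is the JSON path predicate.
SELECT jsonb_path_query('[{"price": 5}, {"price": 15}]',
                        '$[*] ? (@.price > 10)');
-- returns: {"price": 15}
```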

As for the jsonpath documentation, I think we should remember that
jsonpath is a part of SQL/JSON, and in the following releases we will
expand the documentation to include the SQL/JSON functions, so I suggest
having one chapter on SQL/JSON with the following structure:

1.  Introduction
1.1 SQL/JSON data model
1.2 SQL/JSON path language
1.3 -- to be added
2.  PostgreSQL implementation
2.1 jsonpath data type                -- link from json data types
2.2 jsonpath functions and operators  -- link from functions
2.3 Indexing

I plan to work on a separate chapter "JSON handling in PostgreSQL" for
PG13, which will include the JSON(b) data types, functions, indexing, and
SQL/JSON.

>
> --
> Álvaro Herrera                https://www.2ndQuadrant.com/
> PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>
>


-- 
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company