Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread Michael Paquier
On Tue, Feb 05, 2019 at 05:46:55PM +1300, David Rowley wrote:
> I did leave a couple untouched as there was quite a bit of escaping
> going on already. I didn't think switching between \Q and \E would
> have made those ones any more pleasing to the eye.

-qr/^ALTER TABLE dump_test.test_table ENABLE ROW LEVEL SECURITY;/m,
+qr/^\QALTER TABLE dump_test.test_table ENABLE ROW LEVEL SECURITY;/m,
I would have just appended the \E for consistency at the end of the
strings.  Except that it looks fine.  No need to send an updated
version, it seems that you have all the spots.  I'll do an extra pass
on it tomorrow and see if I can commit it.
--
Michael


signature.asc
Description: PGP signature


Re: Memory contexts reset for trigger invocations

2019-02-04 Thread Haribabu Kommi
On Tue, Feb 5, 2019 at 4:29 PM Andres Freund  wrote:

> Hi,
>
> trigger.c goes through some trouble to free the tuples returned by
> trigger functions. There's plenty codepaths that look roughly like:
> if (oldtuple != newtuple && oldtuple != slottuple)
> heap_freetuple(oldtuple);
> if (newtuple == NULL)
> {
> if (trigtuple != fdw_trigtuple)
> heap_freetuple(trigtuple);
> return NULL;/* "do nothing" */
> }
>
> but we, as far as I can tell, do not reset the memory context in which
> the trigger functions have been called.
>
> Wouldn't it be better to reset an appropriate context after each
> invocation? Yes, that'd require some care to manage the lifetime of
> tuples returned by triggers, but that seems OK?
>

Currently the trigger functions are executed under a per-tuple memory
context, but the returned tuples are allocated in the executor's memory
context in some scenarios.

   /*
* Copy tuple to upper executor memory.  But if user just did
* "return new" or "return old" without changing anything, there's
* no need to copy; we can return the original tuple (which will
* save a few cycles in trigger.c as well as here).
*/
   if (rettup != trigdata->tg_newtuple &&
rettup != trigdata->tg_trigtuple)
rettup = SPI_copytuple(rettup);

We would need to take care of these cases too before resetting such a
context, perhaps as in the sketch below?
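
A minimal sketch of what that could look like; CallTriggerFunction() is a
hypothetical stand-in for the fmgr call done in ExecCallTriggerFunc(), and
none of this is the actual trigger.c code:

#include "postgres.h"
#include "access/htup_details.h"
#include "commands/trigger.h"
#include "utils/memutils.h"

static HeapTuple
call_trigger_with_reset(TriggerData *trigdata, MemoryContext percall_cxt)
{
	MemoryContext oldcxt = MemoryContextSwitchTo(percall_cxt);
	HeapTuple	rettuple;

	rettuple = CallTriggerFunction(trigdata);	/* hypothetical */

	MemoryContextSwitchTo(oldcxt);

	/*
	 * Copy the surviving tuple into the caller's context so the reset
	 * below cannot free it; tg_trigtuple and tg_newtuple live elsewhere
	 * already and need no copy.
	 */
	if (rettuple != NULL &&
		rettuple != trigdata->tg_trigtuple &&
		rettuple != trigdata->tg_newtuple)
		rettuple = heap_copytuple(rettuple);

	MemoryContextReset(percall_cxt);
	return rettuple;
}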

>
Regards,
Haribabu Kommi
Fujitsu Australia


Re: Online verification of checksums

2019-02-04 Thread Andres Freund
Hi,

On 2019-02-05 06:57:06 +0100, Fabien COELHO wrote:
> > > > I'm wondering (possibly again) about the existing early exit if one 
> > > > block
> > > > cannot be read on retry: the command should count this as a kind of bad
> > > > block, proceed on checking other files, and obviously fail in the end, 
> > > > but
> > > > having checked everything else and generated a report. I do not think 
> > > > that
> > > > this condition warrants a full stop. ISTM that under rare race 
> > > > conditions
> > > > (eg, an unlucky concurrent "drop database" or "drop table") this could
> > > > happen when online, although I could not trigger one despite heavy 
> > > > testing,
> > > > so I'm possibly mistaken.
> > > 
> > > This seems like a defensible judgement call either way.
> > 
> > Right now we have a few tests that explicitly check that
> > pg_verify_checksums fails on broken data ("foo" in the file).  Those
> > would then just get skipped AFAICT, which I think is the worse
> > behaviour, but if everybody thinks that should be the way to go, we can
> > drop/adjust those tests and make pg_verify_checksums skip them.
> > 
> > Thoughts?
> 
> My point is that it should fail as it does, only not immediately (early
> exit), but after having checked everything else. This means avoiding
> calls to "exit(1)" here and there (lseek, fopen...), instead taking note
> that something bad happened, and calling exit only at the end.

I can see both as being valuable (one gives you a more complete picture,
the other a quicker answer in scripts). For me that's the point where
it's the prerogative of the author to make that choice.

Greetings,

Andres Freund



Re: Online verification of checksums

2019-02-04 Thread Fabien COELHO



Hello Michael,


I'm wondering (possibly again) about the existing early exit if one block
cannot be read on retry: the command should count this as a kind of bad
block, proceed on checking other files, and obviously fail in the end, but
having checked everything else and generated a report. I do not think that
this condition warrants a full stop. ISTM that under rare race conditions
(eg, an unlucky concurrent "drop database" or "drop table") this could
happen when online, although I could not trigger one despite heavy testing,
so I'm possibly mistaken.


This seems like a defensible judgement call either way.


Right now we have a few tests that explicitly check that
pg_verify_checksums fails on broken data ("foo" in the file).  Those
would then just get skipped AFAICT, which I think is the worse
behaviour, but if everybody thinks that should be the way to go, we can
drop/adjust those tests and make pg_verify_checksums skip them.

Thoughts?


My point is that it should fail as it does, only not immediately (early
exit), but after having checked everything else. This means avoiding
calls to "exit(1)" here and there (lseek, fopen...), instead taking note
that something bad happened, and calling exit only at the end.
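
A minimal sketch of that control flow; next_file() and scan_file() are
hypothetical placeholders for the command's existing directory walk and
per-file block checking:

#include <stdio.h>
#include <stdbool.h>

/* hypothetical: iterate over the data files / verify one file's blocks */
extern const char *next_file(void);
extern bool scan_file(const char *path, long *badblocks);

int
main(void)
{
	long		badblocks = 0;
	long		skippedfiles = 0;
	const char *path;

	while ((path = next_file()) != NULL)
	{
		/* note read failures instead of calling exit(1), keep scanning */
		if (!scan_file(path, &badblocks))
			skippedfiles++;
	}

	printf("Bad blocks: %ld\nUnreadable files: %ld\n",
		   badblocks, skippedfiles);

	/* fail only at the end, after the full report has been produced */
	return (badblocks > 0 || skippedfiles > 0) ? 1 : 0;
}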


--
Fabien.



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread David G. Johnston
On Monday, February 4, 2019, David Rowley 
wrote:

> On Tue, 5 Feb 2019 at 01:12, Daniel Gustafsson  wrote:
> > We may also want to use the + metacharacter instead of * in a few
> > places, since the intent is to always match something, where matching
> > nothing should be considered an error:
> >
> > - qr/^ALTER TEXT SEARCH DICTIONARY dump_test.alt_ts_dict1 OWNER TO .*;/m,
> > + qr/^ALTER TEXT SEARCH DICTIONARY dump_test\.alt_ts_dict1 OWNER TO .*;/m,
>
> I looked for instances of * alone and didn't see any. I only saw ones
> prefixed with ".", in which case, isn't that matching 1 or more chars
> already?


No.  In Regex the following are equivalent:

.* == .{0,}
.+ == .{1,}
. == .{1}

A “*” by itself would either be an error or, assuming the preceding
character is a space (so it visually looks alone), would match zero or
more consecutive spaces.

In the above, “...OWNER TO ;” is a valid match.
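
For what it's worth, the same point can be shown with POSIX regexes in C
(not the Perl engine the tap tests use, but the quantifier semantics
match):

#include <regex.h>
#include <stdio.h>

static void
try_match(const char *pattern, const char *subject)
{
	regex_t		re;

	regcomp(&re, pattern, REG_EXTENDED);
	printf("%-14s vs \"%s\": %s\n", pattern, subject,
		   regexec(&re, subject, 0, NULL, 0) == 0 ? "match" : "no match");
	regfree(&re);
}

int
main(void)
{
	try_match("OWNER TO .*;", "OWNER TO ;");	/* match: .* takes empty */
	try_match("OWNER TO .+;", "OWNER TO ;");	/* no match */
	try_match("OWNER TO .+;", "OWNER TO bob;");	/* match */
	return 0;
}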

David J.


Memory contexts reset for trigger invocations

2019-02-04 Thread Andres Freund
Hi,

trigger.c goes through some trouble to free the tuples returned by
trigger functions. There's plenty codepaths that look roughly like:
if (oldtuple != newtuple && oldtuple != slottuple)
heap_freetuple(oldtuple);
if (newtuple == NULL)
{
if (trigtuple != fdw_trigtuple)
heap_freetuple(trigtuple);
return NULL;/* "do nothing" */
}

but we, as far as I can tell, do not reset the memory context in which
the trigger functions have been called.

Wouldn't it be better to reset an appropriate context after each
invocation? Yes, that'd require some care to manage the lifetime of
tuples returned by triggers, but that seems OK?

I get that most tables don't have dozens of triggers, but for more
complicated triggers/functions even a few seem like they'd matter?

Greetings,

Andres Freund



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread David Rowley
On Tue, 5 Feb 2019 at 01:12, Daniel Gustafsson  wrote:
> We may also want to use the + metacharacter instead of * in a few places,
> since the intent is to always match something, where matching nothing
> should be considered an error:
>
> - qr/^ALTER TEXT SEARCH DICTIONARY dump_test.alt_ts_dict1 OWNER TO .*;/m,
> + qr/^ALTER TEXT SEARCH DICTIONARY dump_test\.alt_ts_dict1 OWNER TO .*;/m,

I looked for instances of * alone and didn't see any. I only saw ones
prefixed with ".", in which case, isn't that matching 1 or more chars
already?

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread David Rowley
On Tue, 5 Feb 2019 at 14:41, Michael Paquier  wrote:
>  Instead of the approach you are
> proposing, perhaps it would make sense to extend this way of doing
> things then?  For example some tests with CREATE CONVERSION do so.  It
> looks much more portable than having to escape every dot.

I'm not particularly excited either way, but here's a patch with it that way.

I did leave a couple untouched as there was quite a bit of escaping
going on already. I didn't think switching between \Q and \E would
have made those ones any more pleasing to the eye.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


tighten_up_pg_dump_tap_tests_v2.patch
Description: Binary data


Re: Inadequate executor locking of indexes

2019-02-04 Thread David Rowley
On Wed, 28 Nov 2018 at 01:55, David Rowley  wrote:
> If this looks like a good path to go in, then I can produce something
> a bit more finished. I'm just a bit unsure when exactly I can do that
> as I'm on leave and have other commitments to take care of.

This patch is still on my list, so I had another look at what I did
back in November...

I've changed a couple of things:

1. Changed nodeBitmapIndexscan.c so that it now properly uses the
RangeTblEntry's idxlockmode field.
2. Renamed a few variables in finalize_lockmodes().

I'm keen to get some feedback on whether we should go about fixing
things this way.  One thing that's still on my mind is that the parser
is still at
risk of lock upgrade hazards. This patch only fixes the executor.  I
don't quite see how it would be possible to fix the same in the
parser.

I was also looking at each call site that calls ExecOpenIndices(). I
don't think it's great that ExecInitModifyTable() has its own logic to
skip calling that function for DELETE.  I wondered if it shouldn't
somehow depend on what the idxlockmode is set to.  I also saw that
apply_handle_delete() makes a call to ExecOpenIndices(). I don't think
that one is needed, but I didn't test anything to make sure. Maybe
that's for another thread anyway.

Updated patch is attached.

Adding to the March commitfest as a bug fix.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


idxlockmode_and_lock_upgrade_fix_v2.patch
Description: Binary data


Re: Feature: temporary materialized views

2019-02-04 Thread Michael Paquier
On Mon, Feb 04, 2019 at 04:10:09PM +0100, Andreas Karlsson wrote:
> Should I submit it as a separate CF entry or is it easiest if my refactoring
> and Mi Tar's feature are reviewed together?

The refactoring patch is talking about changing the way objects are
created within a CTAS, which is quite different from what is proposed
on this thread, so in order to attract the correct audience a separate
thread with another CF entry seems more appropriate.

Now...  You have on this thread all the audience which already worked
on 874fe3a.  And I am just looking at this patch, evaluating the
behavior change this is introducing.  Still I would recommend a
separate thread as others may want to comment on that particular
point.
--
Michael


signature.asc
Description: PGP signature


Re: Log a sample of transactions

2019-02-04 Thread Michael Paquier
On Mon, Feb 04, 2019 at 04:32:05PM +0100, Adrien NAYRAT wrote:
> Could it be acceptable if I set log_min_duration_statement = 0?

At least this has the merit of making the behavior not rely on
randomness.  I have not looked at the patch in detail so I cannot say
for sure, but I am not sure it is worth the extra cycles to have the
tests integrated, so they may be removed from the final commit if they
are too expensive for little value.  It is always helpful to have
deterministic tests to give the feature a try and ease review, of
course.
--
Michael


signature.asc
Description: PGP signature


Re: Allow some recovery parameters to be changed with reload

2019-02-04 Thread Michael Paquier
On Mon, Feb 04, 2019 at 11:58:28AM +0100, Peter Eisentraut wrote:
> I think the recovery parameters
> 
> archive_cleanup_command

Only triggered by the checkpointer.

> promote_trigger_file
> recovery_end_command
> recovery_min_apply_delay

Only looked at by the startup process.

> can be changed from PGC_POSTMASTER to PGC_SIGHUP without any further
> complications (unlike for example primary_conninfo, which is being
> discussed elsewhere).

I agree that this subset is straight-forward and safe to switch.  The
documentation changes look right.
--
Michael


signature.asc
Description: PGP signature


Re: Reduce amount of WAL generated by CREATE INDEX for gist, gin and sp-gist

2019-02-04 Thread Andrey Lepikhov
On 04.02.2019 10:04, Michael Paquier wrote:
> On Tue, Dec 18, 2018 at 10:41:48AM +0500, Andrey Lepikhov wrote:
>> Ok. It is used only for demonstration.
> 
> The latest patch set needs a rebase, so moved to next CF, waiting on
> author as this got no reviews.
The new version is attached.
> --
> Michael
> 

-- 
Andrey Lepikhov
Postgres Professional
https://postgrespro.com
The Russian Postgres Company
From afc270f37cd082fb6e9f4ad694ea4cb123d98062 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Tue, 5 Feb 2019 08:11:47 +0500
Subject: [PATCH 1/4] Relation-into-WAL-function

---
 src/backend/access/spgist/spgutils.c  |  1 +
 src/backend/access/transam/generic_xlog.c | 39 +++
 src/include/access/generic_xlog.h |  3 ++
 3 files changed, 43 insertions(+)

diff --git a/src/backend/access/spgist/spgutils.c b/src/backend/access/spgist/spgutils.c
index 8e63c1fad2..b782bc2338 100644
--- a/src/backend/access/spgist/spgutils.c
+++ b/src/backend/access/spgist/spgutils.c
@@ -550,6 +550,7 @@ SpGistInitBuffer(Buffer b, uint16 f)
 {
 	Assert(BufferGetPageSize(b) == BLCKSZ);
 	SpGistInitPage(BufferGetPage(b), f);
+	MarkBufferDirty(b);
 }
 
 /*
diff --git a/src/backend/access/transam/generic_xlog.c b/src/backend/access/transam/generic_xlog.c
index 5b00b7275b..643da48345 100644
--- a/src/backend/access/transam/generic_xlog.c
+++ b/src/backend/access/transam/generic_xlog.c
@@ -542,3 +542,42 @@ generic_mask(char *page, BlockNumber blkno)
 
 	mask_unused_space(page);
 }
+
+/*
+ * Function to write generic xlog for every existing block of a relation.
+ * Caller is responsible for locking the relation exclusively.
+ */
+void
+generic_log_relation(Relation rel)
+{
+	BlockNumber blkno;
+	BlockNumber nblocks = RelationGetNumberOfBlocks(rel);
+
+	for (blkno = 0; blkno < nblocks; )
+	{
+		GenericXLogState	*state;
+		Buffer			buffer[MAX_GENERIC_XLOG_PAGES];
+		int	i,
+			blocks_pack;
+
+		CHECK_FOR_INTERRUPTS();
+
+		blocks_pack = ((nblocks-blkno) < MAX_GENERIC_XLOG_PAGES) ?
+		(nblocks-blkno) : MAX_GENERIC_XLOG_PAGES;
+
+		state = GenericXLogStart(rel);
+
+		for (i = 0 ; i < blocks_pack; i++)
+		{
+			buffer[i] = ReadBuffer(rel, blkno);
+			LockBuffer(buffer[i], BUFFER_LOCK_EXCLUSIVE);
+			GenericXLogRegisterBuffer(state, buffer[i], GENERIC_XLOG_FULL_IMAGE);
+			blkno++;
+		}
+
+		GenericXLogFinish(state);
+
+		for (i = 0 ; i < blocks_pack; i++)
+			UnlockReleaseBuffer(buffer[i]);
+	}
+}
diff --git a/src/include/access/generic_xlog.h b/src/include/access/generic_xlog.h
index cb5b5b713a..e3bbf014cc 100644
--- a/src/include/access/generic_xlog.h
+++ b/src/include/access/generic_xlog.h
@@ -42,4 +42,7 @@ extern const char *generic_identify(uint8 info);
 extern void generic_desc(StringInfo buf, XLogReaderState *record);
 extern void generic_mask(char *pagedata, BlockNumber blkno);
 
+/* other utils */
+extern void generic_log_relation(Relation rel);
+
 #endif			/* GENERIC_XLOG_H */
-- 
2.17.1

From 25dc935da9d5502a06fb9dc1eea4912fa0f48be1 Mon Sep 17 00:00:00 2001
From: "Andrey V. Lepikhov" 
Date: Tue, 5 Feb 2019 08:13:29 +0500
Subject: [PATCH 2/4] GIN-Optimal-WAL-Usage

---
 src/backend/access/gin/ginbtree.c |  6 ++---
 src/backend/access/gin/gindatapage.c  | 10 
 src/backend/access/gin/ginentrypage.c |  2 +-
 src/backend/access/gin/gininsert.c| 30 ++--
 src/backend/access/gin/ginutil.c  |  4 ++--
 src/backend/access/gin/ginvacuum.c|  2 +-
 src/backend/access/gin/ginxlog.c  | 33 ---
 src/backend/access/rmgrdesc/gindesc.c |  6 -
 src/include/access/gin.h  |  3 ++-
 src/include/access/ginxlog.h  |  2 --
 10 files changed, 27 insertions(+), 71 deletions(-)

diff --git a/src/backend/access/gin/ginbtree.c b/src/backend/access/gin/ginbtree.c
index 533949e46a..9f82eef8c3 100644
--- a/src/backend/access/gin/ginbtree.c
+++ b/src/backend/access/gin/ginbtree.c
@@ -396,7 +396,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 		/* It will fit, perform the insertion */
 		START_CRIT_SECTION();
 
-		if (RelationNeedsWAL(btree->index))
+		if (RelationNeedsWAL(btree->index) && !btree->isBuild)
 		{
 			XLogBeginInsert();
 			XLogRegisterBuffer(0, stack->buffer, REGBUF_STANDARD);
@@ -417,7 +417,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 			MarkBufferDirty(childbuf);
 		}
 
-		if (RelationNeedsWAL(btree->index))
+		if (RelationNeedsWAL(btree->index) && !btree->isBuild)
 		{
 			XLogRecPtr	recptr;
 			ginxlogInsert xlrec;
@@ -595,7 +595,7 @@ ginPlaceToPage(GinBtree btree, GinBtreeStack *stack,
 		}
 
 		/* write WAL record */
-		if (RelationNeedsWAL(btree->index))
+		if (RelationNeedsWAL(btree->index) && !btree->isBuild)
 		{
 			XLogRecPtr	recptr;
 
diff --git a/src/backend/access/gin/gindatapage.c b/src/backend/access/gin/gindatapage.c
index 3ad8b76710..a73cf944b3 100644
--- a/src/backend/access/gin/gindatapage.c
+++ b/src/backend/access/gin/gindatapage.c
@@ -593,7 +593,7 @@ 

Re: [HACKERS] Block level parallel vacuum

2019-02-04 Thread Haribabu Kommi
On Fri, Feb 1, 2019 at 8:19 AM Masahiko Sawada 
wrote:

> On Wed, Jan 30, 2019 at 2:06 AM Haribabu Kommi 
> wrote:
> >
> >
> >
> >
> > + * Before starting parallel index vacuum and parallel cleanup index we launch
> > + * parallel workers. All parallel workers will exit after processed all indexes
> >
> > parallel vacuum index and parallel cleanup index?
> >
> >
>
> ISTM we're using terms like "index vacuuming", "index cleanup" and "FSM
> vacuuming" in vacuumlazy.c, so maybe "parallel index vacuuming" and
> "parallel index cleanup" would be better?
>

OK.


> > + /*
> > + * If there is already-updated result in the shared memory we
> > + * use it. Otherwise we pass NULL to index AMs and copy the
> > + * result to the shared memory segment.
> > + */
> > + if (lvshared->indstats[idx].updated)
> > + result = &(lvshared->indstats[idx].stats);
> >
> > I didn't really find a need of the flag to differentiate the stats
> > pointer from first run to second run? I don't see any problem in
> > passing the stats directly, and the same stats are updated on the
> > worker side and the leader side. Anyway no two processes will do the
> > index vacuum at the same time. Am I missing something?
> >
> > Even if this flag is to identify whether the stats are updated or not
> > before writing them, I don't see a need for it compared to normal
> > vacuum.
> >
>
> Passing stats = NULL to amvacuumcleanup and ambulkdelete means the
> first-time execution. For example, btvacuumcleanup skips cleanup if
> it's not NULL. In the normal vacuum we pass NULL to ambulkdelete or
> amvacuumcleanup on the first call, and they store the result stats in
> locally allocated memory. Therefore in the parallel vacuum I think that
> both worker and leader need to move the stats to the shared memory and
> mark them as updated, as different workers could vacuum different
> indexes the next time.
>

OK, understood the point. But for btbulkdelete, whenever the stats are
NULL it allocates the memory, so I don't see a problem with it.

The only problem is with btvacuumcleanup: when there are no dead tuples
present in the table, btbulkdelete is not called and btvacuumcleanup is
called directly at the end of vacuum. In that scenario there is a
difference in code flow based on the stats. So why can't we use the
number of dead tuples to differentiate, instead of adding another flag?
Also, this scenario is not very common, so avoiding the memcpy for
normal operations would be better. It may be a small gain; I just
thought of it.
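
For reference, a rough sketch of the handoff under discussion, assuming a
shared slot per index (called LVSharedIndStats here, with an updated flag
as in the patch); this is not the patch's actual code:

#include "postgres.h"
#include "access/genam.h"

typedef struct LVSharedIndStats
{
	bool		updated;		/* stats below are valid */
	IndexBulkDeleteResult stats;
} LVSharedIndStats;

static void
parallel_vacuum_one_index(IndexVacuumInfo *ivinfo,
						  LVSharedIndStats *shstat,
						  IndexBulkDeleteCallback callback,
						  void *callback_state)
{
	IndexBulkDeleteResult *stats;

	/* NULL on the very first call tells the AM to allocate fresh stats */
	stats = shstat->updated ? &shstat->stats : NULL;

	stats = index_bulk_delete(ivinfo, stats, callback, callback_state);

	/*
	 * If the AM allocated the result locally, copy it into the shared
	 * slot so that whichever process handles this index next sees it.
	 */
	if (stats != NULL && stats != &shstat->stats)
	{
		memcpy(&shstat->stats, stats, sizeof(IndexBulkDeleteResult));
		shstat->updated = true;
		pfree(stats);
	}
}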


>
> > + initStringInfo();
> > + appendStringInfo(,
> > + ngettext("launched %d parallel vacuum worker %s (planned: %d",
> > +   "launched %d parallel vacuum workers %s (planned: %d",
> > +   lvstate->pcxt->nworkers_launched),
> > + lvstate->pcxt->nworkers_launched,
> > + for_cleanup ? "for index cleanup" : "for index vacuum",
> > + lvstate->pcxt->nworkers);
> > + if (lvstate->options.nworkers > 0)
> > + appendStringInfo(, ", requested %d", lvstate->options.nworkers);
> >
> > what is the difference between planned workers and requested workers,
> > aren't both the same?
>
> The request is the parallel degree that is specified explicitly by the
> user, whereas the planned is the actual number we planned based on the
> number of indexes the table has. For example, if we do 'VACUUM
> (PARALLEL 3000) tbl' where tbl has 4 indexes, the request is 3000
> and the planned is 4. Also, if max_parallel_maintenance_workers is 2,
> the planned is 2.
>

OK.

Regards,
Haribabu Kommi
Fujitsu Australia


Re: WIP: Avoid creation of the free space map for small tables

2019-02-04 Thread Amit Kapila
On Mon, Feb 4, 2019 at 2:27 PM John Naylor 
wrote:
>
> On Mon, Feb 4, 2019 at 8:41 AM Amit Kapila 
wrote:
> >
> > The change seems to have worked.  All the buildfarm machines that were
> > showing the failure are passed now.
>
> Excellent!
>
> Now that the buildfarm is green as far as this patch goes,
>

There is still one recent failure which I don't think is related to commit
of this patch:

https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=bowerbird&dt=2019-02-04%2016%3A38%3A48

== pgsql.build/src/bin/pg_ctl/tmp_check/log/004_logrotate_primary.log ===
TRAP: FailedAssertion("!(UsedShmemSegAddr != ((void *)0))", File:
"g:\prog\bf\root\head\pgsql.build\src\backend\port\win32_shmem.c", Line:
513)

I think we need to do something about this random failure, but not as part
of this thread/patch.


> I will
> touch on a few details to round out development in this area:
>
> 1. Earlier, I had a test to ensure that free space towards the front
> of the relation was visible with no FSM. In [1], I rewrote it without
> using vacuum, so we can consider adding it back now if desired. I can
> prepare a patch for this.
>

Yes, this is required.  It is generally good practice to add a test
(unless it takes a lot of time) which covers new code/functionality.

> 2. As a follow-on, since we don't rely on vacuum to remove dead rows,
> we could try putting the fsm.sql test in some existing group in the
> parallel schedule, rather than its own group is it is now.
>

+1.

> 3. While looking at 0d1fe9f74e, it occurred to me that I ignored this
> patch's effects on GetRecordedFreeSpace(), which will return zero for
> tables with no FSM.
>

Right, but what exactly do we want to do for it?  Do you want to add a
comment atop this function?

> The other callers are in:
> contrib/pg_freespacemap/pg_freespacemap.c
> contrib/pgstattuple/pgstatapprox.c
>
> For pg_freespacemap, this doesn't matter, since it's just reporting
> the facts. For pgstattuple_approx(), it might under-estimate the free
> space and over-estimate the number of live tuples.
>

Sure, but it can do so even without the patch, if vacuum hasn't updated
the free space map.

> This might be fine,
> since it is approximate after all, but maybe a comment would be
> helpful. If this is a problem, we could tweak it to be more precise
> for tables without FSMs.
>

Sounds reasonable to me.

> Thoughts?
>
> 4. The latest patch for the pg_upgrade piece was in [2]
>

It will be good if we get this one as well.  I will look into it once we
are done with the other points you have mentioned.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com


RE: Protect syscache from bloating with negative cache entries

2019-02-04 Thread Ideriha, Takeshi
>From: br...@momjian.us [mailto:br...@momjian.us]
>On Mon, Feb 4, 2019 at 08:23:39AM +, Tsunakawa, Takayuki wrote:
>> Horiguchi-san, Bruce, all, So, why don't we make
>> syscache_memory_target the upper limit on the total size of all
>> catcaches, and rethink the past LRU management?
>
>I was going to say that our experience with LRU has been that the
>overhead is not worth the value, but that was in shared resource cases,
>which this is not.

One idea is to build a list with an access counter for implementing the
LRU list on top of the current patch. The list is ordered by last access
time: when a catcache entry is referenced, the list is maintained, which
is just a few pointer manipulations. As Bruce mentioned, it's not
shared, so there is no cost related to lock contention.

When it comes to pruning, cache entries older than a certain timestamp
with a zero access counter are pruned. This would improve performance
because it only scans a limited range (bounded by sys_cache_min_age).
The current patch scans all hash entries and checks each timestamp,
which would decrease performance as the cache size grows. I'm hoping to
implement this idea and measure the performance.

And when we want to set the memory size limit, as Tsunakawa-san said,
the LRU list would be suitable.
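
A self-contained sketch of that bookkeeping in plain C (not the actual
catcache structures; pruning by access counter is left out for brevity):

#include <stdlib.h>
#include <time.h>

typedef struct CacheEntry
{
	struct CacheEntry *prev;
	struct CacheEntry *next;
	time_t		last_access;
	long		naccess;
} CacheEntry;

typedef struct LruList
{
	CacheEntry *head;			/* most recently used */
	CacheEntry *tail;			/* least recently used */
} LruList;

static void
lru_unlink(LruList *lru, CacheEntry *e)
{
	if (e->prev)
		e->prev->next = e->next;
	else
		lru->head = e->next;
	if (e->next)
		e->next->prev = e->prev;
	else
		lru->tail = e->prev;
	e->prev = e->next = NULL;
}

/* Called whenever a catcache entry is referenced: a few pointer swaps. */
static void
lru_touch(LruList *lru, CacheEntry *e)
{
	lru_unlink(lru, e);
	e->next = lru->head;
	if (lru->head)
		lru->head->prev = e;
	lru->head = e;
	if (lru->tail == NULL)
		lru->tail = e;
	e->last_access = time(NULL);
	e->naccess++;
}

/*
 * Prune from the tail only, stopping at the first entry that is still
 * young enough, so the scan is bounded by the number of stale entries
 * rather than the whole hash table.
 */
static void
lru_prune(LruList *lru, time_t cutoff)
{
	while (lru->tail && lru->tail->last_access < cutoff)
	{
		CacheEntry *victim = lru->tail;

		lru_unlink(lru, victim);
		free(victim);
	}
}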

Regards,
Takeshi Ideriha




RE: Protect syscache from bloating with negative cache entries

2019-02-04 Thread Tsunakawa, Takayuki
From: br...@momjian.us [mailto:br...@momjian.us]
> On Mon, Feb 4, 2019 at 08:23:39AM +, Tsunakawa, Takayuki wrote:
> > Horiguchi-san, Bruce, all, So, why don't we make
> > syscache_memory_target the upper limit on the total size of all
> > catcaches, and rethink the past LRU management?
> 
> I was going to say that our experience with LRU has been that the
> overhead is not worth the value, but that was in shared resource cases,
> which this is not.

That's good news!  Then, let's proceed with the approach involving LRU, 
Horiguchi-san, Ideriha-san.


Regards
Takayuki Tsunakawa






Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread Michael Paquier
On Tue, Feb 05, 2019 at 01:50:50PM +1300, David Rowley wrote:
> I think I've done all the ones that use \E.  Those \Q ones don't need
> any escaping. I assume that's the ones you've found.  I didn't do an
> exhaustive check though.

Oh, good point.  I didn't know that anything between \Q and \E is
interpreted as literal characters.  Instead of the approach you are
proposing, perhaps it would make sense to extend this way of doing
things then?  For example some tests with CREATE CONVERSION do so.  It
looks much more portable than having to escape every dot.

>> test_pg_dump's 001_base.pl needs also a refresh.
> 
> Yeah, I thought about looking, but I thought it might be a bottomless pit.

I just double-checked this one, and the regex patterns there all
look correct.
--
Michael


signature.asc
Description: PGP signature


Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread David Rowley
On Tue, 5 Feb 2019 at 13:15, Michael Paquier  wrote:
> Some tests are missing the update, and it seems to me that
> tightening things up is a good thing, still we ought to do it
> consistently.  Some places missing the update:
> - ALTER OPERATOR FAMILY
> - ALTER OPERATOR CLASS
> - ALTER SEQUENCE
> - ALTER TABLE (ONLY, partitioned table)
> - BLOB load
> - COMMENT ON
> - COPY
> - INSERT INTO
> - CREATE COLLATION

I think I've done all the ones that use \E.  Those \Q ones don't need
any escaping. I assume that's the ones you've found.  I didn't do an
exhaustive check though.

> test_pg_dump's 001_base.pl needs also a refresh.

Yeah, I thought about looking, but I thought it might be a bottomless pit.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services



Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread Michael Paquier
On Mon, Feb 04, 2019 at 01:12:48PM +0100, Daniel Gustafsson wrote:
> +1 for tightening it up, and the patch looks good to me.
> 
> We may also want to use the + metacharacter instead of * in a few places,
> since the intent is to always match something, where matching nothing
> should be considered an error:
> 
> - qr/^ALTER TEXT SEARCH DICTIONARY dump_test.alt_ts_dict1 OWNER TO .*;/m,
> + qr/^ALTER TEXT SEARCH DICTIONARY dump_test\.alt_ts_dict1 OWNER TO .*;/m,

Some tests are missing the update, and it seems to me that
tightening things up is a good thing, still we ought to do it
consistently.  Some places missing the update:
- ALTER OPERATOR FAMILY
- ALTER OPERATOR CLASS
- ALTER SEQUENCE
- ALTER TABLE (ONLY, partitioned table)
- BLOB load
- COMMENT ON
- COPY
- INSERT INTO
- CREATE COLLATION

test_pg_dump's 001_base.pl needs also a refresh.
--
Michael


signature.asc
Description: PGP signature


Re: propagating replica identity to partitions

2019-02-04 Thread Alvaro Herrera
Actually, index-based replica identities failed in pg_dump: we first
create the index ONLY on the partitioned table (marking it as invalid),
so when we immediately do the ALTER TABLE/REPLICA IDENTITY, it rejects
the invalid index.  If we change the code to allow invalid indexes on
partitioned tables to become replica identities, we hit the problem that
the index didn't exist when processing the partition list.  In order to
fix that, I added a flag so that partitions are allowed not to have the
index, in hopes that the missing indexes are created by subsequent
commands; those indexes should become valid & identity afterwards.

There's a small emerging problem, which is that if you have an invalid
index marked as replica identity, you cannot create any more partitions;
the reason is that we want to propagate the replica identity to the
partition, but the index is not there (since invalid indexes are not
propagated).  I don't think this case is worth supporting; it can be
fixed but requires some work[1].

New patch attached.

[1] In DefineRelation, we obtain the list of indexes to clone by
RelationGetIndexList, which ignores invalid indexes.  In order to
implement this we'd need to include invalid indexes in that list
(possibly by just not using RelationGetIndexList.)

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v3 1/2] index_get_partition

---
 src/backend/catalog/partition.c  | 35 +++
 src/backend/commands/tablecmds.c | 40 +++-
 src/include/catalog/partition.h  |  1 +
 3 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
 }
 
 /*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+	List	   *idxlist = RelationGetIndexList(partition);
+	ListCell   *l;
+
+	foreach(l, idxlist)
+	{
+		Oid			partIdx = lfirst_oid(l);
+		HeapTuple	tup;
+		Form_pg_class classForm;
+		bool		ispartition;
+
+		tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+		if (!tup)
+			elog(ERROR, "cache lookup failed for relation %u", partIdx);
+		classForm = (Form_pg_class) GETSTRUCT(tup);
+		ispartition = classForm->relispartition;
+		ReleaseSysCache(tup);
+		if (!ispartition)
+			continue;
+		if (get_partition_parent(lfirst_oid(l)) == indexId)
+		{
+			list_free(idxlist);
+			return partIdx;
+		}
+	}
+
+	return InvalidOid;
+}
+
+/*
  * map_partition_varattnos - maps varattno of any Vars in expr from the
  * attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
  * may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 static void
 refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
 {
-	Relation	pg_inherits;
-	ScanKeyData key;
-	HeapTuple	tuple;
-	SysScanDesc scan;
+	Oid			existingIdx;
 
-	pg_inherits = table_open(InheritsRelationId, AccessShareLock);
-	ScanKeyInit(&key, Anum_pg_inherits_inhparent,
-BTEqualStrategyNumber, F_OIDEQ,
-ObjectIdGetDatum(RelationGetRelid(parentIdx)));
-	scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
-			  NULL, 1, &key);
-	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-	{
-		Form_pg_inherits inhForm;
-		Oid			tab;
-
-		inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
-		tab = IndexGetRelation(inhForm->inhrelid, false);
-		if (tab == RelationGetRelid(partitionTbl))
-			ereport(ERROR,
-	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-	 errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
-			RelationGetRelationName(partIdx),
-			RelationGetRelationName(parentIdx)),
-	 errdetail("Another index is already attached for partition \"%s\".",
-			   RelationGetRelationName(partitionTbl;
-	}
-
-	systable_endscan(scan);
-	table_close(pg_inherits, AccessShareLock);
+	existingIdx = index_get_partition(partitionTbl,
+	  RelationGetRelid(parentIdx));
+	if (OidIsValid(existingIdx))
+		ereport(ERROR,
+(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+		

Re: [HACKERS] Removing [Merge]Append nodes which contain a single subpath

2019-02-04 Thread David Rowley
On Thu, 31 Jan 2019 at 17:22, David Rowley  wrote:
> I've also attached a rebased patch.

Rebased again.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v12-0001-Forgo-generating-single-subpath-Append-and-Merge.patch
Description: Binary data


Re: Protect syscache from bloating with negative cache entries

2019-02-04 Thread br...@momjian.us
On Mon, Feb 4, 2019 at 08:23:39AM +, Tsunakawa, Takayuki wrote:
> Horiguchi-san, Bruce, all, So, why don't we make
> syscache_memory_target the upper limit on the total size of all
> catcaches, and rethink the past LRU management?

I was going to say that our experience with LRU has been that the
overhead is not worth the value, but that was in shared resource cases,
which this is not.

-- 
  Bruce Momjian  http://momjian.us
  EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+  Ancient Roman grave inscription +



Re: Pre-v11 appearances of the word "procedure" in v11 docs

2019-02-04 Thread Peter Eisentraut
Some things from this thread were left for post-11; see attached patch.

Specifically, this changes pg_dump and ruleutils.c to use the FUNCTION
keyword instead of PROCEDURE in trigger and event trigger definitions,
which was postponed because of the required catversion bump.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 787c5029b82c826e69e5522a12c68ddfe253f01a Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 4 Feb 2019 22:38:54 +0100
Subject: [PATCH] Use EXECUTE FUNCTION syntax for triggers more

Change pg_dump and ruleutils.c to use the FUNCTION keyword instead of
PROCEDURE in trigger and event trigger definitions.

This completes the pieces of the transition started in
0a63f996e018ac508c858e87fa39cc254a5db49f that were kept out of
PostgreSQL 11 because of the required catversion change.
---
 src/backend/catalog/information_schema.sql |  4 +-
 src/backend/utils/adt/ruleutils.c  |  2 +-
 src/bin/pg_dump/pg_dump.c  |  4 +-
 src/bin/pg_dump/t/002_pg_dump.pl   |  8 ++--
 src/include/catalog/catversion.h   |  2 +-
 src/test/regress/expected/triggers.out | 56 +++---
 6 files changed, 38 insertions(+), 38 deletions(-)

diff --git a/src/backend/catalog/information_schema.sql 
b/src/backend/catalog/information_schema.sql
index b27ff5fa35..94e482596f 100644
--- a/src/backend/catalog/information_schema.sql
+++ b/src/backend/catalog/information_schema.sql
@@ -2094,12 +2094,12 @@ CREATE VIEW triggers AS
  AS cardinal_number) AS action_order,
CAST(
  CASE WHEN pg_has_role(c.relowner, 'USAGE')
-   THEN (regexp_match(pg_get_triggerdef(t.oid), E'.{35,} WHEN 
\\((.+)\\) EXECUTE PROCEDURE'))[1]
+   THEN (regexp_match(pg_get_triggerdef(t.oid), E'.{35,} WHEN 
\\((.+)\\) EXECUTE FUNCTION'))[1]
ELSE null END
  AS character_data) AS action_condition,
CAST(
  substring(pg_get_triggerdef(t.oid) from
-   position('EXECUTE PROCEDURE' in 
substring(pg_get_triggerdef(t.oid) from 48)) + 47)
+   position('EXECUTE FUNCTION' in 
substring(pg_get_triggerdef(t.oid) from 48)) + 47)
  AS character_data) AS action_statement,
CAST(
  -- hard-wired reference to TRIGGER_TYPE_ROW
diff --git a/src/backend/utils/adt/ruleutils.c 
b/src/backend/utils/adt/ruleutils.c
index 17a28c2651..e1fbe494d5 100644
--- a/src/backend/utils/adt/ruleutils.c
+++ b/src/backend/utils/adt/ruleutils.c
@@ -1044,7 +1044,7 @@ pg_get_triggerdef_worker(Oid trigid, bool pretty)
appendStringInfoString(&buf, ") ");
}
 
-   appendStringInfo(&buf, "EXECUTE PROCEDURE %s(",
+   appendStringInfo(&buf, "EXECUTE FUNCTION %s(",
 
generate_function_name(trigrec->tgfoid, 0,

NIL, argtypes,

false, NULL, EXPR_KIND_NONE));
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3a89ad846a..b6030f56ae 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -17086,7 +17086,7 @@ dumpTrigger(Archive *fout, TriggerInfo *tginfo)
appendPQExpBufferStr(query, "FOR EACH STATEMENT\n   
 ");
 
/* regproc output is already sufficiently quoted */
-   appendPQExpBuffer(query, "EXECUTE PROCEDURE %s(",
+   appendPQExpBuffer(query, "EXECUTE FUNCTION %s(",
  tginfo->tgfname);
 
tgargs = (char *) PQunescapeBytea((unsigned char *) 
tginfo->tgargs,
@@ -17200,7 +17200,7 @@ dumpEventTrigger(Archive *fout, EventTriggerInfo 
*evtinfo)
appendPQExpBufferChar(query, ')');
}
 
-   appendPQExpBufferStr(query, "\n   EXECUTE PROCEDURE ");
+   appendPQExpBufferStr(query, "\n   EXECUTE FUNCTION ");
appendPQExpBufferStr(query, evtinfo->evtfname);
appendPQExpBufferStr(query, "();\n");
 
diff --git a/src/bin/pg_dump/t/002_pg_dump.pl b/src/bin/pg_dump/t/002_pg_dump.pl
index 245fcbf5ce..4ff0c5645b 100644
--- a/src/bin/pg_dump/t/002_pg_dump.pl
+++ b/src/bin/pg_dump/t/002_pg_dump.pl
@@ -1540,11 +1540,11 @@
create_order => 33,
create_sql   => 'CREATE EVENT TRIGGER test_event_trigger
   ON ddl_command_start
-  EXECUTE PROCEDURE 
dump_test.event_trigger_func();',
+  EXECUTE FUNCTION 
dump_test.event_trigger_func();',
regexp => qr/^
\QCREATE EVENT TRIGGER test_event_trigger \E
\QON ddl_command_start\E
-   

Re: New vacuum option to do only freezing

2019-02-04 Thread Bossart, Nathan
On 2/3/19, 1:48 PM, "Masahiko Sawada"  wrote:
> On Sat, Feb 2, 2019 at 7:00 PM Alvaro Herrera  
> wrote:
>>
>> On 2019-Feb-01, Bossart, Nathan wrote:
>>
>> > IMHO we could document this feature at a slightly higher level without
>> > leaving out any really important user-facing behavior.  Here's a quick
>> > attempt to show what I am thinking:
>> >
>> > With this option, VACUUM skips all index cleanup behavior and
>> > only marks tuples as "dead" without reclaiming the storage.
>> > While this can help reclaim transaction IDs faster to avoid
>> > transaction ID wraparound (see Section 24.1.5), it will not
>> > reduce bloat.
>>
>> Hmm ... don't we compact out the storage for dead tuples?  If we do (and
>> I think we should) then this wording is not entirely correct.
>
> Yeah, we remove the tuple and leave the dead ItemId. So we actually
> reclaim almost all of the tuple storage.

Ah, yes.  I was wrong here.  Thank you for clarifying.

> Attached the updated patch and the patch for vacuumdb.

Thanks!  I am hoping to take a deeper look at this patch in the next
couple of days.

Nathan



Re: propagating replica identity to partitions

2019-02-04 Thread Alvaro Herrera
On 2019-Feb-04, Alvaro Herrera wrote:

> I'll now look more carefully at the cases involving indexes; thus far I
> was looking at the ones using FULL.  Those seem to work as intended.

Yeah, that didn't work so well -- it was trying to propagate the command
verbatim to each partition, and obviously the index names did not match.
So this subcommand has to reimplement the recursion internally, as in
the attached.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
>From f205a06c9aa6061dd1bac01ef63017aa6811b5f9 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Tue, 22 Jan 2019 18:00:31 -0300
Subject: [PATCH v2 1/2] index_get_partition

---
 src/backend/catalog/partition.c  | 35 +++
 src/backend/commands/tablecmds.c | 40 +++-
 src/include/catalog/partition.h  |  1 +
 3 files changed, 47 insertions(+), 29 deletions(-)

diff --git a/src/backend/catalog/partition.c b/src/backend/catalog/partition.c
index 0d3bc3a2c7..66012bb28a 100644
--- a/src/backend/catalog/partition.c
+++ b/src/backend/catalog/partition.c
@@ -146,6 +146,41 @@ get_partition_ancestors_worker(Relation inhRel, Oid relid, List **ancestors)
 }
 
 /*
+ * Return the OID of the index, in the given partition, that is a child of the
+ * given index or InvalidOid if there isn't one.
+ */
+Oid
+index_get_partition(Relation partition, Oid indexId)
+{
+	List	   *idxlist = RelationGetIndexList(partition);
+	ListCell   *l;
+
+	foreach(l, idxlist)
+	{
+		Oid			partIdx = lfirst_oid(l);
+		HeapTuple	tup;
+		Form_pg_class classForm;
+		bool		ispartition;
+
+		tup = SearchSysCache1(RELOID, ObjectIdGetDatum(partIdx));
+		if (!tup)
+			elog(ERROR, "cache lookup failed for relation %u", partIdx);
+		classForm = (Form_pg_class) GETSTRUCT(tup);
+		ispartition = classForm->relispartition;
+		ReleaseSysCache(tup);
+		if (!ispartition)
+			continue;
+		if (get_partition_parent(lfirst_oid(l)) == indexId)
+		{
+			list_free(idxlist);
+			return partIdx;
+		}
+	}
+
+	return InvalidOid;
+}
+
+/*
  * map_partition_varattnos - maps varattno of any Vars in expr from the
  * attno's of 'from_rel' to the attno's of 'to_rel' partition, each of which
  * may be either a leaf partition or a partitioned table, but both of which
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 35a9ade059..877bac506f 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -15427,36 +15427,18 @@ ATExecAttachPartitionIdx(List **wqueue, Relation parentIdx, RangeVar *name)
 static void
 refuseDupeIndexAttach(Relation parentIdx, Relation partIdx, Relation partitionTbl)
 {
-	Relation	pg_inherits;
-	ScanKeyData key;
-	HeapTuple	tuple;
-	SysScanDesc scan;
+	Oid			existingIdx;
 
-	pg_inherits = table_open(InheritsRelationId, AccessShareLock);
-	ScanKeyInit(&key, Anum_pg_inherits_inhparent,
-BTEqualStrategyNumber, F_OIDEQ,
-ObjectIdGetDatum(RelationGetRelid(parentIdx)));
-	scan = systable_beginscan(pg_inherits, InheritsParentIndexId, true,
-			  NULL, 1, &key);
-	while (HeapTupleIsValid(tuple = systable_getnext(scan)))
-	{
-		Form_pg_inherits inhForm;
-		Oid			tab;
-
-		inhForm = (Form_pg_inherits) GETSTRUCT(tuple);
-		tab = IndexGetRelation(inhForm->inhrelid, false);
-		if (tab == RelationGetRelid(partitionTbl))
-			ereport(ERROR,
-	(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
-	 errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
-			RelationGetRelationName(partIdx),
-			RelationGetRelationName(parentIdx)),
-	 errdetail("Another index is already attached for partition \"%s\".",
-			   RelationGetRelationName(partitionTbl;
-	}
-
-	systable_endscan(scan);
-	table_close(pg_inherits, AccessShareLock);
+	existingIdx = index_get_partition(partitionTbl,
+	  RelationGetRelid(parentIdx));
+	if (OidIsValid(existingIdx))
+		ereport(ERROR,
+(errcode(ERRCODE_OBJECT_NOT_IN_PREREQUISITE_STATE),
+ errmsg("cannot attach index \"%s\" as a partition of index \"%s\"",
+		RelationGetRelationName(partIdx),
+		RelationGetRelationName(parentIdx)),
+ errdetail("Another index is already attached for partition \"%s\".",
+		   RelationGetRelationName(partitionTbl;
 }
 
 /*
diff --git a/src/include/catalog/partition.h b/src/include/catalog/partition.h
index 5685d2fd57..7f0b198bcb 100644
--- a/src/include/catalog/partition.h
+++ b/src/include/catalog/partition.h
@@ -35,6 +35,7 @@ typedef struct PartitionDescData
 
 extern Oid	get_partition_parent(Oid relid);
 extern List *get_partition_ancestors(Oid relid);
+extern Oid	index_get_partition(Relation partition, Oid indexId);
 extern List *map_partition_varattnos(List *expr, int fromrel_varno,
 		Relation to_rel, Relation from_rel,
 		bool *found_whole_row);
-- 
2.11.0

>From 688b801b8799ecd9827ee203bfb690536ac84ff4 Mon Sep 17 00:00:00 2001
From: Alvaro Herrera 
Date: Mon, 4 Feb 

Re: proposal: plpgsql pragma statement

2019-02-04 Thread Pavel Stehule
Hi,

po 4. 2. 2019 v 6:10 odesílatel Michael Paquier 
napsal:

> On Fri, Jan 04, 2019 at 02:17:49PM +0100, Pavel Stehule wrote:
> > It means to write own lexer and preparse source code before I start
> > checking.
> >
> > I think so block level PRAGMA is significantly better solution
>
> Please note that the latest patch is failing to apply, so I have moved
> the patch to next CF, waiting on author.
>

Attached is the rebased patch.

Thank you for checking.

Regards

Pavel

--
> Michael
>
diff --git a/doc/src/sgml/plpgsql.sgml b/doc/src/sgml/plpgsql.sgml
index f8c6435c50..54bfb3f137 100644
--- a/doc/src/sgml/plpgsql.sgml
+++ b/doc/src/sgml/plpgsql.sgml
@@ -814,6 +814,49 @@ $$ LANGUAGE plpgsql;
 happen in a plain SQL command.

   
+
+  
+   Block level PRAGMA
+
+   
+A PL/pgSQL function supports pragma on block
+level. Pragma is a compiler directive, that can be used by
+PL/pgSQL compiler, or by any extensions that
+can work with PL/pgSQL code.
+   
+
+
+PRAGMA name;
+PRAGMA name (  argument_name =  value );
+
+
+   
+The pragma can be used for plpgsql_check
+enabling/disabling code checking or for storing additional information:
+
+
+DECLARE
+  PRAGMA plpgsql_check(off);
+BEGIN
+  -- code inside block will not be checked by plpgsql_check
+  ...
+
+
+DECLARE
+  -- force routine volatility detection
+  PRAGMA plpgsql_check(volatility => volatile);
+  PRAGMA plpgsql_check(temporary_table => 'tmp_tab', '(a int, b int, c int)');
+BEGIN
+  ...
+
+
+More details are described in related extension's description.
+   
+
+   
+Unknown pragma is ignored.
+   
+  
   
 
   
diff --git a/src/pl/plpgsql/src/Makefile b/src/pl/plpgsql/src/Makefile
index cc1c2613d3..2aded819f9 100644
--- a/src/pl/plpgsql/src/Makefile
+++ b/src/pl/plpgsql/src/Makefile
@@ -27,7 +27,8 @@ DATA = plpgsql.control plpgsql--1.0.sql plpgsql--unpackaged--1.0.sql
 REGRESS_OPTS = --dbname=$(PL_TESTDB)
 
 REGRESS = plpgsql_call plpgsql_control plpgsql_domain plpgsql_record \
-	plpgsql_cache plpgsql_transaction plpgsql_trigger plpgsql_varprops
+	plpgsql_cache plpgsql_transaction plpgsql_trigger plpgsql_varprops \
+	plpgsql_pragma
 
 # where to find gen_keywordlist.pl and subsidiary files
 TOOLSDIR = $(top_srcdir)/src/tools
diff --git a/src/pl/plpgsql/src/expected/plpgsql_pragma.out b/src/pl/plpgsql/src/expected/plpgsql_pragma.out
new file mode 100644
index 00..ffe5c7664a
--- /dev/null
+++ b/src/pl/plpgsql/src/expected/plpgsql_pragma.out
@@ -0,0 +1,11 @@
+do $$
+DECLARE
+  var int;
+  PRAGMA xxx;
+  PRAGMA do;
+  PRAGMA var; -- name can be any identifier
+  PRAGMA xxx(10, 10.1, '', "a"., off, on); -- supported types
+  PRAGMA xxx(label => value);
+BEGIN
+END;
+$$;
diff --git a/src/pl/plpgsql/src/pl_gram.y b/src/pl/plpgsql/src/pl_gram.y
index 03f7cdce8c..33e6929af9 100644
--- a/src/pl/plpgsql/src/pl_gram.y
+++ b/src/pl/plpgsql/src/pl_gram.y
@@ -111,6 +111,11 @@ static	PLpgSQL_expr	*read_cursor_args(PLpgSQL_var *cursor,
 static	List			*read_raise_options(void);
 static	void			check_raise_parameters(PLpgSQL_stmt_raise *stmt);
 
+/*
+ * local variable for collection pragmas inside one declare block
+ */
+static List		   *pragmas;
+
 %}
 
 %expect 0
@@ -146,6 +151,7 @@ static	void			check_raise_parameters(PLpgSQL_stmt_raise *stmt);
 			char *label;
 			int  n_initvars;
 			int  *initvarnos;
+			List *pragmas;
 		}		declhdr;
 		struct
 		{
@@ -166,6 +172,8 @@ static	void			check_raise_parameters(PLpgSQL_stmt_raise *stmt);
 		PLpgSQL_diag_item		*diagitem;
 		PLpgSQL_stmt_fetch		*fetch;
 		PLpgSQL_case_when		*casewhen;
+		PLpgSQL_pragma			*pragma;
+		PLpgSQL_pragma_arg		*pragma_arg;
 }
 
 %type  decl_sect
@@ -221,6 +229,9 @@ static	void			check_raise_parameters(PLpgSQL_stmt_raise *stmt);
 
 %type 	unreserved_keyword
 
+%type 	pragma_args
+%type  pragma_arg
+%type  pragma_val
 
 /*
  * Basic non-keyword token types.  These are hard-wired into the core lexer.
@@ -321,6 +332,7 @@ static	void			check_raise_parameters(PLpgSQL_stmt_raise *stmt);
 %token 	K_PG_EXCEPTION_CONTEXT
 %token 	K_PG_EXCEPTION_DETAIL
 %token 	K_PG_EXCEPTION_HINT
+%token 	K_PRAGMA
 %token 	K_PRINT_STRICT_PARAMS
 %token 	K_PRIOR
 %token 	K_QUERY
@@ -418,6 +430,7 @@ pl_block		: decl_sect K_BEGIN proc_sect exception_sect K_END opt_label
 		new->label		= $1.label;
 		new->n_initvars = $1.n_initvars;
 		new->initvarnos = $1.initvarnos;
+		new->pragmas	= $1.pragmas;
 		new->body		= $3;
 		new->exceptions	= $4;
 
@@ -436,6 +449,7 @@ decl_sect		: opt_block_label
 		$$.label	  = $1;
 		$$.n_initvars = 0;
 		$$.initvarnos = NULL;
+		$$.pragmas	  = NIL;
 	}
 | opt_block_label decl_start
 	{
@@ -443,6 +457,7 @@ decl_sect		: opt_block_label
 		$$.label	  = $1;
 		$$.n_initvars = 0;
 		$$.initvarnos = NULL;
+		$$.pragmas	  = NIL;
 	}
 | opt_block_label decl_start decl_stmts
 	{
@@ -450,6 +465,9 @@ decl_sect		: opt_block_label
 		$$.label	  = $1;
 		/* Remember 

propagating replica identity to partitions

2019-02-04 Thread Alvaro Herrera
Hello

If you do ALTER TABLE .. REPLICA IDENTITY to a partitioned table, the
command operates on the parent table itself and does not propagate to
partitions.  Why is this?  Maybe not recursing was the right call when
we only had regular inheritance (back in 9.4), but since partitioned
tables got introduced, I think it is completely the other way around:
not recursing is a usability fail.

At the same time, I think that psql failing to display the replica
identity for partitioned tables is just an oversight and not designed
in.

I propose to change the behavior to:

1. When replica identity is changed on a partitioned table, all partitions
   are updated also.  Do we need to care about regular inheritance?
   My inclination is not to touch those; this'd become the first case
   in ATPrepCmd that recurses on partitioning but not inheritance.

2. When a partition is created, the replica identity is set to the
   same as the parent table's.  If it's of type index, figure out the
   corresponding index in the partition and set that.  If the index
   doesn't exist, raise an error (i.e. replica identity cannot be set to
   an index until it has propagated to all children).  A sketch of this
   step follows the list.

3. psql should display replica identity for partitioned tables.
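
For point 2, a rough sketch; index_get_partition() is the lookup helper
proposed in the patches elsewhere on this thread, the catalog updates are
elided, and none of this is actual patch code:

#include "postgres.h"
#include "catalog/partition.h"
#include "catalog/pg_class.h"
#include "utils/rel.h"
#include "utils/relcache.h"

static void
propagate_replica_identity(Relation parent, Relation partition)
{
	if (parent->rd_rel->relreplident == REPLICA_IDENTITY_INDEX)
	{
		Oid			parentIdx = RelationGetReplicaIndex(parent);
		Oid			childIdx = index_get_partition(partition, parentIdx);

		if (!OidIsValid(childIdx))
			ereport(ERROR,
					(errmsg("replica identity index has not been propagated to partition \"%s\"",
							RelationGetRelationName(partition))));

		/*
		 * ... mark childIdx as indisreplident and set the partition's
		 * relreplident to REPLICA_IDENTITY_INDEX ...
		 */
	}
	else
	{
		/* FULL, DEFAULT and NOTHING can simply be copied over. */
	}
}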

Thoughts?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Too rigorous assert in reorderbuffer.c

2019-02-04 Thread Alexey Kondratov

Hi,

On 31.01.2019 9:21, Arseny Sher wrote:

My colleague Alexander Lakhin has noticed an assertion failure in
reorderbuffer.c:1330. Here is a simple snippet reproducing it:

SELECT 'init' FROM pg_create_logical_replication_slot('regression_slot', 
'test_decoding');

create table t(k int);
begin;
savepoint a;
alter table t alter column k type text;
rollback to savepoint a;
alter table t alter column k type bigint;
commit;

SELECT data FROM pg_logical_slot_get_changes('regression_slot', NULL, NULL, 
'include-xids', '0', 'skip-empty-xacts', '1');


I just want to add that I accidentally discovered the same issue while 
testing Tomas's large-transaction streaming patch [1], and had to remove 
this assert to get things working. I thought it was somehow related to 
the streaming mode and did not test the same query on its own.



[1] 
https://www.postgresql.org/message-id/76fc440e-91c3-afe2-b78a-987205b3c758%402ndquadrant.com



Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company




Re: Log a sample of transactions

2019-02-04 Thread Adrien NAYRAT

On 2/4/19 3:26 AM, Michael Paquier wrote:

These would take time to execute, even if you need to measure one
second, and may be hard to make stable on slow machines.



Could it be acceptable if I set log_min_duration_statement = 0?


Moved to next CF.


Thanks



Re: Feature: temporary materialized views

2019-02-04 Thread Andreas Karlsson

On 2/4/19 7:09 AM, Michael Paquier wrote:

On Tue, Jan 22, 2019 at 03:10:17AM +0100, Andreas Karlsson wrote:

On 1/21/19 3:31 AM, Andreas Karlsson wrote:

Here is a a stab at refactoring this so the object creation does not
happen in a callback.


Rebased my patch on top of Andres's pluggable storage patches. Plus some
minor style changes.


Taking a note to look at this refactoring bit, which is different from
the temp matview part.  Moved to next CF for now.


Should I submit it as a separate CF entry or is it easiest if my 
refactoring and Mi Tar's feature are reviewed together?


Andreas



Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions

2019-02-04 Thread Alexey Kondratov

Hi Tomas,

On 14.01.2019 21:23, Tomas Vondra wrote:

Attached is an updated patch series, merging fixes and changes to TAP
tests proposed by Alexey. I've merged the fixes into the appropriate
patches, and I've kept the TAP changes / new tests as separate patches
towards the end of the series.


I had problems applying this patch along with the 2PC streaming one to 
the current master, but everything applied well on 97c39498e5, and the 
regression tests pass. What I personally do not like in the current TAP 
test set is that you have added "WITH (streaming=on)" to all tests, 
including the old non-streaming ones. It makes it unclear which 
mechanism is being tested: streaming, though those transactions probably 
do not hit the memory limit, so it depends on the default server 
parameters; or non-streaming, but then what is the need for 
(streaming=on)? I would prefer to add (streaming=on) only to the new 
tests, where it is clearly necessary.



I'm a bit unhappy with two aspects of the current patch series:

1) We now track schema changes in two ways - using the pre-existing
schema_sent flag in RelationSyncEntry, and the (newly added) flag in
ReorderBuffer. While those options are used for regular vs. streamed
transactions, fundamentally it's the same thing and so having two
competing ways seems like a bad idea. Not sure what's the best way to
resolve this, though.


Yes, sure; when I found problems with streaming of extensive DDL, I 
added the new flag in the simplest way, and it worked. Now, the old 
schema_sent flag is per relation, while the new one, is_schema_sent, is 
per top-level transaction. If I get it correctly, the former seems to be 
more thrifty, since a new schema is sent only if we are streaming a 
change for a relation whose schema is outdated. In contrast, in the 
latter case we will send a new schema even if there are no new changes 
belonging to that relation.


I guess it would be better to stick to the old behavior. I will try to 
investigate how to better use it in the streaming mode as well.



2) We've removed quite a few asserts, particularly ensuring sanity of
cmin/cmax values. To some extent that's expected, because by allowing
decoding of in-progress transactions relaxes some of those rules. But
I'd be much happier if some of those asserts could be reinstated, even
if only in a weaker form.



Asserts have been removed from two places: (1) 
HeapTupleSatisfiesHistoricMVCC, which seems inevitable, since we are 
touching the essence of the MVCC visibility rules when trying to decode 
an in-progress transaction, and (2) ReorderBufferBuildTupleCidHash, 
which is probably not directly related to the topic of the ongoing 
patch, since Arseny Sher recently faced the same issue with simple 
repetitive DDL decoding [1].


There are not many, but I agree that replacing them with some softer 
asserts would be better than just removing them, especially for point 
(1).



[1] https://www.postgresql.org/message-id/flat/874l9p8hyw.fsf%40ars-thinkpad


Regards

--
Alexey Kondratov

Postgres Professional https://www.postgrespro.com
Russian Postgres Company




Re: Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread Daniel Gustafsson
> On 4 Feb 2019, at 11:54, David Rowley  wrote:
> 
> There are a few regexes in pg_dump's tap tests that are neglecting to
> escape the dot in "schema.table" type expressions. This could result
> in the test passing when it shouldn't.  It seems worth putting that
> right.

+1 for tightening it up, and the patch looks good to me.

We may also want to use the + metacharacter instead of * in a few places, since
the intent is to always match something; matching nothing should be
considered an error:

- qr/^ALTER TEXT SEARCH DICTIONARY dump_test.alt_ts_dict1 OWNER TO .*;/m,
+ qr/^ALTER TEXT SEARCH DICTIONARY dump_test\.alt_ts_dict1 OWNER TO .*;/m,

cheers ./daniel


Re: Prevent extension creation in temporary schemas

2019-02-04 Thread Chris Travers
This could probably use a quick note in the docs.

Tighten up a few overly lax regexes in pg_dump's tap tests

2019-02-04 Thread David Rowley
There are a few regexes in pg_dump's tap tests that are neglecting to
escape the dot in "schema.table" type expressions. This could result
in the test passing when it shouldn't.  It seems worth putting that
right.

Patch attached.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


tighten_up_pg_dump_tap_tests.patch
Description: Binary data


Allow some recovery parameters to be changed with reload

2019-02-04 Thread Peter Eisentraut
I think the recovery parameters

archive_cleanup_command
promote_trigger_file
recovery_end_command
recovery_min_apply_delay

can be changed from PGC_POSTMASTER to PGC_SIGHUP without any further
complications (unlike for example primary_conninfo, which is being
discussed elsewhere).

See attached patch.

-- 
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
From 6e0777f4799bce027aa339629539cc101ed0f862 Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 4 Feb 2019 09:28:17 +0100
Subject: [PATCH] Allow some recovery parameters to be changed with reload

Change

archive_cleanup_command
promote_trigger_file
recovery_end_command
recovery_min_apply_delay

from PGC_POSTMASTER to PGC_SIGHUP.  This did not require any further
changes.
---
 doc/src/sgml/config.sgml  | 21 +++
 src/backend/utils/misc/guc.c  |  8 +++
 src/backend/utils/misc/postgresql.conf.sample |  4 
 3 files changed, 21 insertions(+), 12 deletions(-)

diff --git a/doc/src/sgml/config.sgml b/doc/src/sgml/config.sgml
index 9b7a7388d5..7e208a4b81 100644
--- a/doc/src/sgml/config.sgml
+++ b/doc/src/sgml/config.sgml
@@ -3081,8 +3081,7 @@ Archive Recovery
 
  This section describes the settings that apply only for the duration of
  the recovery.  They must be reset for any subsequent recovery you wish to
- perform.  They can only be set at server start and cannot be changed once
- recovery has begun.
+ perform.
 
 
 
@@ -3161,6 +3160,10 @@ Archive Recovery
 database server shutdown) or an error by the shell (such as command
 not found), then recovery will abort and the server will not start up.

+
+   
+This parameter can only be set at server start.
+   
   
  
 
@@ -3202,6 +3205,10 @@ Archive Recovery
 terminated by a signal or an error by the shell (such as command not
 found), a fatal error will be raised.

+   
+This parameter can only be set in the postgresql.conf
+file or on the server command line.
+   
   
  
 
@@ -3227,6 +3234,10 @@ Archive Recovery
 signal or an error by the shell (such as command not found), the
 database will not proceed with startup.

+   
+This parameter can only be set in the postgresql.conf
+file or on the server command line.
+   
   
  
 
@@ -3863,7 +3874,8 @@ Standby Servers
   standby.  Even if this value is not set, you can still promote
   the standby using pg_ctl promote or calling
   pg_promote.
-  This parameter can only be set at server start.
+  This parameter can only be set in the postgresql.conf
+  file or on the server command line.
  
 

@@ -4117,7 +4129,8 @@ Standby Servers
 


-This parameter can only be set at server start.
+This parameter can only be set in the postgresql.conf
+file or on the server command line.

   
  
diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 8681ada33a..ea5444c6f1 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -2047,7 +2047,7 @@ static struct config_int ConfigureNamesInt[] =
},
 
{
-   {"recovery_min_apply_delay", PGC_POSTMASTER, 
REPLICATION_STANDBY,
+   {"recovery_min_apply_delay", PGC_SIGHUP, REPLICATION_STANDBY,
gettext_noop("Sets the minimum delay for applying 
changes during recovery."),
NULL,
GUC_UNIT_MS
@@ -3398,7 +3398,7 @@ static struct config_string ConfigureNamesString[] =
},
 
{
-   {"archive_cleanup_command", PGC_POSTMASTER, 
WAL_ARCHIVE_RECOVERY,
+   {"archive_cleanup_command", PGC_SIGHUP, WAL_ARCHIVE_RECOVERY,
gettext_noop("Sets the shell command that will be 
executed at every restart point."),
NULL
},
@@ -3408,7 +3408,7 @@ static struct config_string ConfigureNamesString[] =
},
 
{
-   {"recovery_end_command", PGC_POSTMASTER, WAL_ARCHIVE_RECOVERY,
+   {"recovery_end_command", PGC_SIGHUP, WAL_ARCHIVE_RECOVERY,
gettext_noop("Sets the shell command that will be 
executed once at the end of recovery."),
NULL
},
@@ -3474,7 +3474,7 @@ static struct config_string ConfigureNamesString[] =
},
 
{
-   {"promote_trigger_file", PGC_POSTMASTER, REPLICATION_STANDBY,
+   {"promote_trigger_file", PGC_SIGHUP, REPLICATION_STANDBY,
gettext_noop("Specifies a file name whose presence ends 
recovery in the standby."),
NULL
 

Re: FETCH FIRST clause WITH TIES option

2019-02-04 Thread Surafel Temesgen
On Mon, Feb 4, 2019 at 8:29 AM Michael Paquier  wrote:

>
> This patch needs a rebase because of conflicts done recently for
> pluggable storage, so moved to next CF, waiting on author.
> --
>

Thank you. Rebased against the current master.

regards
Surafel
diff --git a/doc/src/sgml/ref/select.sgml b/doc/src/sgml/ref/select.sgml
index 4db8142afa..ed7249daeb 100644
--- a/doc/src/sgml/ref/select.sgml
+++ b/doc/src/sgml/ref/select.sgml
@@ -44,7 +44,7 @@ SELECT [ ALL | DISTINCT [ ON ( expression
[ ORDER BY expression [ ASC | DESC | USING operator ] [ NULLS { FIRST | LAST } ] [, ...] ]
 [ LIMIT { count | ALL } ]
 [ OFFSET start [ ROW | ROWS ] ]
-[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY ]
+[ FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } [ ONLY | WITH TIES ] ]
 [ FOR { UPDATE | NO KEY UPDATE | SHARE | KEY SHARE } [ OF table_name [, ...] ] [ NOWAIT | SKIP LOCKED ] [...] ]
 
 where from_item can be one of:
@@ -1397,7 +1397,7 @@ OFFSET start
 which PostgreSQL also supports.  It is:
 
 OFFSET start { ROW | ROWS }
-FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } ONLY
+FETCH { FIRST | NEXT } [ count ] { ROW | ROWS } [ ONLY | WITH TIES ]
 
 In this syntax, the start
 or count value is required by
@@ -1407,7 +1407,10 @@ FETCH { FIRST | NEXT } [ count ] {
 ambiguity.
 If count is
 omitted in a FETCH clause, it defaults to 1.
-ROW
+ROW.
+The WITH TIES option is used to also return any additional rows
+that tie for the last place in the result set according to the ORDER BY
+clause (an ORDER BY clause must be specified in this case).
 and ROWS as well as FIRST
 and NEXT are noise words that don't influence
 the effects of these clauses.
diff --git a/src/backend/executor/nodeLimit.c b/src/backend/executor/nodeLimit.c
index baa669abe8..5e7790c0d6 100644
--- a/src/backend/executor/nodeLimit.c
+++ b/src/backend/executor/nodeLimit.c
@@ -41,6 +41,7 @@ static TupleTableSlot *			/* return: a tuple or NULL */
 ExecLimit(PlanState *pstate)
 {
 	LimitState *node = castNode(LimitState, pstate);
+	ExprContext *econtext = node->ps.ps_ExprContext;
 	ScanDirection direction;
 	TupleTableSlot *slot;
 	PlanState  *outerPlan;
@@ -131,7 +132,8 @@ ExecLimit(PlanState *pstate)
  * the state machine state to record having done so.
  */
 if (!node->noCount &&
-	node->position - node->offset >= node->count)
+	node->position - node->offset >= node->count &&
+	node->limitOption == WITH_ONLY)
 {
 	node->lstate = LIMIT_WINDOWEND;
 
@@ -145,17 +147,64 @@ ExecLimit(PlanState *pstate)
 	return NULL;
 }
 
-/*
- * Get next tuple from subplan, if any.
- */
-slot = ExecProcNode(outerPlan);
-if (TupIsNull(slot))
+else if (!node->noCount &&
+		 node->position - node->offset >= node->count &&
+		 node->limitOption == WITH_TIES)
 {
-	node->lstate = LIMIT_SUBPLANEOF;
-	return NULL;
+	/*
+	 * Get next tuple from subplan, if any.
+	 */
+	slot = ExecProcNode(outerPlan);
+	if (TupIsNull(slot))
+	{
+		node->lstate = LIMIT_SUBPLANEOF;
+		return NULL;
+	}
+	/*
+	 * Test if the new tuple and the last tuple match.
+	 * If so we return the tuple.
+	 */
+	econtext->ecxt_innertuple = slot;
+	econtext->ecxt_outertuple = node->last_slot;
+	if (ExecQualAndReset(node->eqfunction, econtext))
+	{
+		ExecCopySlot(node->last_slot, slot);
+		node->subSlot = slot;
+		node->position++;
+	}
+	else
+	{
+		node->lstate = LIMIT_WINDOWEND;
+
+		/*
+		* If we know we won't need to back up, we can release
+		* resources at this point.
+		*/
+		if (!(node->ps.state->es_top_eflags & EXEC_FLAG_BACKWARD))
+			(void) ExecShutdownNode(outerPlan);
+
+		return NULL;
+	}
+
+}
+else
+{
+	/*
+	 * Get next tuple from subplan, if any.
+	 */
+	slot = ExecProcNode(outerPlan);
+	if (TupIsNull(slot))
+	{
+		node->lstate = LIMIT_SUBPLANEOF;
+		return NULL;
+	}
+	if (node->limitOption == WITH_TIES)
+	{
+		ExecCopySlot(node->last_slot, slot);
+	}
+	node->subSlot = slot;
+	node->position++;
 }
-node->subSlot = slot;
-node->position++;
 			}
 			else
 			{
@@ -311,7 +360,8 @@ recompute_limits(LimitState *node)
 	 * must update the child node anyway, in case this is a rescan and the
 	 * previous time we got a different result.
 	 */
-	ExecSetTupleBound(compute_tuples_needed(node), outerPlanState(node));
+	if(node->limitOption == WITH_ONLY)
+		ExecSetTupleBound(compute_tuples_needed(node), outerPlanState(node));
 }
 
 /*
@@ -374,6 +424,7 @@ ExecInitLimit(Limit *node, EState *estate, int eflags)
 		   (PlanState *) limitstate);
 	limitstate->limitCount = ExecInitExpr((Expr *) node->limitCount,
 		  (PlanState *) limitstate);
+	limitstate->limitOption = node->limitOption;
 
 	/*
 	 * Initialize result type.
@@ -390,6 +441,24 @@ ExecInitLimit(Limit 

Re: WIP: BRIN multi-range indexes

2019-02-04 Thread Alvaro Herrera
On 2019-Feb-04, Tomas Vondra wrote:

> On 2/4/19 6:54 AM, Michael Paquier wrote:
> > On Tue, Oct 02, 2018 at 11:49:05AM +0900, Michael Paquier wrote:
> >> The latest patch set does not apply cleanly.  Could you rebase it?  I
> >> have moved the patch to CF 2018-10 for now, waiting on author.
> > 
> > It's been some time since that request, so I am marking the patch as
> > returned with feedback. 
> 
> But that's not the most recent version of the patch. On 28/12 I've
> submitted an updated / rebased patch.

Moved to next commitfest instead.

-- 
Álvaro Herrera  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: WIP: BRIN multi-range indexes

2019-02-04 Thread Tomas Vondra
On 2/4/19 6:54 AM, Michael Paquier wrote:
> On Tue, Oct 02, 2018 at 11:49:05AM +0900, Michael Paquier wrote:
>> The latest patch set does not apply cleanly.  Could you rebase it?  I
>> have moved the patch to CF 2018-10 for now, waiting on author.
> 
> It's been some time since that request, so I am marking the patch as
> returned with feedback. 

But that's not the most recent version of the patch. On 28/12 I've
submitted an updated / rebased patch.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] PATCH: multivariate histograms and MCV lists

2019-02-04 Thread Tomas Vondra



On 2/4/19 5:53 AM, Michael Paquier wrote:
> On Sun, Feb 03, 2019 at 02:43:24AM -0800, Andres Freund wrote:
>> Are you planning to update the patch, or should the entry be marked as
>> RWF?
> 
> Moved the patch to next CF for now, waiting on author as the last
> review happened not so long ago.

Thanks. Yes, I intend to send a new patch version soon.

regards

-- 
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: Using POPCNT and other advanced bit manipulation instructions

2019-02-04 Thread David Rowley
Thanks for looking at this.

On Fri, 1 Feb 2019 at 11:45, Alvaro Herrera  wrote:
>
> I only have cosmetic suggestions for this patch.  For one thing, I think
> the .c file should be in src/port and its header should be in
> src/include/port/, right beside the likes of pg_bswap.h and pg_crc32c.h.

I've moved the code into src/port and renamed the file to pg_bitutils.c

> For another, I think the arrangement of all those "ifdef
> HAVE_THIS_OR_THAT" in the bitutils.c file is a bit hard to read.  I'd
> lay them out like this:

I've made this change too, although when doing it I realised that I
had forgotten to include the check for CPUID. It's possible that CPUID
support does not exist even though POPCNT does, I guess.  This has made
the #ifs a bit more complex.
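
For illustration, the runtime side of such a check with the GCC/Clang
builtins looks roughly like the sketch below; the patch's actual
configure-time and runtime tests may well differ:

#include <stdbool.h>

#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
#include <cpuid.h>
#endif

/* Returns true if the CPU advertises the POPCNT instruction. */
static bool
popcnt_available(void)
{
#if defined(__GNUC__) && (defined(__x86_64__) || defined(__i386__))
    unsigned int eax, ebx, ecx, edx;

    if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx))
        return false;           /* CPUID leaf 1 not supported */
    return (ecx & bit_POPCNT) != 0;     /* POPCNT is ECX bit 23 */
#else
    return false;               /* no CPUID support at all */
#endif
}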

> Finally, the part in bitmapset.c is repetitive on the #ifdefs; I'd just
> put at the top of the file something like

Yeah, agreed. Much neater that way.

> Other than those minor changes, I think we should just get this pushed
> and see what the buildfarm thinks.  In the words of a famous PG hacker:
> if a platform ain't in the buildfarm, we don't support it.

I also made a number of other changes to the patch.

1. The patch now only uses the -mpopcnt CFLAG for pg_bitutils.c.  I
thought this was important so we don't expose that flag in pg_config
and possibly end up building extensions with popcnt instructions, which
might not be portable to other older hardware.
2. Wrote a new pg_popcount function that accepts an array of bytes and a
size variable.  This seems useful for the bloom filter use.

There are still various number_of_ones[] arrays around the codebase.
These exist in tsgistidx.c, _intbig_gist.c and _ltree_gist.c.  It
would be nice to get rid of those too, but one of the usages in each
of those 3 files requires XORing with another bit array before
counting the bits.  I thought about maybe writing a pop_count_xor()
function that accepts two byte arrays and a length parameter, but it
seems a bit special case, so I didn't.

Another thing I wasn't sure of was whether I should just have
bms_num_members() call pg_popcount(). It might be worth
benchmarking to see what's faster. My thinking is that pg_popcount
will inline the pg_popcount64() call, so it would mean a single
function call rather than one for each bitmapword in the set.
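
To make that shape concrete, here is a sketch of a portable byte-buffer
popcount of the kind pg_popcount() could fall back to when the POPCNT
instruction is unavailable; the names and structure here are mine, not
the patch's:

#include <stdint.h>
#include <string.h>

/* Portable 64-bit popcount (SWAR bit-twiddling), no POPCNT required. */
static inline int
popcount64_slow(uint64_t x)
{
    x = x - ((x >> 1) & UINT64_C(0x5555555555555555));
    x = (x & UINT64_C(0x3333333333333333)) +
        ((x >> 2) & UINT64_C(0x3333333333333333));
    x = (x + (x >> 4)) & UINT64_C(0x0f0f0f0f0f0f0f0f);
    return (int) ((x * UINT64_C(0x0101010101010101)) >> 56);
}

/* Count set bits in an arbitrary byte buffer, eight bytes at a time. */
static uint64_t
popcount_bytes(const unsigned char *buf, size_t nbytes)
{
    uint64_t    cnt = 0;

    while (nbytes >= 8)
    {
        uint64_t    word;

        memcpy(&word, buf, 8);  /* avoids unaligned loads */
        cnt += popcount64_slow(word);
        buf += 8;
        nbytes -= 8;
    }
    while (nbytes-- > 0)
        cnt += popcount64_slow(*buf++);
    return cnt;
}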

I've compiled and run make check-world on Linux with GCC7.3 and
clang6.0. I've also tested on MSVC to ensure I didn't break windows.
It would be good to get a few more people to compile it and run the
tests.

-- 
 David Rowley   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


v2-0001-Add-basic-support-for-using-the-POPCNT-and-SSE4.2.patch
Description: Binary data


Re: pg_dump multi VALUES INSERT

2019-02-04 Thread Surafel Temesgen
On Sat, Feb 2, 2019 at 11:26 AM Fabien COELHO  wrote:

>
> Hello David,
>
> > Wondering if you have anything else here? I'm happy for the v13
> > version to be marked as ready for committer.
>
> I still have a few comments.
>
> Patch applies cleanly, compiles, global & local make check are ok.
>
> Typos and style in the doc:
>
> "However, since, by default this option generates ..."
> "However, since this option, by default, generates ..."
>
> I'd suggest a more straightforward to my mind and ear: "However, since by
> default the option generates ..., ", although beware that I'm not a
> native English speaker.
>
>
fixed

> I'd suggest not to rely on "atoi" because it does not check the argument
> syntax, so basically anything is accepted, eg "1O" is 1;
>

I changed it to strtol.
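
For what it's worth, a strict way to validate such an option with strtol
looks like the sketch below; the function name and the error wording are
mine, and the patch's actual code may differ:

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

/* Parse a positive integer option strictly; rejects "1O", "", "-3", etc. */
static int
parse_rows_per_insert(const char *arg)
{
    char       *endptr;
    long        val;

    errno = 0;
    val = strtol(arg, &endptr, 10);
    if (errno != 0 || endptr == arg || *endptr != '\0' ||
        val <= 0 || val > INT_MAX)
    {
        fprintf(stderr, "rows-per-insert must be a positive integer: \"%s\"\n",
                arg);
        exit(1);
    }
    return (int) val;
}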

>
> On "if (!dopt->do_nothing) $1 else $2;", I'd rather use a straight
> condition "if (dopt->do_nothing) $2 else $1;" (two instances).
>

fixed

regards
Surafel
diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 9e0bb93f08..0ab57067a8 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -661,9 +661,9 @@ PostgreSQL documentation
 ...).  This will make restoration very slow; it is mainly
 useful for making dumps that can be loaded into
 non-PostgreSQL databases.
-However, since this option generates a separate command for each row,
-an error in reloading a row causes only that row to be lost rather
-than the entire table contents.
+However, since by default the option generates a separate command
+for each row, an error in reloading a row causes only that row to be
+lost rather than the entire table contents.

   
  
@@ -764,11 +764,10 @@ PostgreSQL documentation
 than COPY).  This will make restoration very slow;
 it is mainly useful for making dumps that can be loaded into
 non-PostgreSQL databases.
-However, since this option generates a separate command for each row,
-an error in reloading a row causes only that row to be lost rather
-than the entire table contents.
-Note that
-the restore might fail altogether if you have rearranged column order.
+However, since by default the option generates a separate command
+for each row, an error in reloading a row causes only that row to be
+lost rather than the entire table contents.  Note that the restore
+might fail altogether if you have rearranged column order.
 The --column-inserts option is safe against column
 order changes, though even slower.

@@ -914,8 +913,9 @@ PostgreSQL documentation

 Add ON CONFLICT DO NOTHING to
 INSERT commands.
-This option is not valid unless --inserts or
---column-inserts is also specified.
+This option is not valid unless --inserts,
+--column-inserts or
+--rows-per-insert is also specified.

   
  
@@ -938,6 +938,18 @@ PostgreSQL documentation
   
  
 
+ 
+  --rows-per-insert=nrows
+  
+   
+Dump data as INSERT commands (rather than
+COPY).  Controls the maximum number of rows per
+INSERT statement. The value specified must be a
+number greater than zero.
+   
+  
+ 
+
  
--section=sectionname

diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index 4a2e122e2d..7ab27391fb 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -140,10 +140,10 @@ typedef struct _dumpOptions
 	int			dumpSections;	/* bitmask of chosen sections */
 	bool		aclsSkip;
 	const char *lockWaitTimeout;
+	int			dump_inserts;	/* 0 = COPY, otherwise rows per INSERT */
 
 	/* flags for various command-line long options */
 	int			disable_dollar_quoting;
-	int			dump_inserts;
 	int			column_inserts;
 	int			if_exists;
 	int			no_comments;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 3a89ad846a..957687db0f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -307,6 +307,7 @@ main(int argc, char **argv)
 	const char *dumpencoding = NULL;
 	const char *dumpsnapshot = NULL;
 	char	   *use_role = NULL;
+	char   *rowPerInsertEndPtr;
 	int			numWorkers = 1;
 	trivalue	prompt_password = TRI_DEFAULT;
 	int			compressLevel = -1;
@@ -358,7 +359,7 @@ main(int argc, char **argv)
 		{"enable-row-security", no_argument, _row_security, 1},
 		{"exclude-table-data", required_argument, NULL, 4},
 		{"if-exists", no_argument, _exists, 1},
-		{"inserts", no_argument, _inserts, 1},
+		{"inserts", no_argument, NULL, 8},
 		{"lock-wait-timeout", required_argument, NULL, 2},
 		{"no-tablespaces", no_argument, , 1},
 		{"quote-all-identifiers", no_argument, _all_identifiers, 1},
@@ -377,6 +378,7 @@ main(int argc, char **argv)
 		{"no-subscriptions", no_argument, 

Re: What happens if checkpoint haven't completed until the next checkpoint interval or max_wal_size?

2019-02-04 Thread Laurenz Albe
Mohammad Sherafat wrote:
> In the name of god!

It is not considered good style to hurt people's religious feelings
by using the name of god in vain.

> What happens if checkpoint haven't completed until the next checkpoint 
> interval or max_wal_size?

Then you have two checkpoints active at the same time.

Yours,
Laurenz Albe




Re: WIP: Avoid creation of the free space map for small tables

2019-02-04 Thread John Naylor
On Mon, Feb 4, 2019 at 8:41 AM Amit Kapila  wrote:
> > Yeah that can also work, but we still need to be careful about the
> > alignment of that one tuple, otherwise, there will could be different
> > free space on the fifth page.  The probably easier way could be to use
> > an even number of integers in the table say(int, int).  Anyway, for
> > now, I have avoided the dependency on FSM contents without losing on
> > coverage of test.  I have pushed my latest suggestion in the previous
> > email.
> >
>
> The change seems to have worked.  All the buildfarm machines that were
> showing the failure are passed now.

Excellent!

Now that the buildfarm is green as far as this patch goes, I will
touch on a few details to round out development in this area:

1. Earlier, I had a test to ensure that free space towards the front
of the relation was visible with no FSM. In [1], I rewrote it without
using vacuum, so we can consider adding it back now if desired. I can
prepare a patch for this.

2. As a follow-on, since we don't rely on vacuum to remove dead rows,
we could try putting the fsm.sql test in some existing group in the
parallel schedule, rather than its own group as it is now.

3. While looking at 0d1fe9f74e, it occurred to me that I ignored this
patch's effects on GetRecordedFreeSpace(), which will return zero for
tables with no FSM. The other callers are in:
contrib/pg_freespacemap/pg_freespacemap.c
contrib/pgstattuple/pgstatapprox.c

For pg_freespacemap, this doesn't matter, since it's just reporting
the facts. For pgstattuple_approx(), it might under-estimate the free
space and over-estimate the number of live tuples. This might be fine,
since it is approximate after all, but maybe a comment would be
helpful. If this is a problem, we could tweak it to be more precise
for tables without FSMs. Thoughts?
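
To sketch what "more precise" could mean here (the surrounding variables
are assumed for illustration; only GetRecordedFreeSpace() and the
HEAP_FSM_CREATION_THRESHOLD constant are real):

/*
 * Hypothetical tweak for statapprox_heap(): tables at or below the FSM
 * creation threshold have no FSM, so rather than trusting
 * GetRecordedFreeSpace()'s zero answer, scan every page and measure the
 * free space directly.
 */
if (nblocks <= HEAP_FSM_CREATION_THRESHOLD)
    scan_this_page = true;      /* no FSM: recorded free space is 0 */
else
    freespace = GetRecordedFreeSpace(rel, blkno);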

4. The latest patch for the pg_upgrade piece was in [2]

Anything else?

[1] 
https://www.postgresql.org/message-id/CACPNZCvEXLUx10pFvNcOs88RvqemMEjOv7D9MhL3ac86EzjAOA%40mail.gmail.com

[2] 
https://www.postgresql.org/message-id/CACPNZCu4cOdm3uGnNEGXivy7Gz8UWyQjynDpdkPGabQ18_zK6g%40mail.gmail.com

-- 
John Naylor  https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



RE: Protect syscache from bloating with negative cache entries

2019-02-04 Thread Tsunakawa, Takayuki
Horiguchi-san, Bruce, all,

I hesitate to say this, but I think there are the following problems with the 
proposed approach:

1) It tries to prune the catalog tuples only when the hash table is about to 
expand.
If no tuple is found to be eligible for eviction at first and the hash table 
expands, it becomes difficult for unnecessary or less frequently accessed tuples 
to be removed, because the wait until the next expansion grows longer and 
longer.  The hash table doubles in size each time.
For example, if many transactions are executed in a short duration that create 
and drop temporary tables and indexes, the hash table could become large 
quickly.

2) syscache_prune_min_age is difficult to set to meet contradictory 
requirements.
e.g., in the above temporary objects case, the user wants to shorten 
syscache_prune_min_age so that the catalog tuples for temporary objects are 
removed.  But that is also likely to result in the removal of catalog tuples 
that are still needed for non-temporary objects.

3) The DBA cannot control the memory usage.  It's not predictable.
syscache_memory_target doesn't set the limit on memory usage despite the 
impression from its name.  In general, the cache should be able to set the 
upper limit on its size so that the DBA can manage things within a given amount 
of memory.  I think other PostgreSQL parameters are based on that idea -- 
shared_buffers, wal_buffers, work_mem, temp_buffers, etc.

4) The memory usage doesn't decrease once allocated.
The normal allocation memory context, aset.c, which CacheMemoryContext uses, 
doesn't return pfree()d memory to the operating system.  Once 
CacheMemoryContext becomes big, it won't get smaller.

5) Catcaches are managed independently of each other.
Even if there are many unnecessary catalog tuples in one catcache, they are not 
freed to make room for other catcaches.


So, why don't we make syscache_memory_target the upper limit on the total size 
of all catcaches, and rethink the past LRU management?
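
To illustrate that direction, a hard-limit design could keep one global
LRU list across all catcaches and evict from the cold end whenever an
insertion would exceed the budget.  A minimal sketch, with all names
hypothetical and locking, hash-table removal, and dlist details ignored:

#include <stddef.h>

typedef struct CacheEntry
{
    struct CacheEntry *prev;    /* global LRU list, most recent at head */
    struct CacheEntry *next;
    size_t      size;           /* memory accounted to this entry */
    /* ... key, cached tuple, per-catcache hash links ... */
} CacheEntry;

static CacheEntry *lru_head;        /* most recently used */
static CacheEntry *lru_tail;        /* least recently used */
static size_t total_cache_bytes;    /* current usage across all catcaches */
static size_t cache_memory_target;  /* the proposed hard upper limit */

static void
lru_unlink(CacheEntry *e)
{
    if (e->prev)
        e->prev->next = e->next;
    else if (lru_head == e)
        lru_head = e->next;
    if (e->next)
        e->next->prev = e->prev;
    else if (lru_tail == e)
        lru_tail = e->prev;
    e->prev = e->next = NULL;
}

/* Move an entry to the hot end of the list on every cache hit. */
static void
cache_touch(CacheEntry *e)
{
    lru_unlink(e);
    e->next = lru_head;
    if (lru_head)
        lru_head->prev = e;
    lru_head = e;
    if (lru_tail == NULL)
        lru_tail = e;
}

/* Evict cold entries until a new entry of 'need' bytes fits the budget. */
static void
cache_make_room(size_t need)
{
    while (lru_tail && total_cache_bytes + need > cache_memory_target)
    {
        CacheEntry *victim = lru_tail;

        lru_unlink(victim);
        total_cache_bytes -= victim->size;
        /* also remove victim from its catcache hash table and free it */
    }
}

Accounting every entry against a single budget would also address point
5, since a bloated catcache naturally yields space to the others.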


Regards
Takayuki Tsunakawa