Re: fixing old_snapshot_threshold's time->xid mapping

2020-04-19 Thread Thomas Munro
On Sat, Apr 18, 2020 at 9:27 PM Dilip Kumar  wrote:
> On Sat, Apr 18, 2020 at 11:47 AM Thomas Munro  wrote:
> > I think I found another bug in MaintainOldSnapshotTimeMapping(): if
> > you make time jump by more than old_snapshot_threshold in one go, then
> > the map gets cleared and then no early pruning or snapshot-too-old
> > errors happen.  That's why in 002_too_old.pl it currently advances
> > time by 10 minutes twice, instead of 20 minutes once.  To be
> > continued.
>
> IMHO that doesn't seem to be a problem.  Because even if we jump more
> than old_snapshot_threshold in one go, we don't clean the complete map,
> right?  The latest snapshot timestamp will become the head timestamp.
> So in TransactionIdLimitedForOldSnapshots, if (current_ts -
> old_snapshot_threshold) is still >= head_timestamp then we can still do
> early pruning.

Right, thanks.  I got confused about that, and misdiagnosed something
I was seeing.
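
To restate that in code, here is a toy model of the check Dilip
describes (the names and the threshold arithmetic are illustrative
assumptions, not the real snapmgr.c code): early pruning remains
possible as long as the threshold-adjusted "now" has not fallen behind
the head of the map, even after one large jump.

#include <stdio.h>
#include <stdint.h>

#define USECS_PER_MINUTE INT64_C(60000000)

/* Early pruning is still possible while (now - threshold) >= head. */
static int
can_prune(int64_t now, int64_t head_timestamp, int threshold_minutes)
{
	return now - (int64_t) threshold_minutes * USECS_PER_MINUTE
		>= head_timestamp;
}

int
main(void)
{
	int64_t		head = 0;						/* the map's head timestamp */
	int64_t		now = 20 * USECS_PER_MINUTE;	/* one 20-minute jump */

	/* With a 10-minute threshold, the single big jump still prunes. */
	printf("can prune: %d\n", can_prune(now, head, 10));
	return 0;
}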

Here's a new version:

0004: Instead of writing a new kind of TAP test to demonstrate
snapshot-too-old errors, I adjusted the existing isolation tests to
use the same absolute time control technique.  Previously I had
invented a way to do isolation-tester-like stuff in TAP tests, which
might be interesting, but strange new Perl is not necessary for this.

0005: Truncates the time map when the CLOG is truncated.  Its test is
now under src/test/modules/snapshot_too_old/t/001_truncate.sql.

These apply on top of Robert's patches, but the only dependency is on
his patch 0001 "Expose oldSnapshotControl.", because now I have stuff
in src/test/modules/snapshot_too_old/test_sto.c that wants to mess with
that object too.

Is this an improvement?  I realise that there is still nothing to
verify that early pruning has actually happened.  I haven't thought of
a good way to do that yet (stats, page inspection, ...).
From 9640c654e97e8704b041e207f2fd02070d2dd057 Mon Sep 17 00:00:00 2001
From: Thomas Munro 
Date: Mon, 20 Apr 2020 16:23:02 +1200
Subject: [PATCH v3 4/5] Rewrite the "snapshot_too_old" tests with absolute
 timestamps.

Previously the snapshot too old feature used a special test value for
old_snapshot_threshold.  Instead, use a new approach based on clobbering
the "current" timestamp used in snapshot-too-old bookkeeping, so that
we can control the timeline precisely without having to resort to
sleeping or special test branches in the code that are different from
what is used in production.
---
 src/backend/utils/time/snapmgr.c              | 24 --
 src/test/modules/snapshot_too_old/Makefile    | 21 ++---
 .../expected/sto_using_cursor.out             | 75 +++--
 .../expected/sto_using_hash_index.out         | 26 --
 .../expected/sto_using_select.out             | 80 ---
 .../specs/sto_using_cursor.spec               | 30 ---
 .../specs/sto_using_hash_index.spec           | 19 -
 .../specs/sto_using_select.spec               | 32 +---
 src/test/modules/snapshot_too_old/sto.conf    |  2 +-
 .../snapshot_too_old/test_sto--1.0.sql        | 14 
 src/test/modules/snapshot_too_old/test_sto.c  | 74 +
 .../modules/snapshot_too_old/test_sto.control |  5 ++
 12 files changed, 264 insertions(+), 138 deletions(-)
 create mode 100644 src/test/modules/snapshot_too_old/test_sto--1.0.sql
 create mode 100644 src/test/modules/snapshot_too_old/test_sto.c
 create mode 100644 src/test/modules/snapshot_too_old/test_sto.control

diff --git a/src/backend/utils/time/snapmgr.c b/src/backend/utils/time/snapmgr.c
index 72b2c61a07..19e6c52b80 100644
--- a/src/backend/utils/time/snapmgr.c
+++ b/src/backend/utils/time/snapmgr.c
@@ -1739,26 +1739,6 @@ TransactionIdLimitedForOldSnapshots(TransactionId recentXmin,
 		update_ts = oldSnapshotControl->next_map_update;
 		SpinLockRelease(&oldSnapshotControl->mutex_latest_xmin);
 
-		/*
-		 * Zero threshold always overrides to latest xmin, if valid.  Without
-		 * some heuristic it will find its own snapshot too old on, for
-		 * example, a simple UPDATE -- which would make it useless for most
-		 * testing, but there is no principled way to ensure that it doesn't
-		 * fail in this way.  Use a five-second delay to try to get useful
-		 * testing behavior, but this may need adjustment.
-		 */
-		if (old_snapshot_threshold == 0)
-		{
-			if (TransactionIdPrecedes(latest_xmin, MyPgXact->xmin)
-				&& TransactionIdFollows(latest_xmin, xlimit))
-				xlimit = latest_xmin;
-
-			ts -= 5 * USECS_PER_SEC;
-			SetOldSnapshotThresholdTimestamp(ts, xlimit);
-
-			return xlimit;
-		}
-
 		ts = AlignTimestampToMinuteBoundary(ts)
 			- (old_snapshot_threshold * USECS_PER_MINUTE);
 
@@ -1860,10 +1840,6 @@ MaintainOldSnapshotTimeMapping(TimestampTz whenTaken, TransactionId xmin)
 	if (!map_update_required)
 		return;
 
-	/* No further tracking needed for 0 (used for testing). */
-	if (old_snapshot_threshold == 0)
-		return;
-
 	/*
 	 * We don't want to do something stupid with unusual values, but we don't
 	 * want to litter the log with warnings or 

Re: Race condition in SyncRepGetSyncStandbysPriority

2020-04-19 Thread Masahiko Sawada
On Sun, 19 Apr 2020 at 01:00, Tom Lane  wrote:
>
> Masahiko Sawada  writes:
> > On Sat, 18 Apr 2020 at 00:31, Tom Lane  wrote:
> >> +   /* Quick out if not even configured to be synchronous */
> >> +   if (SyncRepConfig == NULL)
> >> +   return false;
>
> > I felt it a bit strange that we do the above check in
> > SyncRepGetSyncRecPtr(), because SyncRepReleaseWaiters(), which is the
> > only caller, says the following before calling it:
>
> Notice there was such a test in SyncRepGetSyncRecPtr already --- I just
> moved it to be before doing some work instead of after.
>
> > Can we either change it to an assertion, move it to before acquiring
> > SyncRepLock in SyncRepReleaseWaiters or just remove it?
>
> I have no objection to that in principle, but it seems like it's a
> change in SyncRepGetSyncRecPtr's API that is not necessary to fix
> this bug.  So I'd rather leave it to happen along with the larger
> API changes (getting rid of am_sync) that are proposed for v14.

Agreed.

Regards,

-- 
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: 001_rep_changes.pl stalls

2020-04-19 Thread Fujii Masao




On 2020/04/18 16:01, Noah Misch wrote:

On Fri, Apr 17, 2020 at 05:06:29PM +0900, Kyotaro Horiguchi wrote:

At Fri, 17 Apr 2020 17:00:15 +0900 (JST), Kyotaro Horiguchi 
 wrote in

By the way, if the latch is consumed in WalSndLoop, a succeeding call to
WalSndWaitForWal cannot be woken up by the latch set.  Doesn't that
cause missed wakeups (in other words, overlooking of the wakeup latch)?


- Since the only source other than timeout of walsender wakeup is latch,
- we should avoid wasteful consuming of latch. (It is the same issue
- with [1]).

+ Since walsender is woken up by LSN advancement via latch, we should
+ avoid wasteful consumption of the latch. (It is the same issue with [1]).



If the wakeup signal is not remembered on the walsender (like
InterruptPending), WalSndPhysical cannot enter a sleep with
confidence.


No; per latch.h, "What must be avoided is placing any checks for asynchronous
events after WaitLatch and before ResetLatch, as that creates a race
condition."  In other words, the thing to avoid is calling ResetLatch()
without next examining all pending work that a latch would signal.  Each
walsender.c WaitLatch call does follow the rules.
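
To make the cited rule concrete, the canonical loop shape from latch.h
looks like the sketch below (work_available() and do_work() are
hypothetical placeholders, not walsender code): the latch is reset
first and all pending work is examined afterwards, so a SetLatch() that
arrives in between simply forces another iteration instead of being
lost.

for (;;)
{
	WaitLatch(MyLatch, WL_LATCH_SET | WL_EXIT_ON_PM_DEATH, -1, 0);
	ResetLatch(MyLatch);

	/* Now check ALL conditions that a latch-setter might have signaled. */
	if (work_available())
		do_work();
}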

On Sat, Apr 18, 2020 at 12:29:58AM +0900, Fujii Masao wrote:

On 2020/04/17 14:41, Noah Misch wrote:

1. Make XLogSendPhysical() more like XLogSendLogical(), calling
WalSndWaitForWal() when no WAL is available.  A quick version of this
passes tests, but I'll need to audit WalSndWaitForWal() for things that are
wrong for physical replication.


(1) makes even the physical replication walsender sleep in two places,
which seems to make the code for physical replication more complicated
than necessary. I'd like to avoid (1) if possible.


Good point.


2. Make XLogSendLogical() more like XLogSendPhysical(), returning when
insufficient WAL is available.  This complicates the xlogreader.h API to
pass back "wait for this XLogRecPtr", and we'd then persist enough state to
resume decoding.  This doesn't have any advantages to make up for those.

3. Don't avoid waiting in WalSndLoop(); instead, fix the stall by copying the
WalSndKeepalive() call from WalSndWaitForWal() to WalSndLoop().  This risks
further drift between the two wait sites; on the other hand, one could
refactor later to help avoid that.


Since the additional call of WalSndKeepalive() is necessary only for
logical replication, it should be copied to, e.g., XLogSendLogical(),
instead of WalSndLoop()? For example, when XLogSendLogical() sets
WalSndCaughtUp to true, it should call WalSndKeepalive()?


We'd send a keepalive even when pq_flush_if_writable() can't empty the output
buffer.  That could be acceptable, but it's not ideal.


The root problem seems to be that when a WAL record that's a no-op for
logical rep is processed, a keepalive message is not sent immediately,
in spite of the fact that we want pg_stat_replication to be updated promptly.


The degree of promptness should be predictable, at least.  If we removed the
WalSndKeepalive() from WalSndWaitForWal(), pg_stat_replication updates would
not be prompt, but they would be predictable.  I do, however, think prompt
updates are worthwhile.


(3) seems to address this problem directly and looks better to me.


4. Keep the WalSndLoop() wait, but condition it on !logical.  This is the
minimal fix, but it crudely punches through the abstraction between
WalSndLoop() and its WalSndSendDataCallback.


(4) also looks good because it's simple, if we can redesign those
functions in good shape.


Let's do that.  I'm attaching the replacement implementation and the revert of
v1.


Thanks for the patch! Though referencing XLogSendLogical inside WalSndLoop()
might be a bit ugly, I'm fine with this change because it's simple and easier
to understand.

+	 * Block if we have unsent data.  XXX For logical replication, let
+	 * WalSndWaitForWal(), handle any other blocking; idle receivers need
+	 * its additional actions.  For physical replication, also block if
+	 * caught up; its send_data does not block.

It might be better to s/WalSndWaitForWal()/send_data()? Because not only
WalSndWaitForWal() but also WalSndWriteData() seems to handle the blocking.
WalSndWriteData() is also called under send_data, i.e., XLogSendLogical().

frame #2: 0x000106bcfa84 postgres`WalSndWriteData(ctx=0x7fb2a4812d20, lsn=22608080, xid=488, last_write=false) at walsender.c:1247:2
frame #3: 0x000106b98295 postgres`OutputPluginWrite(ctx=0x7fb2a4812d20, last_write=false) at logical.c:540:2
frame #4: 0x0001073fe9b8 pgoutput.so`send_relation_and_attrs(relation=0x0001073ba2c0, ctx=0x7fb2a4812d20) at pgoutput.c:353:2
frame #5: 0x0001073fe7a0 pgoutput.so`maybe_send_schema(ctx=0x7fb2a4812d20, relation=0x0001073ba2c0, relentry=0x7fb2a483aa60) at pgoutput.c:315:2
frame #6: 0x0001073fd4c0 pgoutput.so`pgoutput_change(ctx=0x7fb2a4812d20, 

Re: While restoring -getting error if dump contain sql statements generated from generated.sql file

2020-04-19 Thread Masahiko Sawada
On Fri, 17 Apr 2020 at 22:50, Masahiko Sawada
 wrote:
>
> On Tue, 14 Apr 2020 at 22:41, tushar  wrote:
> >
> > Hi ,
> >
> > We have a SQL file called 'generated.sql' under the src/test/regress/sql
> > folder. If we run this file in psql, take the dump, and try to restore
> > it on another db,
> > we get errors like -
> >
> > psql:/tmp/x:434: ERROR:  column "b" of relation "gtest1_1" is a
> > generated column
> > psql:/tmp/x:441: ERROR:  cannot use column reference in DEFAULT expression
> >
> > These SQL statements I copied from the dump file:
> >
> > postgres=# CREATE TABLE public.gtest30 (
> > postgres(# a integer,
> > postgres(# b integer
> > postgres(# );
> > CREATE TABLE
> > postgres=#
> > postgres=# CREATE TABLE public.gtest30_1 (
> > postgres(# )
> > postgres-# INHERITS (public.gtest30);
> > CREATE TABLE
> > postgres=# ALTER TABLE ONLY public.gtest30_1 ALTER COLUMN b SET DEFAULT
> > (a * 2);
> > ERROR:  cannot use column reference in DEFAULT expression
> > postgres=#
> >
> > Steps to reproduce -
> >
> > connect to psql - ( ./psql postgres)
> > create database ( create database x;)
> > connect to database x (\c x )
> > execute generated.sql file (\i ../../src/test/regress/sql/generated.sql)
> > take the dump of x db (./pg_dump -Fp x > /tmp/t.dump)
> > create another database  (create database y;)
> > Connect to y db (\c y)
> > execute plain dump sql file (\i /tmp/t.dump)
> >
>
> Good catch. The minimum reproducer is to execute the following
> queries, then run pg_dump and pg_restore/psql.
>
> -- test case 1
> create table a (a int, b int generated always as (a * 2) stored);
> create table a1 () inherits(a);
>
> -- test case 2
> create table b (a int, b int generated always as (a * 2) stored);
> create table b1 () inherits(b);
> alter table only b alter column b drop expression;
>
> After executing the above queries, pg_dump will generate the following 
> queries:
>
> -- test case 1
> CREATE TABLE public.a (
> a integer,
> b integer GENERATED ALWAYS AS ((a * 2)) STORED
> );
> ALTER TABLE public.a OWNER TO masahiko;
> CREATE TABLE public.a1 (
> )
> INHERITS (public.a);
> ALTER TABLE public.a1 OWNER TO masahiko;
> ALTER TABLE ONLY public.a1 ALTER COLUMN b SET DEFAULT (a * 2); -- error!
>
> -- test case 2
> CREATE TABLE public.b (
> a integer,
> b integer
> );
> ALTER TABLE public.b OWNER TO masahiko;
> CREATE TABLE public.b1 (
> )
> INHERITS (public.b);
> ALTER TABLE public.b1 OWNER TO masahiko;
> ALTER TABLE ONLY public.b1 ALTER COLUMN b SET DEFAULT (a * 2); -- error!
>
> pg_dump generates the same SQL "ALTER TABLE ... ALTER COLUMN b SET
> DEFAULT (a * 2);" but the errors vary.
>
> test case 1:
> ERROR:  column "b" of relation "a1" is a generated column
>
> test case 2:
> ERROR:  cannot use column reference in DEFAULT expression
>
> In both cases, I think we can simply get rid of those ALTER TABLE
> queries if we don't support changing a normal column to a generated
> column using ALTER TABLE .. ALTER COLUMN.
>
> I've attached a WIP patch. I'll look at this closely and add regression tests.
>

After more thought, the approach of the previous patch doesn't seem
correct. Instead, I think we can change dumpAttrDef so that it skips
emitting the query that sets the expression if the column is a
generated column.

Currently, we need to emit a query setting the default in the
following three cases (ref. adinfo->separate):

1. the default is for a column on a VIEW
2. shouldPrintColumn() returns false, in two cases:
2-1. the column is a dropped column.
2-2. the column is not a local column and the table is not a partition.

Since we don't support setting a generated column as the default value
for a column of a view, case (1) is always false. And for case (2)-1,
we don't dump a dropped column. I think case (2)-2 means a column
inherited from the parent table, but such columns are printed in the
CREATE TABLE of the parent table and the child table inherits them. We
can have a generated column with a different expression from the
parent's, but that requires dropping the inherited one and creating a
new generated column. Such an operation makes the column a local
column, so its definition will be printed in the CREATE TABLE of the
inheriting table. Therefore, IIUC there is no case where we need a
separate query setting the expression of a generated column.
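
To illustrate the idea, here is a hedged sketch with toy stand-in types
(the struct layouts and field names are assumptions for illustration,
not necessarily pg_dump.c's real AttrDefInfo/TableInfo): dumpAttrDef
would simply decline to emit a separate SET DEFAULT for a generated
column, whose expression is already emitted as part of CREATE TABLE.

#include <stdbool.h>

typedef struct TableInfo
{
	char	   *attgenerated;	/* per column: 's' if STORED, '\0' otherwise */
} TableInfo;

typedef struct AttrDefInfo
{
	TableInfo  *adtable;		/* the owning table */
	int			adnum;			/* 1-based attribute number */
} AttrDefInfo;

/* Skip the separate ALTER TABLE ... SET DEFAULT for generated columns. */
static bool
should_dump_separate_default(const AttrDefInfo *adinfo)
{
	return adinfo->adtable->attgenerated[adinfo->adnum - 1] == '\0';
}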

Also, I've tried to add a regression test for this, but the pg_dump TAP
tests don't seem to have a way to test whether the dumped queries load
without errors. I think we could add such a test, but the attached
updated version of the patch doesn't include tests so far.

Regards,

--
Masahiko Sawada  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


generated_column_pg_dump_v2.patch
Description: Binary data


Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Amit Kapila
On Mon, Apr 20, 2020 at 4:32 AM Michael Paquier  wrote:
>
> On Sun, Apr 19, 2020 at 03:59:50PM -0300, Ranier Vilela wrote:
> > On Sun, Apr 19, 2020 at 12:34 PM, Juan José Santamaría Flecha <
> > juanjo.santama...@gmail.com> wrote:
> >> This line needs a break, other than that LGTM.
> >
> > Sure. Attached new patch with this revision.
>
> Amit, are you planning to look at this patch?
>

Yes, I am planning to look into it.  Actually, I think the main thing
here is to ensure that we don't break something that was working with
the _create_locale API.

>  I may be able to spend
> a couple of hours on this thread this week and that's an area of the
> code I have worked with in the past, though I am not sure.
>

Okay, that will be good.

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: where should I stick that backup?

2020-04-19 Thread Amit Kapila
On Sat, Apr 18, 2020 at 5:14 AM Andres Freund  wrote:
>
> zstd -T0 < onegbofrandom > NUL
> zstd -T0 < onegbofrandom > /dev/null
> linux host: 0.361s
> windows guest:  0.602s
>
> zstd -T0 < onegbofrandom | dd bs=1M of=NUL
> zstd -T0 < onegbofrandom | dd bs=1M of=/dev/null
> linux host: 0.454s
> windows guest:  0.802s
>
> zstd -T0 < onegbofrandom | dd bs=64k | dd bs=64k | dd bs=64k | wc -c
> linux host: 0.521s
> windows guest:  1.376s
>
>
> This suggest that pipes do have a considerably higher overhead on
> windows, but that it's not all that terrible if one takes care to use
> large buffers in each pipe element.
>

I have also done some similar experiments on my Win-7 box and the
results are as follows:

zstd -T0 < 16396 > NUL

Execution time: 2.240 s

zstd -T0 < 16396 | dd bs=1M > NUL

Execution time: 4.240 s

zstd -T0 < 16396 | dd bs=64k | dd bs=64k | dd bs=64k | wc -c

Execution time: 5.959 s

In the above tests, 16396 is a 1GB file generated via pgbench.  The
above results indicate that adding more pipe stages with dd adds
significant overhead, but how can we distinguish how much of that
overhead is due to the pipe itself?
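
One way to probe that: the minimal pipe filter below (my illustration,
not from the thread) copies stdin to stdout through a 1MB buffer, much
like "dd bs=1M".  Timing a command with and without this stage inserted
isolates the cost of one extra pipe element plus its copy, and shrinking
bufsize shows how smaller buffers inflate that cost.

#include <stdio.h>
#include <stdlib.h>

int
main(void)
{
	size_t		bufsize = 1024 * 1024;	/* large buffer: fewer syscalls */
	char	   *buf = malloc(bufsize);
	size_t		n;

	if (buf == NULL)
		return 1;
	while ((n = fread(buf, 1, bufsize, stdin)) > 0)
	{
		if (fwrite(buf, 1, n, stdout) != n)
			return 1;
	}
	free(buf);
	return 0;
}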

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: fixing old_snapshot_threshold's time->xid mapping

2020-04-19 Thread Dilip Kumar
On Thu, Apr 16, 2020 at 10:12 PM Robert Haas  wrote:
>
> Hi,
>
> I'm starting a new thread for this, because the recent discussion of
> problems with old_snapshot_threshold[1] touched on a lot of separate
> issues, and I think it will be too confusing if we discuss all of them
> on one thread. Attached are three patches.
>
> 0001 makes oldSnapshotControl "extern" rather than "static" and
> exposes the struct definition via a header.
>
> 0002 adds a contrib module called old_snapshot which makes it possible
> to examine the time->XID mapping via SQL. As Andres said, the comments
> are not really adequate in the existing code, and the code itself is
> buggy, so it was a little hard to be sure that I was understanding the
> intended meaning of the different fields correctly. However, I gave it
> a shot.
>
> 0003 attempts to fix bugs in MaintainOldSnapshotTimeMapping() so that
> it produces a sensible mapping. I encountered and tried to fix two
> issues here:
>
> First, as previously discussed, the branch that advances the mapping
> should not categorically do "oldSnapshotControl->head_timestamp = ts;"
> assuming that the head_timestamp is supposed to be the timestamp for
> the oldest bucket rather than the newest one. Rather, there are three
> cases: (1) resetting the mapping resets head_timestamp, (2) extending
> the mapping by an entry without dropping an entry leaves
> head_timestamp alone, and (3) overwriting the previous head with a new
> entry advances head_timestamp by 1 minute.
>
> Second, the calculation of the number of entries by which the mapping
> should advance is incorrect. It thinks that it should advance by the
> number of minutes between the current head_timestamp and the incoming
> timestamp. That would be correct if head_timestamp were the most
> recent entry in the mapping, but it's actually the oldest. As a
> result, without this fix, every time we move into a new minute, we
> advance the mapping much further than we actually should. Instead of
> advancing by 1, we advance by the number of entries that already exist
> in the mapping - which means we now have entries that correspond to
> times which are in the future, and don't advance the mapping again
> until those future timestamps are in the past.
>
> With these fixes, I seem to get reasonably sensible mappings, at least
> in light testing.  I tried running this in one window with \watch 10:
>
> select *, age(newest_xmin), clock_timestamp()  from
> pg_old_snapshot_time_mapping();
>
> And in another window I ran:
>
> pgbench -T 300 -R 10
>
> And the age does in fact advance by ~600 transactions per minute.
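
As a side note, the three cases listed above can be modelled with a toy
ring buffer (an illustration only, not patch 0003; the size and types
are assumptions): a reset fixes head_timestamp, extending the map
leaves it alone, and overwriting the old head advances it by one
minute.

#include <stdio.h>

#define MAP_ENTRIES 10			/* assumption; the real map is larger */

typedef struct
{
	long		head_timestamp;	/* timestamp of the OLDEST bucket */
	int			head_offset;
	int			count_used;
} TimeMap;

static void
advance_one_minute(TimeMap *map)
{
	if (map->count_used < MAP_ENTRIES)
		map->count_used++;		/* case (2): extend, head stays put */
	else
	{
		map->head_offset = (map->head_offset + 1) % MAP_ENTRIES;
		map->head_timestamp += 60;	/* case (3): overwrite the old head */
	}
}

int
main(void)
{
	TimeMap		map = {0, 0, 1};	/* case (1): reset sets head_timestamp */

	for (int minute = 0; minute < 15; minute++)
		advance_one_minute(&map);
	printf("head_timestamp=%ld count_used=%d\n",
		   map.head_timestamp, map.count_used);
	return 0;
}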

I have started reviewing these patches.  I think, the fixes looks right to me.

+ LWLockAcquire(OldSnapshotTimeMapLock, LW_SHARED);
+ mapping->head_offset = oldSnapshotControl->head_offset;
+ mapping->head_timestamp = oldSnapshotControl->head_timestamp;
+ mapping->count_used = oldSnapshotControl->count_used;
+ for (int i = 0; i < OLD_SNAPSHOT_TIME_MAP_ENTRIES; ++i)
+ mapping->xid_by_minute[i] = oldSnapshotControl->xid_by_minute[i];
+ LWLockRelease(OldSnapshotTimeMapLock);

I think memcpy would be a better choice than looping over all
the entries, since we are doing this under a lock?
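
For what it's worth, a compilable sketch of that suggestion (the struct
here is a stand-in with assumed names and size, not the real
oldSnapshotControl): the per-entry loop becomes a single memcpy while
the caller holds the lock.

#include <string.h>

#define OLD_SNAPSHOT_TIME_MAP_ENTRIES 100	/* assumed size, for illustration */

typedef unsigned int TransactionId;

typedef struct
{
	TransactionId xid_by_minute[OLD_SNAPSHOT_TIME_MAP_ENTRIES];
} TimeMapping;

/* Caller is assumed to hold OldSnapshotTimeMapLock in shared mode. */
static void
copy_xid_map(TimeMapping *dst, const TimeMapping *src)
{
	memcpy(dst->xid_by_minute, src->xid_by_minute,
		   sizeof(src->xid_by_minute));
}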

-- 
Regards,
Dilip Kumar
EnterpriseDB: http://www.enterprisedb.com




Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 10:00 PM, David Rowley wrote:

> On Mon, 20 Apr 2020 at 11:24, Ranier Vilela  wrote:
> > I tried: https://godbolt.org with:
> >
> > -O2:
> >
> > f1:
> > int main (int argv, char **argc)
> > {
> > return strlen(argc[0]) == 0;
> > }
> >
> > f1: Assembly
> > main:   # @main
> > mov rcx, qword ptr [rsi]
> > xor eax, eax
> > cmp byte ptr [rcx], 0
> sete al
> > ret
> >
> > f2:
> > int main (int argv, char **argc)
> > {
> > return argc[0] == '\0';
> > }
> >
> > f2: Assembly
> >
> > main:   # @main
> > xor eax, eax
> > cmp qword ptr [rsi], 0
> sete al
> > ret
> >
> For me clearly str[0] == '\0' wins.
>
> I think you'd want to use argc[0][0] == '\0' or *argc[0] == '\0'.
> Otherwise you appear just to be checking if the first element in the
> argc pointer array is set to NULL, which is certainly not the same as
> an empty string.
>
I guess you're right.

x86-64 clang (trunk) -O2
f1:
int cmp(const char * name)
{
return strlen(name) == 0;
}

cmp:# @cmp
xor eax, eax
cmp byte ptr [rdi], 0
sete al
ret

f2:
int cmp(const char * name)
{
return name[0] == '\0';
}

cmp:# @cmp
xor eax, eax
cmp byte ptr [rdi], 0
sete al
ret

Is the same result in assembly.
Well, it doesn't matter to me; I will continue to use str[0] == '\0'.

Thanks for taking part.

regards,
Ranier Vilela


Re: where should I stick that backup?

2020-04-19 Thread Amit Kapila
On Sat, Apr 18, 2020 at 8:35 PM Robert Haas  wrote:
>
> On Fri, Apr 17, 2020 at 7:44 PM Andres Freund  wrote:
> > This suggest that pipes do have a considerably higher overhead on
> > windows, but that it's not all that terrible if one takes care to use
> > large buffers in each pipe element.
> >
> > It's notable though that even the simplest use of a pipe does add a
> > considerable overhead compared to using the files directly.
>
> Thanks for these results. I think that this shows that it's probably
> not a great idea to force everything to go through pipes in every
> case, but on the other hand, there's no reason to be a particularly
> scared of the performance implications of letting some things go
> through pipes. For instance, if we decide that LZ4 compression is
> going to be a good choice for most users, we might want to do that
> in-process rather than via pipes.
>

How will the user know how to use this compressed backup?  I mean to
say, if we use some compression algorithm to compress the data, then the
user should know how to decompress it and use the backup.  IIUC,
currently, if the user uses the tar format for a backup, they can simply
untar it and start the server, but will that be possible if we provide
some in-built compression methods like LZ4?

-- 
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com




Re: v13: Performance regression related to FORTIFY_SOURCE

2020-04-19 Thread Jeff Davis
On Sun, 2020-04-19 at 16:19 -0700, Peter Geoghegan wrote:
> Is it possible that the issue has something to do with what the
> compiler knows about the alignment of the tapes back when they were a
> flexible array vs. now, where it's a separate allocation? Perhaps I'm
> overreaching, but it occurs to me that MemSetAligned() is itself
> concerned about the alignment of data returned from palloc(). Could be
> a similar issue here, too.

The memcpy() is for the buffer, not the array of LogicalTapes, so I
don't really see how that would happen.

> Some guy on the internet says that microarchitectural issues can make
> __memcpy_avx_unaligned() a lot faster than the "rep movsq"
> instruction
> (which you mentioned was a factor on the other thread) in some cases
> [1]. This explanation sounds kind of plausible.
> 
> [1] https://news.ycombinator.com/item?id=12050579

That raises another consideration: perhaps this is not uniformly a
regression, but actually faster in some situations? If so, what
situations?

Regards,
Jeff Davis






Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread David Rowley
On Mon, 20 Apr 2020 at 11:24, Ranier Vilela  wrote:
> I tried: https://godbolt.org with:
>
> -O2:
>
> f1:
> int main (int argv, char **argc)
> {
> return strlen(argc[0]) == 0;
> }
>
> f1: Assembly
> main:   # @main
> mov rcx, qword ptr [rsi]
> xor eax, eax
> cmp byte ptr [rcx], 0
> sete al
> ret
>
> f2:
> int main (int argv, char **argc)
> {
> return argc[0] == '\0';
> }
>
> f2: Assembly
>
> main:   # @main
> xor eax, eax
> cmp qword ptr [rsi], 0
> sete al
> ret
>
> For me clearly str[0] == '\0' wins.

I think you'd want to use argc[0][0] == '\0' or *argc[0] == '\0'.
Otherwise you appear just to be checking if the first element in the
argc pointer array is set to NULL, which is certainly not the same as
an empty string.

David




Re: sqlsmith crash incremental sort

2020-04-19 Thread Tomas Vondra

On Sat, Apr 18, 2020 at 02:23:25PM -0400, James Coleman wrote:

On Thu, Apr 16, 2020 at 9:26 PM Tom Lane  wrote:


Tomas Vondra  writes:
> I think we have essentially three options:
> 1) assuming there's just a single group
> 2) assuming each row is a separate group
> 3) something in between
> If (1) and (2) are worst/best-case scenarios, maybe we should pick
> something in between. We have DEFAULT_NUM_DISTINCT (200) which
> essentially says "we don't know what the number of groups is" so maybe
> we should use that.

I wouldn't recommend picking either the best or worst cases.

Possibly DEFAULT_NUM_DISTINCT is a sane choice, though it's fair to
wonder if it's quite applicable to the case where we already know
we've grouped by some columns.


Do you think defining a new default, say,
DEFAULT_NUM_DISTINCT_PRESORTED is preferred then? And choose some
value like "1/2 of the normal DEFAULT_NUM_DISTINCT groups" or some
such?



If we had a better intuition about what a better value is, maybe. But I don't
think we have that at all, so I'd just use the existing one.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Adding missing object access hook invocations

2020-04-19 Thread Mark Dilger



> On Apr 19, 2020, at 3:55 PM, Michael Paquier  wrote:
> 
> On Thu, Mar 19, 2020 at 11:47:46AM -0700, Mark Dilger wrote:
>> On Mar 19, 2020, at 11:30 AM, Mark Dilger  
>> wrote:
>>> Will post v3 shortly.
> 
> Thanks for sending a new version of the patch and removing the bits
> about object drops.  Your additions to src/backend/ look fine to me,
> so I have no objections to commit it.  The only module we have in core
> that makes use of object_access_hook is sepgsql.  Adding support for
> it could be done in a separate commit for AMs, stats and user mappings
> but we would need a use-case for it.  One thing that I can see is that
> even if we test for ALTER put_your_object_type_here foo RENAME TO in
> the module and that your patch adds one InvokeObjectPostAlterHook()
> for ALTER RULE, we don't have support for rules in sepgsql (see
> sepgsql_object_access for OAT_POST_CREATE).  So that's fine.
> 
> Unfortunately, we are past feature freeze so this will have to wait
> until v14 opens for business to be merged, and I'll take care of it.
> Or would others prefer to not wait one extra year for those changes to
> be released?

I don't intend to make any special pleading for this to go in after feature 
freeze.  Let's see if others feel differently.

Thanks for the review!

—
Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company







Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 6:38 PM, Tom Lane  wrote:

> Tomas Vondra  writes:
> > On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:
> >> strlen is one of the low-hanging fruits that can be harvested.
>
> > Maybe there are places where this would help, but I don't see a reason
> > to just throw away all strlen calls and replace them with something
> > clearly less convenient and possibly more error-prone (I'd expect quite
> > a few off-by-one mistakes with this).
>
> I've heard it claimed that modern compilers will optimize
> strlen('literal') to a constant at compile time.  I'm not sure how
> much I believe that, but to the extent that it's true, replacing such
> calls would provide exactly no performance benefit.
>
Tom, I wouldn't put much faith in that; there are still a lot of weak
compilers out there (MSVC, for example).
Furthermore, optimization is a complicated business; often the adjacent
code does not allow for such optimizations.
When the programmer does it by hand, there is no doubt.


>
> I'm quite -1 on changing these to sizeof(), in any case, because
> (a) that opens up room for confusion about whether the trailing nul is
> included, and (b) it makes it very easy, when changing or copy/pasting
> code, to apply sizeof to something that's not a string literal, with
> disastrous results.
>
It may be true, but I have seen a lot of Postgres code where sizeof is
used extensively, even with a real chance of what you describe happening.
So that risk already exists.


>
> The cases where Ranier proposes to replace strlen(foo) == 0
> with a test on foo[0] do seem like wins, though.  Asking for
> the full string length to be computed is more computation than
> necessary, and it's less clear that the compiler could be
> expected to save you from that.  Anyway there's a coding style
> proposition that we should be doing this consistently, and
> certainly lots of places do do this without using strlen().
>
Yes, this is the idea.


>
> I can't get excited about the proposed changes to optimize away
> multiple calls of strlen() either, unless there's performance
> measurements to support them individually.  This again seems like
> something a compiler might do for you.
>
Again, the compiler will not always save us.
I have seen many fantastic solutions in Postgres, but I have also seen,
forgive me, a lot of code written without much care.
The idea is, little by little, to weed out careless code, whether newly
written or left over in Postgres.

You can see an example in that patch.
After a few calls, the function validates the argument and bails out if
it is bad, when that should be done before anything else.

diff --git a/src/backend/utils/misc/guc.c b/src/backend/utils/misc/guc.c
index 5bdc02fce2..a00cca2605 100644
--- a/src/backend/utils/misc/guc.c
+++ b/src/backend/utils/misc/guc.c
@@ -10699,10 +10699,15 @@ GUCArrayDelete(ArrayType *array, const char *name)
  struct config_generic *record;
  ArrayType  *newarray;
  int i;
+ int len;
  int index;

  Assert(name);

+ /* if array is currently null, then surely nothing to delete */
+ if (!array)
+ return NULL;
+
  /* test if the option is valid and we're allowed to set it */
  (void) validate_option_array_item(name, NULL, false);

@@ -10711,12 +10716,9 @@ GUCArrayDelete(ArrayType *array, const char *name)
  if (record)
  name = record->name;

- /* if array is currently null, then surely nothing to delete */
- if (!array)
- return NULL;
-
  newarray = NULL;
  index = 1;
+ len = strlen(name);

  for (i = 1; i <= ARR_DIMS(array)[0]; i++)
  {
@@ -10735,8 +10737,8 @@ GUCArrayDelete(ArrayType *array, const char *name)
  val = TextDatumGetCString(d);

  /* ignore entry if it's what we want to delete */
- if (strncmp(val, name, strlen(name)) == 0
- && val[strlen(name)] == '=')
+ if (strncmp(val, name, len) == 0
+ && val[len] == '=')
  continue;

  /* else add it to the output array */

regards,
Ranier Vilela


Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 7:00 PM, David Rowley wrote:

> On Mon, 20 Apr 2020 at 09:38, Tom Lane  wrote:
> > The cases where Ranier proposes to replace strlen(foo) == 0
> > with a test on foo[0] do seem like wins, though.  Asking for
> > the full string length to be computed is more computation than
> > necessary, and it's less clear that the compiler could be
> > expected to save you from that.  Anyway there's a coding style
> > proposition that we should be doing this consistently, and
> > certainly lots of places do do this without using strlen().
>
> Looking at https://godbolt.org/z/6XsjbA it seems like GCC is pretty
> good at getting rid of the strlen call even at -O0. It takes -O1 for
> clang to use it and -O2 for icc.
>
I tried: https://godbolt.org with:

-O2:

f1:
int main (int argv, char **argc)
{
return strlen(argc[0]) == 0;
}

f1: Assembly
main:   # @main
mov rcx, qword ptr [rsi]
xor eax, eax
cmp byte ptr [rcx], 0
sete al
ret

f2:
int main (int argv, char **argc)
{
return argc[0] == '\0';
}

f2: Assembly

main:   # @main
xor eax, eax
cmp qword ptr [rsi], 0
sete al
ret

For me clearly str[0] == '\0' wins.

regards,
Ranier Vilela


Re: v13: Performance regression related to FORTIFY_SOURCE

2020-04-19 Thread Peter Geoghegan
On Sun, Apr 19, 2020 at 3:07 PM Jeff Davis  wrote:
> 1. Is my analysis correct?
> 2. What is the scale of this problem? What about other platforms or
> compilers? Are there other cases in PostgreSQL that might suffer from
> the use of FORTIFY_SOURCE?
> 3. Even if this is the compiler's fault, should we still fix it?

The precedent set by MemSetAligned() is that sometimes we see the code
generated by very common standard library functions as a problem for
us to fix, or to paper over.

Is it possible that the issue has something to do with what the
compiler knows about the alignment of the tapes back when they were a
flexible array vs. now, where it's a separate allocation? Perhaps I'm
over reaching, but it occurs to me that MemSetAligned() is itself
concerned about the alignment of data returned from palloc(). Could be
a similar issue here, too.

Some guy on the internet says that microarchitectural issues can make
__memcpy_avx_unaligned() a lot faster than the "rep movsq" instruction
(which you mentioned was a factor on the other thread) in some cases
[1]. This explanation sounds kind of plausible.

[1] https://news.ycombinator.com/item?id=12050579
-- 
Peter Geoghegan




Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Michael Paquier
On Sun, Apr 19, 2020 at 03:59:50PM -0300, Ranier Vilela wrote:
> On Sun, Apr 19, 2020 at 12:34 PM, Juan José Santamaría Flecha <
> juanjo.santama...@gmail.com> wrote:
>> This line needs a break, other than that LGTM.
>
> Sure. Attached new patch with this revision.

Amit, are you planning to look at this patch?  I may be able to spend
a couple of hours on this thread this week and that's an area of the
code I have worked with in the past, though I am not sure.
--
Michael




Re: Parallel Append can break run-time partition pruning

2020-04-19 Thread David Rowley
On Fri, 17 Apr 2020 at 19:08, Amit Langote  wrote:
> While looking at this, I observed that the PartitionPruneInfo of the
> top-level Append (the one that later gets thrown out) contains bogus
> information:
>
>{PARTITIONPRUNEINFO
>:prune_infos ((
>   {PARTITIONEDRELPRUNEINFO
>   :rtindex 1
>   :present_parts (b 0)
>   :nparts 1
>   :subplan_map  0
>   :subpart_map  1
>
> One of these should be -1.
>
>   {PARTITIONEDRELPRUNEINFO
>   :rtindex 2
>   :present_parts (b)
>   :nparts 2
>   :subplan_map  -1 -1
>   :subpart_map  -1 -1
>
> subplan_map values are not correct, because subpaths list that would
> have been passed would not include paths of lower-level partitions as
> the flattening didn't occur.

It's not great that we're generating that, but as far as I can see,
it's not going to cause any misbehaviour.  It'll cause a small
slowdown in run-time pruning due to perhaps having to perform an
additional call to find_matching_subplans_recurse() during execution.
In this case, it'll never find any subnodes that match due to both
maps having all -1 element values.
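
To make that concrete, here is a toy model (an illustration only, not
the real execPartition.c types or logic) of why maps filled with -1 can
never match: -1 means "no subplan or subpartition here", so walking
such a map finds nothing and costs only the extra traversal.

#include <stdio.h>

/* Count entries that point at a real subplan; -1 entries are skipped. */
static int
count_matches(const int *subplan_map, int nparts)
{
	int			matches = 0;

	for (int i = 0; i < nparts; i++)
		if (subplan_map[i] >= 0)
			matches++;
	return matches;
}

int
main(void)
{
	int			bogus_map[2] = {-1, -1};	/* like the bogus prune info */

	printf("matches: %d\n", count_matches(bogus_map, 2));	/* prints 0 */
	return 0;
}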

Since f2343653f5, we're not using partitioned_rels for anything else,
so we should likely fix this so that we don't add the item to
partitioned_rels when we don't pull up the sub-Append.  I think we
should hold off on fixing that until we decide if any adjustments need
to be made to the sub-Append pullup code.

David




Re: Adding missing object access hook invocations

2020-04-19 Thread Michael Paquier
On Thu, Mar 19, 2020 at 11:47:46AM -0700, Mark Dilger wrote:
> On Mar 19, 2020, at 11:30 AM, Mark Dilger  
> wrote:
>>  Will post v3 shortly.

Thanks for sending a new version of the patch and removing the bits
about object drops.  Your additions to src/backend/ look fine to me,
so I have no objections to commit it.  The only module we have in core
that makes use of object_access_hook is sepgsql.  Adding support for
it could be done in a separate commit for AMs, stats and user mappings
but we would need a use-case for it.  One thing that I can see is that
even if we test for ALTER put_your_object_type_here foo RENAME TO in
the module and that your patch adds one InvokeObjectPostAlterHook()
for ALTER RULE, we don't have support for rules in sepgsql (see
sepgsql_object_access for OAT_POST_CREATE).  So that's fine.

Unfortunately, we are past feature freeze so this will have to wait
until v14 opens for business to be merged, and I'll take care of it.
Or would others prefer to not wait one extra year for those changes to
be released?

Please note that there is a commit fest entry, though you forgot to
add your name as author of the patch:
https://commitfest.postgresql.org/28/2513/
--
Michael




v13: Performance regression related to FORTIFY_SOURCE

2020-04-19 Thread Jeff Davis

There was a previous thread[1], but I think it needs some wider
discussion.

I brought up an issue where GCC in combination with FORTIFY_SOURCE[2]
causes a perf regression for logical tapes after introducing
LogicalTapeSetExtend()[3]. Unfortunately, FORTIFY_SOURCE is used by
default on ubuntu. I have not observed the problem with clang.

There is no reason why the change should trigger the regression, but it
does. The slowdown is due to GCC switching to an inlined version of
memcpy() for LogicalTapeWrite() at logtape.c:768. The change[3] seems
to have little if anything to do with that.

GCC's Object Size Checking[4] doc says:

  "There are built-in functions added for many common
   string operation functions, e.g., for memcpy 
   __builtin___memcpy_chk built-in is provided. This
   built-in has an additional last argument, which is
   the number of bytes remaining in the object the dest
   argument points to or (size_t) -1 if the size is not
   known. The built-in functions are optimized into the
   normal string functions like memcpy if the last
   argument is (size_t) -1 or if it is known at compile
   time that the destination object will not be
   overflowed..."

In other words, if GCC knows the size of the object it tries to either
verify at compile time that it will never overflow, or it inserts a
runtime check. But if it doesn't know the size of the object, there's
nothing it can do so it just uses memcpy() like normal.

Knowing the destination buffer size at compile time would be impossible
(before or after my change) because palloc() doesn't have the
alloc_size attribute[5] specified. Even if it is specified (which I
tried), and if the compiler was smart enough (which it's not), it could
still only come up with a maximum size because the offset changes at
runtime. Regardless, I tried printing out the results of:
__builtin_object_size (lt->buffer + lt->pos, 0)
and the result is always -1 (unknown).
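
Here is a tiny standalone demo of that behavior (my illustration, not
code from the thread; compile with GCC at -O2): the size is known where
the object is visible, but once the pointer crosses a non-inlined
function boundary __builtin_object_size() yields (size_t) -1, which is
exactly the case where the checked built-ins fall back to plain
memcpy().

#include <stdio.h>

static char buf[64];

__attribute__((noinline))
static size_t
probe(char *p)
{
	/* The pointee's size is not visible here. */
	return __builtin_object_size(p, 0);
}

int
main(void)
{
	/* Known at the definition site: prints 64. */
	printf("%zu\n", __builtin_object_size(buf, 0));
	/* Unknown across the call: prints (size_t) -1. */
	printf("%zu\n", probe(buf));
	return 0;
}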

I have attached a workaround patch which restores the performance, and
it's isolated to logtape.c, but it's ugly (and not a little bit).

The questions are:

1. Is my analysis correct?
2. What is the scale of this problem? What about other platforms or
compilers? Are there other cases in PostgreSQL that might suffer from
the use of FORTIFY_SOURCE?
3. Even if this is the compiler's fault, should we still fix it?
4. Does the attached fix have any dangers of regressing on other
compilers/platforms?
5. Does anyone have a suggestion for a better fix?

Regards,
Jeff Davis

[1] 
https://postgr.es/m/91ca648cfd1f99bf07981487a7d81a1ec926caad.ca...@j-davis.com
[2] 
https://fedoraproject.org/wiki/Security_Features?rd=Security/Features#Compile_Time_Buffer_Checks_.28FORTIFY_SOURCE.29
[3] 
https://git.postgresql.org/gitweb/?p=postgresql.git;a=commitdiff;h=24d85952
[4] https://gcc.gnu.org/onlinedocs/gcc/Object-Size-Checking.html
[5] 
https://gcc.gnu.org/onlinedocs/gcc/Common-Function-Attributes.html#index-alloc_005fsize-function-attribute
diff --git a/src/backend/utils/sort/logtape.c b/src/backend/utils/sort/logtape.c
index 48ae0de305b..48f32cbe86f 100644
--- a/src/backend/utils/sort/logtape.c
+++ b/src/backend/utils/sort/logtape.c
@@ -153,6 +153,19 @@ typedef struct LogicalTape
 	int			nbytes;			/* total # of valid bytes in buffer */
 } LogicalTape;
 
+/*
+ * It would be more natural to simply use an ordinary pointer in
+ * LogicalTapeSet, and allocate many elements for it. Unfortunately, some
+ * compilers (notably GCC with FORTIFY_SOURCE specified) generate significantly
+ * slower code when that's done, and using this dummy structure is a
+ * workaround for that.
+ */
+typedef struct LogicalTapeDummy
+{
+	int dummy;
+	LogicalTape tape_array[FLEXIBLE_ARRAY_MEMBER];
+} LogicalTapeDummy;
+
 /*
  * This data structure represents a set of related "logical tapes" sharing
  * space in a single underlying file.  (But that "file" may be multiple files
@@ -192,7 +205,7 @@ struct LogicalTapeSet
 
 	/* The array of logical tapes. */
 	int			nTapes;			/* # of logical tapes in set */
-	LogicalTape	   *tapes;	/* has nTapes nentries */
+	LogicalTapeDummy	   *tapes;	/* has nTapes nentries */
 };
 
 static void ltsWriteBlock(LogicalTapeSet *lts, long blocknum, void *buffer);
@@ -479,7 +492,7 @@ ltsConcatWorkerTapes(LogicalTapeSet *lts, TapeShare *shared,
 		BufFile*file;
 		int64		filesize;
 
-		lt = &lts->tapes[i];
+		lt = &lts->tapes->tape_array[i];
 
 		pg_itoa(i, filename);
 		file = BufFileOpenShared(fileset, filename);
@@ -616,10 +629,12 @@ LogicalTapeSetCreate(int ntapes, TapeShare *shared, SharedFileSet *fileset,
 	lts->freeBlocks = (long *) palloc(lts->freeBlocksLen * sizeof(long));
 	lts->nFreeBlocks = 0;
 	lts->nTapes = ntapes;
-	lts->tapes = (LogicalTape *) palloc(ntapes * sizeof(LogicalTape));
+	lts->tapes = (LogicalTapeDummy *) palloc(
+		offsetof(LogicalTapeDummy, tape_array) +
+		ntapes * sizeof(LogicalTape));
 
 	for (i = 0; i < ntapes; i++)
-		ltsInitTape(&lts->tapes[i]);
+		ltsInitTape(&lts->tapes->tape_array[i]);

Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread David Rowley
On Mon, 20 Apr 2020 at 09:38, Tom Lane  wrote:
> The cases where Ranier proposes to replace strlen(foo) == 0
> with a test on foo[0] do seem like wins, though.  Asking for
> the full string length to be computed is more computation than
> necessary, and it's less clear that the compiler could be
> expected to save you from that.  Anyway there's a coding style
> proposition that we should be doing this consistently, and
> certainly lots of places do do this without using strlen().

Looking at https://godbolt.org/z/6XsjbA it seems like GCC is pretty
good at getting rid of the strlen call even at -O0. It takes -O1 for
clang to use it and -O2 for icc.

David




Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Tom Lane
Tomas Vondra  writes:
> On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:
>> strlen is one of the low-hanging fruits that can be harvested.

> Maybe there are places where this would help, but I don't see a reason
> to just throw away all strlen calls and replace them with something
> clearly less convenient and possibly more error-prone (I'd expect quite
> a few off-by-one mistakes with this).

I've heard it claimed that modern compilers will optimize
strlen('literal') to a constant at compile time.  I'm not sure how
much I believe that, but to the extent that it's true, replacing such
calls would provide exactly no performance benefit.

I'm quite -1 on changing these to sizeof(), in any case, because
(a) that opens up room for confusion about whether the trailing nul is
included, and (b) it makes it very easy, when changing or copy/pasting
code, to apply sizeof to something that's not a string literal, with
disastrous results.

The cases where Ranier proposes to replace strlen(foo) == 0
with a test on foo[0] do seem like wins, though.  Asking for
the full string length to be computed is more computation than
necessary, and it's less clear that the compiler could be
expected to save you from that.  Anyway there's a coding style
proposition that we should be doing this consistently, and
certainly lots of places do do this without using strlen().

I can't get excited about the proposed changes to optimize away
multiple calls of strlen() either, unless there's performance
measurements to support them individually.  This again seems like
something a compiler might do for you.

regards, tom lane




Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Andreas Karlsson

On 4/19/20 10:29 PM, Ranier Vilela wrote:
On Sun, Apr 19, 2020 at 4:33 PM, Tomas Vondra
<tomas.von...@2ndquadrant.com> wrote:


On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:
 >Hi,
 >strlen is one of the low-hanging fruits that can be harvested.
 >What is your opinion?
 >

That assumes this actually affects/improves performance, without any
measurements proving that. Considering large number of the places you
modified are related to DDL (CreateComment, ChooseIndexColumnNames, ...)
or stuff that runs only once or infrequently (like the changes in
PostmasterMain or libpqrcv_get_senderinfo). Likewise, it seems entirely
pointless to worry about strlen() overhead e.g. in fsync_parent_path
which is probably dominated by I/O.

With code as interconnected as Postgres, it is difficult to say that a 
function that calls strlen repeatedly would not gain anything.
Regarding the functions, I was just being consistent, trying to remove 
all occurrences, even where there is very little gain.


At least gcc 9.3 optimizes "strlen(s) == 0" to "s[0] == '\0'", even at 
low optimization levels. I tried it out with https://godbolt.org/.


Maybe some of the other cases are performance improvements; I have not 
checked your patch in detail, but strlen() == 0 is easily handled by 
the compiler.


C code:

int f1(char *str) {
return strlen(str) == 0;
}

int f2(char *str) {
return str[0] == '\0';
}

Assembly generated with default flags:

f1:
pushq   %rbp
movq    %rsp, %rbp
movq    %rdi, -8(%rbp)
movq    -8(%rbp), %rax
movzbl  (%rax), %eax
testb   %al, %al
sete    %al
movzbl  %al, %eax
popq    %rbp
ret
f2:
pushq   %rbp
movq    %rsp, %rbp
movq    %rdi, -8(%rbp)
movq    -8(%rbp), %rax
movzbl  (%rax), %eax
testb   %al, %al
sete    %al
movzbl  %al, %eax
popq    %rbp
ret

Assembly generated with -O2.

f1:
xorl    %eax, %eax
cmpb    $0, (%rdi)
sete    %al
ret
f2:
xorl    %eax, %eax
cmpb    $0, (%rdi)
sete    %al
ret

Andreas




Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Tomas Vondra

On Sun, Apr 19, 2020 at 05:29:52PM -0300, Ranier Vilela wrote:

On Sun, Apr 19, 2020 at 4:33 PM, Tomas Vondra <
tomas.von...@2ndquadrant.com> wrote:


On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:
>Hi,
>strlen is one of the low-hanging fruits that can be harvested.
>What is your opinion?
>

That assumes this actually affects/improves performance, without any
measurements proving that. Considering large number of the places you
modified are related to DDL (CreateComment, ChooseIndexColumnNames, ...)
or stuff that runs only once or infrequently (like the changes in
PostmasterMain or libpqrcv_get_senderinfo). Likewise, it seems entirely
pointless to worry about strlen() overhead e.g. in fsync_parent_path
which is probably dominated by I/O.


With code as interconnected as Postgres, it is difficult to say that a
function that calls strlen repeatedly would not gain anything.
Regarding the functions, I was just being consistent, trying to remove all
occurrences, even where there is very little gain.



That very much depends on the function, I think. For most places modified
by this patch it's not that hard, I think. The DDL cases (comments and
indexes) seem pretty clear. Similarly for the command parsing, wal
receiver, lockfile creation, guc, exec.c, and so on.

Perhaps the only places worth changing might be xml.c and spell.c, but
I'm not convinced even these are worth it, really.





Maybe there are places where this would help, but I don't see a reason
to just throw away all strlen calls and replace them with something
clearly less convenient and possibly more error-prone (I'd expect quite
a few off-by-one mistakes with this).


Yes, it is always prone to errors, but for the most part these are safe
changes.
It passes all 199 tests, though of course it has not been tested in a real
production environment.
Perhaps the timing is not the best, at the end of the cycle, but, once done,
I believe it would be a good harvest.



I wasn't really worried about bugs in this patch, but rather in future
changes made to this code. Off-by-one errors are trivial to make.


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: DETACH PARTITION and FOR EACH ROW triggers on partitioned tables

2020-04-19 Thread Justin Pryzby
On Sun, Apr 19, 2020 at 03:13:29PM -0400, Alvaro Herrera wrote:
> On 2020-Apr-18, Justin Pryzby wrote:
> > I haven't heard a compelling argument for or against either way.
> > 
> > Maybe the worst behavior might be if, during ATTACH, we searched for a 
> > matching
> > trigger, and "merged" it (marked it inherited) if it matched.  That could be
> > bad if someone *wanted* two triggers, which seems unlikely, but to each 
> > their
> > own.
> 
> I think the simplicity argument trumps the other ones, so I agree to go
> with your v3 patch proposed downthread.
> 
> What happens if you detach the parent?  I mean, should the trigger
> removal recurse to children?

I think it should probably exactly undo what CloneRowTriggersToPartition does.
..and I guess you're trying to politely say that it didn't.  I tried to fix it
in v4 - please check if that's right.

> > It occured to me that we don't currently distinguish between a trigger on a
> > child table, and a trigger on a parent table which was recursively created 
> > on a
> > child.  That makes sense for indexes and constraints, since the objects 
> > persist
> > if the table is detached, so it doesn't matter how it was defined.
> > 
> > But if we remove trigger during DETACH, then it's *not* the same as a 
> > trigger
> > that was defined on the child, and I suggest that in v13 we should make that
> > visible.
> 
> Hmm, interesting point -- whether the trigger is partition or not is
> important because it affects what happens on detach.  I agree that we
> should make it visible.  Is the proposed single bit "PARTITION" good
> enough, or should we indicate what's the ancestor table that defines the
> partition?

Yea, it's an obvious thing to do.

One issue is that tgparentid is new, so showing the partition status of the
trigger when connected to a pre-13 server would be nontrivial: b9b408c48.

-- 
Justin
>From b45a047f37d215cac2e72661b92e0ade4730b1e1 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Fri, 3 Apr 2020 22:43:26 -0500
Subject: [PATCH v4] fix detaching tables with inherited row triggers

The old behavior is buggy, and the intended behavior was not previously
documented.  So define the behavior that the trigger is removed if the table is
detached.  It is an error if a table being attached already has a trigger of
the same name.  This differs from the behavior for indexes and constraints.

See also:
86f575948c773b0ec5b0f27066e37dd93a7f0a96

Discussion:
https://www.postgresql.org/message-id/flat/20200408152412.GZ2228%40telsasoft.com
---
 doc/src/sgml/ref/create_trigger.sgml   |  1 +
 src/backend/commands/tablecmds.c   | 42 
 src/bin/psql/describe.c| 14 ++--
 src/test/regress/expected/triggers.out | 45 ++
 src/test/regress/sql/triggers.sql  | 21 
 5 files changed, 121 insertions(+), 2 deletions(-)

diff --git a/doc/src/sgml/ref/create_trigger.sgml b/doc/src/sgml/ref/create_trigger.sgml
index bde3a63f90..303881fcfb 100644
--- a/doc/src/sgml/ref/create_trigger.sgml
+++ b/doc/src/sgml/ref/create_trigger.sgml
@@ -526,6 +526,7 @@ UPDATE OF column_name1 [, column_name2INSTEAD OF.
   
 
diff --git a/src/backend/commands/tablecmds.c b/src/backend/commands/tablecmds.c
index 037d457c3d..514c5ff844 100644
--- a/src/backend/commands/tablecmds.c
+++ b/src/backend/commands/tablecmds.c
@@ -16797,6 +16797,48 @@ ATExecDetachPartition(Relation rel, RangeVar *name)
 	}
 	table_close(classRel, RowExclusiveLock);
 
+	/*
+	 * Drop any ROW triggers inherited from partitioned table.
+	 * This undoes what CloneRowTriggersToPartition did.
+	 */
+	{
+		ScanKeyData skey;
+		SysScanDesc	scan;
+		HeapTuple	trigtup;
+		Relation	tgrel;
+
+		ScanKeyInit(&skey, Anum_pg_trigger_tgrelid, BTEqualStrategyNumber,
+					F_OIDEQ, ObjectIdGetDatum(RelationGetRelid(partRel)));
+		tgrel = table_open(TriggerRelationId, RowExclusiveLock);
+		scan = systable_beginscan(tgrel, TriggerRelidNameIndexId,
+								  true, NULL, 1, &skey);
+
+		while (HeapTupleIsValid(trigtup = systable_getnext(scan)))
+		{
+			Form_pg_trigger pg_trigger = (Form_pg_trigger) GETSTRUCT(trigtup);
+			ObjectAddress object;
+
+			if (!OidIsValid(pg_trigger->tgparentid) ||
+				!pg_trigger->tgisinternal ||
+				!TRIGGER_FOR_ROW(pg_trigger->tgtype))
+				continue;
+
+			deleteDependencyRecordsFor(TriggerRelationId,
+									   pg_trigger->oid,
+									   false);
+			deleteDependencyRecordsFor(RelationRelationId,
+									   pg_trigger->oid,
+									   false);
+
+			CommandCounterIncrement();
+			ObjectAddressSet(object, TriggerRelationId, pg_trigger->oid);
+			performDeletion(&object, DROP_RESTRICT, PERFORM_DELETION_INTERNAL);
+		}
+
+		systable_endscan(scan);
+		table_close(tgrel, RowExclusiveLock);
+	}
+
 	/*
 	 * Detach any foreign keys that are inherited.  This includes creating
 	 * additional action triggers.
diff --git a/src/bin/psql/describe.c b/src/bin/psql/describe.c
index f05e914b4d..1de33ac772 100644
--- a/src/bin/psql/describe.c
+++ b/src/bin/psql/describe.c
@@ -2939,14 +2939,18 

Re: DETACH PARTITION and FOR EACH ROW triggers on partitioned tables

2020-04-19 Thread Alvaro Herrera
On 2020-Apr-19, Justin Pryzby wrote:

> It's probably rare that we'd be inserting into a table old enough to be
> detached, and normally that would be ok, but if a trigger were missing, it
> would misbehave.  In our use-case, we're creating triggers on the parent as a
> convenient way to maintain them on the partitions, which doesn't work if a
> table exists but is detached.
> 
> So we'd actually prefer the behavior of indexes/constraints, where the trigger
> is preserved if the child is detached.  I'm not requesting to do that just for
> our use case, which may be atypical or not a good model, but adding our one
> data point.

I think the easiest way to implement this is to have two triggers -- the
one that's direct in the partition checks whether the table is a
partition and does nothing in that case.

-- 
Álvaro Herrerahttps://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 4:33 PM Tomas Vondra <
tomas.von...@2ndquadrant.com> wrote:

> On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:
> >Hi,
> >strlen is one of the low-hanging fruits that can be harvested.
> >What is your opinion?
> >
>
> That assumes this actually affects/improves performance, without any
> measurements proving that.  Consider that a large number of the places you
> modified are related to DDL (CreateComment, ChooseIndexColumnNames, ...)
> or to code that runs only once or infrequently (like the changes in
> PostmasterMain or libpqrcv_get_senderinfo).  Likewise, it seems entirely
> pointless to worry about strlen() overhead e.g. in fsync_parent_path,
> which is probably dominated by I/O.
>
With code as interconnected as Postgres, it is difficult to say that a
function that calls strlen repeatedly would see no gain.  Regarding the
functions, I was just being consistent, trying to remove all occurrences,
even where there is very little gain.


>
> Maybe there are places where this would help, but I don't see a reason
> to just throw away all strlen calls and replace them with something
> clearly less convenient and possibly more error-prone (I'd expect quite
> a few off-by-one mistakes with this).
>
Yes, it is always prone to errors, but for the most part these are safe
changes.  The patch passes all 199 tests, though of course it has not been
tested in a real production environment.  Perhaps the timing is not the
best, at the end of the cycle, but once done, I believe it would be a good
harvest.

Thank you for the comment.

regards,
Ranier Vilela


Re: HEAPDEBUGALL is broken

2020-04-19 Thread Alexander Lakhin
Hello hackers,
19.04.2020 13:37, Tom Lane wrote:
>
> Peter Eisentraut  writes:
>> The HEAPDEBUGALL define has been broken since PG12 due to tableam
>> changes.  Should we just remove this?  It doesn't look very useful.
>> It's been around since Postgres95.
>> If we opt for removing: PG12 added an analogous HEAPAMSLOTDEBUGALL
>> (which still compiles correctly).  Would we want to keep that?
>
> +1 for removing both.  There are a lot of such debug "features"
> in the code, and few of them are worth anything IME.
On that note, I've tried to use HAVE_ALLOCINFO on master today and it
failed too:
$ CPPFLAGS="-DHAVE_ALLOCINFO" ./configure --enable-tap-tests
--enable-debug --enable-cassert  >/dev/null && make -j16 >/dev/null
generation.c: In function ‘GenerationAlloc’:
generation.c:191:11: error: ‘GenerationContext {aka struct
GenerationContext}’ has no member named ‘name’
 (_cxt)->name, (_chunk), (_chunk)->size)
   ^
generation.c:386:3: note: in expansion of macro ‘GenerationAllocInfo’
   GenerationAllocInfo(set, chunk);
   ^~~
generation.c:191:11: error: ‘GenerationContext {aka struct
GenerationContext}’ has no member named ‘name’
 (_cxt)->name, (_chunk), (_chunk)->size)
   ^
generation.c:463:2: note: in expansion of macro ‘GenerationAllocInfo’
  GenerationAllocInfo(set, chunk);
  ^~~

Best regards,
Alexander


Re: [PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Tomas Vondra

On Sun, Apr 19, 2020 at 11:24:38AM -0300, Ranier Vilela wrote:

Hi,
strlen is one of the low-hanging fruits that can be harvested.
What is your opinion?



That assumes this actually affects/improves performance, without any
measurements proving that.  Consider that a large number of the places you
modified are related to DDL (CreateComment, ChooseIndexColumnNames, ...)
or to code that runs only once or infrequently (like the changes in
PostmasterMain or libpqrcv_get_senderinfo).  Likewise, it seems entirely
pointless to worry about strlen() overhead e.g. in fsync_parent_path,
which is probably dominated by I/O.

Maybe there are places where this would help, but I don't see a reason
to just throw away all strlen calls and replace them with something
clearly less convenient and possibly more error-prone (I'd expect quite
a few off-by-one mistakes with this).
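
To make the trade-off concrete, here is a hypothetical sketch (my own
illustration, not code from the patch) of the kind of rewrite being
proposed, with the place an off-by-one mistake tends to creep in marked:

#include <stdio.h>
#include <string.h>

/*
 * Hypothetical example of the proposed style: call strlen() once and
 * reuse the result instead of recomputing it for every check and copy.
 */
static void
copy_truncated(char *dst, size_t dstsize, const char *src)
{
	size_t		len = strlen(src);	/* single call, result reused below */

	if (len >= dstsize)			/* ">=" vs ">" is where off-by-one bugs hide */
		len = dstsize - 1;
	memcpy(dst, src, len);
	dst[len] = '\0';
}

int
main(void)
{
	char		buf[8];

	copy_truncated(buf, sizeof(buf), "PostgreSQL");
	printf("%s\n", buf);		/* prints "Postgre" */
	return 0;
}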


regards

--
Tomas Vondra  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: DETACH PARTITION and FOR EACH ROW triggers on partitioned tables

2020-04-19 Thread Justin Pryzby
On Wed, Apr 08, 2020 at 11:44:33AM -0500, Justin Pryzby wrote:
> On Wed, Apr 08, 2020 at 12:02:39PM -0400, Alvaro Herrera wrote:
> > On 2020-Apr-08, Justin Pryzby wrote:
> > 
> > > This seems to be a bug in master, v12, and (probably) v11, where "FOR
> > > EACH ROW"
> > > was first allowed on partition tables (86f575948).
> > > 
> > > I thought this would work like partitioned indexes (8b08f7d48), where 
> > > detaching
> > > a partition makes its index non-inherited, and attaching a partition 
> > > marks a
> > > pre-existing, matching partition as inherited rather than creating a new 
> > > one.
> > 
> > Hmm.  Let's agree to what behavior we want, and then we implement that.
> > It seems to me there are two choices:
> > 
> > 1. on detach, keep the trigger but make it independent of the trigger on
> > parent.  (This requires that the trigger is made dependent on the
> > trigger on parent, if the table is attached as partition again;
> > otherwise you'd end up with multiple copies of the trigger if you
> > detach/attach multiple times).
> > 
> > 2. on detach, remove the trigger from the partition.
> > 
> > I think (2) is easier to implement, but (1) is the more convenient
> > behavior.
> 
> At telsasoft, we don't care (we uninherit tables before ALTERing parents to
> avoid disruptive locking and to avoid worst-case disk use).

I realized that I was wrong about what would be most desirable for us, for an
uncommon case:

Our loader INSERTs into the child table, not the parent (I think I did that to
try to implement UPSERT before partitioned indexes in v11).

All but the newest partitions are DETACHed when we need to promote a column.

It's probably rare that we'd be inserting into a table old enough to be
detached, and normally that would be ok, but if a trigger were missing, it
would misbehave.  In our use case, we're creating triggers on the parent as a
convenient way to maintain them on the partitions, which doesn't work if a
table exists but is detached.

So we'd actually prefer the behavior of indexes/constraints, where the trigger
is preserved if the child is detached.  I'm not requesting to do that just for
our use case, which may be atypical or not a good model, but adding our one
data point.

-- 
Justin




Re: DETACH PARTITION and FOR EACH ROW triggers on partitioned tables

2020-04-19 Thread Alvaro Herrera
On 2020-Apr-18, Justin Pryzby wrote:

> I haven't heard a compelling argument for or against either way.
> 
> Maybe the worst behavior might be if, during ATTACH, we searched for a 
> matching
> trigger, and "merged" it (marked it inherited) if it matched.  That could be
> bad if someone *wanted* two triggers, which seems unlikely, but to each their
> own.

I think the simplicity argument trumps the other ones, so I agree to go
with your v3 patch proposed downthread.

What happens if you detach the parent?  I mean, should the trigger
removal recurse to children?

> It occured to me that we don't currently distinguish between a trigger on a
> child table, and a trigger on a parent table which was recursively created on 
> a
> child.  That makes sense for indexes and constraints, since the objects 
> persist
> if the table is detached, so it doesn't matter how it was defined.
> 
> But if we remove trigger during DETACH, then it's *not* the same as a trigger
> that was defined on the child, and I suggest that in v13 we should make that
> visible.

Hmm, interesting point -- whether the trigger is a partition trigger or not
is important because it affects what happens on detach.  I agree that we
should make it visible.  Is the proposed single bit "PARTITION" good
enough, or should we indicate which ancestor table defines the
partition?

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 12:34 PM Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

>
> On Sun, Apr 19, 2020 at 1:58 PM Ranier Vilela  wrote:
>
>> On Sun, Apr 19, 2020 at 7:16 AM Juan José Santamaría Flecha <
>> juanjo.santama...@gmail.com> wrote:
>>
>>> On Sat, Apr 18, 2020 at 1:43 PM Ranier Vilela 
>>> wrote:
>>>
 I have some observations about this patch, related to style, if you
 will allow me.

>>> Please find attached a revised version.
>>>
>> Looks good to me, but, sorry, I think I missed a glitch in the previous
>> review.
>> If _create_locale fails, there is no need to call _free_locale(loct).
>>
>> Another point: what is the difference between pg_mbstrlen and wcslen?
>> Would it not be better to use only wcslen?
>>
>
> pg_mbstrlen() is for multibyte strings and wcslen() is for wide-character
> strings; the "pg" equivalent would be pg_wchar_strlen().
>
> Attached is the patch with these comments.
>>
>
> + } else
>
> This line needs a break, other than that LGTM.
>
Sure. Attached new patch with this revision.

regards,
Ranier Vilela


0001-PG-compilation-error-with-VS-2015-2017-2019_v09.patch
Description: Binary data


proposal - psql - use pager for \watch command

2020-04-19 Thread Pavel Stehule
Hi,

last week I finished pspg 3.0 https://github.com/okbob/pspg . pspg now
supports pipes, including named pipes, very well.  Today pspg can be used
as a pager for the output of the \watch command.  Of course, psql needs
the attached patch.

I propose a new psql environment variable, PSQL_WATCH_PAGER.  When this
variable is not empty, the \watch command starts the specified pager and
redirects output to the associated pipe.  When the pipe is closed by the
pager, the \watch cycle exits.
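
The mechanism is easy to see in isolation.  Here is a minimal standalone
sketch (my own illustration, assuming a POSIX system; "head -n 3" stands
in for a real pager such as pspg --stream):

#include <signal.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Write periodic output to a pager started with popen(), and leave the
 * loop as soon as the pager closes its end of the pipe.
 */
static volatile sig_atomic_t sigpipe_received = 0;

static void
handle_sigpipe(int signum)
{
	sigpipe_received = 1;
}

int
main(void)
{
	FILE	   *pager = popen("head -n 3", "w");
	int			i;

	if (!pager)
		return 1;
	signal(SIGPIPE, handle_sigpipe);

	for (i = 0; i < 100 && !sigpipe_received; i++)
	{
		fprintf(pager, "result row %d\n", i);
		if (fflush(pager) != 0 || ferror(pager))
			break;				/* pager went away: leave the cycle */
		sleep(1);
	}
	pclose(pager);
	return 0;
}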

If you want to test the proposed feature, you need a pspg build that
includes commit cb4114f98318344d162a84b895a3b7f8badec241.

Then you can set your env

export PSQL_WATCH_PAGER="pspg --stream"
psql

SELECT * FROM pg_stat_database;
\watch 1

Comments, notes?

Regards

Pavel
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index a5160f91de..b78c0b3100 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -171,6 +171,7 @@ static char *pset_value_string(const char *param, printQueryOpt *popt);
 static void checkWin32Codepage(void);
 #endif
 
+static bool sigpipe_received = false;
 
 
 /*--
@@ -4637,6 +4638,17 @@ do_shell(const char *command)
 	return true;
 }
 
+/*
+ * We want to detect SIGPIPE so that we can break out of the watch cycle
+ * promptly, rather than sleeping out the full interval and then failing
+ * to write to the pipe.
+ */
+static void
+pagerpipe_sigpipe_handler(int signum)
+{
+	sigpipe_received = true;
+}
+
 /*
  * do_watch -- handler for \watch
  *
@@ -4651,6 +4663,8 @@ do_watch(PQExpBuffer query_buf, double sleep)
 	const char *strftime_fmt;
 	const char *user_title;
 	char	   *title;
+	const char *pagerprog = NULL;
+	FILE	   *pagerpipe = NULL;
 	int			title_len;
 	int			res = 0;
 
@@ -4660,6 +4674,21 @@ do_watch(PQExpBuffer query_buf, double sleep)
 		return false;
 	}
 
+	pagerprog = getenv("PSQL_WATCH_PAGER");
+	if (pagerprog)
+	{
+		sigpipe_received = false;
+
+#ifndef WIN32
+		pqsignal(SIGPIPE, pagerpipe_sigpipe_handler);
+#endif
+
+		pagerpipe = popen(pagerprog, "w");
+		if (!pagerpipe)
+			/* silently proceed without pager */
+			restore_sigpipe_trap();
+	}
+
 	/*
 	 * Choose format for timestamps.  We might eventually make this a \pset
 	 * option.  In the meantime, using a variable for the format suppresses
@@ -4671,7 +4700,8 @@ do_watch(PQExpBuffer query_buf, double sleep)
 	 * Set up rendering options, in particular, disable the pager, because
 	 * nobody wants to be prompted while watching the output of 'watch'.
 	 */
-	myopt.topt.pager = 0;
+	if (!pagerpipe)
+		myopt.topt.pager = 0;
 
 	/*
 	 * If there's a title in the user configuration, make sure we have room
@@ -4705,7 +4735,7 @@ do_watch(PQExpBuffer query_buf, double sleep)
 		myopt.title = title;
 
 		/* Run the query and print out the results */
-		res = PSQLexecWatch(query_buf->data, &myopt);
+		res = PSQLexecWatch(query_buf->data, &myopt, pagerpipe);
 
 		/*
 		 * PSQLexecWatch handles the case where we can no longer repeat the
@@ -4714,6 +4744,9 @@ do_watch(PQExpBuffer query_buf, double sleep)
 		if (res <= 0)
 			break;
 
+		if (pagerpipe && ferror(pagerpipe))
+			break;
+
 		/*
 		 * Set up cancellation of 'watch' via SIGINT.  We redo this each time
 		 * through the loop since it's conceivable something inside
@@ -4731,14 +4764,27 @@ do_watch(PQExpBuffer query_buf, double sleep)
 		i = sleep_ms;
 		while (i > 0)
 		{
-			long		s = Min(i, 1000L);
+			long		s = Min(i, pagerpipe ? 100L : 1000L);
 
 			pg_usleep(s * 1000L);
 			if (cancel_pressed)
 				break;
+
+			if (sigpipe_received)
+				break;
+
 			i -= s;
 		}
 		sigint_interrupt_enabled = false;
+
+		if (sigpipe_received)
+			break;
+	}
+
+	if (pagerpipe)
+	{
+		fclose(pagerpipe);
+		restore_sigpipe_trap();
 	}
 
 	pg_free(title);
diff --git a/src/bin/psql/common.c b/src/bin/psql/common.c
index 621a33f7e8..64a58eacce 100644
--- a/src/bin/psql/common.c
+++ b/src/bin/psql/common.c
@@ -588,12 +588,13 @@ PSQLexec(const char *query)
  * e.g., because of the interrupt, -1 on error.
  */
 int
-PSQLexecWatch(const char *query, const printQueryOpt *opt)
+PSQLexecWatch(const char *query, const printQueryOpt *opt, FILE *printQueryFout)
 {
 	PGresult   *res;
 	double		elapsed_msec = 0;
 	instr_time	before;
 	instr_time	after;
+	FILE	   *fout;
 
 	if (!pset.db)
 	{
@@ -634,14 +635,16 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt)
 		return 0;
 	}
 
+	fout = printQueryFout ? printQueryFout : pset.queryFout;
+
 	switch (PQresultStatus(res))
 	{
 		case PGRES_TUPLES_OK:
-			printQuery(res, opt, pset.queryFout, false, pset.logfile);
+			printQuery(res, opt, fout, false, pset.logfile);
 			break;
 
 		case PGRES_COMMAND_OK:
-			fprintf(pset.queryFout, "%s\n%s\n\n", opt->title, PQcmdStatus(res));
+			fprintf(fout, "%s\n%s\n\n", opt->title, PQcmdStatus(res));
 			break;
 
 		case PGRES_EMPTY_QUERY:
@@ -664,7 +667,7 @@ PSQLexecWatch(const char *query, const printQueryOpt *opt)
 
 	PQclear(res);
 
-	fflush(pset.queryFout);
+	fflush(fout);
 
 	/* Possible microtiming output */
 	if (pset.timing)
diff --git a/src/bin/psql/common.h b/src/bin/psql/common.h
index 

Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Tom Lane
Isaac Morland  writes:
> If we were only concerned with HTML output then based on the desired
> semantics and appearance I would recommend <dl> without hesitation. Because
> of the need to produce PDF as well and my lack of knowledge of the Postgres
> documentation build process, I can't be so certain but I still suspect <dl>
> to be the best approach.

Yeah ... so a part of this problem is to persuade DocBook to generate
that.

As I mentioned upthread, I did experiment with putting a single-item
<variablelist> in each table cell.  That works out to an annoying amount
of markup overhead, since variablelist is a rather overengineered
construct, but I imagine we could live with it.  The real problem was
the amount of whitespace it wanted to add.  We could probably hack our
way out of that with CSS for HTML output, but it was quite unclear whether
the PDF toolchain could be made to render it reasonably.

A desirable solution, perhaps, would be a <variablelist> corresponding to
the entire table with rendering customization that produces table-like
dividing lines around <varlistentry>s.  I'm not volunteering to figure
out how to do that though, especially not for PDF.

In the meantime I plan to push forward with the markup approach we've
got.  The editorial content should still work if we find a better
markup answer, and I'm willing to do the work of replacing the markup
as long as somebody else figures out what it should be.

regards, tom lane




Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Isaac Morland
On Sun, 19 Apr 2020 at 09:23, Tom Lane  wrote:


> In any case, I reject the idea that we should just drop the table
> markup altogether and use inline variablelists.  In most of these
> sections there is a very clear separation between the table contents
> (with per-function or per-operator details) and the surrounding
> commentary, which deals with more general concerns.  That's a useful
> separation for both readers and authors, so we need to preserve it
> in some form, but the standard rendering of variablelists won't.
> (Our existing major use of variablelists, in the GUC chapter, works
> around this basically by not having any "surrounding commentary"
> ... but that solution doesn't work here.)
>
> There is also value in being able to say things like "see Table m.n
> for the available operators for type foo".
>

The HTML definition list under discussion looks like this:

<dl>
 <dt>term 1</dt>
 <dd>description 1</dd>
 <dt>term 2</dt>
 <dd>description 2a</dd>
 <dd>description 2b</dd>
</dl>

So the enclosing <dl> element has the same role in the overall document as
the <table>, and could be styled to set it apart from the main text and
make it clear that it is a single unit (and at least in principle could be
included in the "table" numbering). In the function/operator listing use
case, there would be one <dd> for the description and a <dd> for each
example. See:

https://developer.mozilla.org/en-US/docs/Web/HTML/Element/dl

If we were only concerned with HTML output then based on the desired
semantics and appearance I would recommend <dl> without hesitation. Because
of the need to produce PDF as well and my lack of knowledge of the Postgres
documentation build process, I can't be so certain but I still suspect <dl>
to be the best approach.


Re: Incremental sorts and EXEC_FLAG_REWIND

2020-04-19 Thread James Coleman
On Wed, Apr 15, 2020 at 2:04 PM James Coleman  wrote:
>
> On Wed, Apr 15, 2020 at 11:02 AM James Coleman  wrote:
> >
> > On Tue, Apr 14, 2020 at 2:53 AM Michael Paquier  wrote:
> > >
> > > Hi,
> > >
> > > When initializing an incremental sort node, we have the following as
> > > of ExecInitIncrementalSort():
> > > /*
> > >  * Incremental sort can't be used with either EXEC_FLAG_REWIND,
> > > >  * EXEC_FLAG_BACKWARD or EXEC_FLAG_MARK, because we hold only one of
> > > > many sort
> > > >  * batches in the current sort state.
> > >  */
> > >  Assert((eflags & (EXEC_FLAG_BACKWARD |
> > >EXEC_FLAG_MARK)) == 0);
> > > While I don't quite follow why EXEC_FLAG_REWIND should be allowed here
> > > to begin with (because incremental sorts don't support rescans without
> > > parameter changes, right?), the comment and the assertion are telling
> > > a different story.
> >
> > I remember changing this assertion in response to an issue I'd found
> > which led to rewriting the rescan implementation, but I must have
> > missed updating the comment.
>
> All right, here are the most relevant messages:
>
> [1]: Here I'd said:
> --
> While working on finding a test case to show rescan isn't implemented
> properly yet, I came across a bug. At the top of
> ExecInitIncrementalSort, we assert that eflags does not contain
> EXEC_FLAG_REWIND. But the following query (with merge and hash joins
> disabled) breaks that assertion:
>
> select * from t join (select * from t order by a, b) s on s.a = t.a
> where t.a in (1,2);
>
> The comments about this flag in src/include/executor/executor.h say:
>
> * REWIND indicates that the plan node should try to efficiently support
> * rescans without parameter changes. (Nodes must support ExecReScan calls
> * in any case, but if this flag was not given, they are at liberty to do it
> * through complete recalculation. Note that a parameter change forces a
> * full recalculation in any case.)
>
> Now we know that except in rare cases (as just discussed recently up
> thread) we can't implement rescan efficiently.
>
> So is this a planner bug (i.e., should we try not to generate
> incremental sort plans that require efficient rewind)? Or can we just
> remove that part of the assertion and know that we'll implement the
> rescan, albeit inefficiently? We already explicitly declare that we
> don't support backwards scanning, but I don't see a way to declare the
> same for rewind.
> --
>
> So it seems to me that we can't disallow REWIND, and we have to
> support rescan, but, we could try to mitigate the effects (without a
> param change) with a materialize node, as noted below.
>
> [2]: Here, in response to my questioning above if this was a planner
> bug, I'd said:
> --
> Other nodes seem to get a materialization node placed above them to
> support this case "better". Is that something we should be doing?
> --
>
> I never got any reply on this point; if we _did_ introduce a
> materialize node here, then it would mean we could start disallowing
> REWIND again. See the email for full details of a specific plan that I
> encountered that reproduced this.
>
> Thoughts?
>
> > In the meantime, your question is primarily about making sure the
> > code/comments/etc. are consistent and not a behavioral problem or
> > failure you've seen in testing?
>
> Still want to confirm this is the case.
>
> James
>
> [1]: 
> https://www.postgresql.org/message-id/CAAaqYe9%2Bap2SbU_E2WaC4F9ZMF4oa%3DpJZ1NBwaKDMP6GFUA77g%40mail.gmail.com
> [2]: 
> https://www.postgresql.org/message-id/CAAaqYe-sOp2o%3DL7nvGZDJ6GsL9%3Db_ztrGE1rhyi%2BF82p3my2bQ%40mail.gmail.com

Looking at this more, I think this is definitely suspect. The current
code shields lower nodes from EXEC_FLAG_BACKWARD and EXEC_FLAG_MARK --
the former is definitely fine because we declare that we don't support
backwards scans. The latter seems like the same reasoning would apply,
but unfortunately we didn't add it to ExecSupportsMarkRestore, so I've
attached a patch to do that.

The EXEC_FLAG_REWIND situation though I'm still not clear on -- given
the comments/docs seem to suggest it's a hint for efficiency rather
than something that has to work or be declared as not implemented, so
it seems like one of the following should be the outcome:

1. "Disallow" it by only generating materialize nodes above the
incremental sort node if REWIND will be required. I'm not sure if this
would mean that incremental sort just wouldn't be useful in that case?
2. Keep the existing implementation where we basically ignore REWIND
and use our more inefficient implementation. In this case, I believe
we need to stop shielding child nodes from REWIND, though, since we
aren't actually storing the full result set and will instead be
re-executing the child nodes.

I've attached a patch to take course (2), since it's the easiest to
implement. But I'd still like feedback on what we should do here,
because I don't feel like I 

Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Juan José Santamaría Flecha
On Sun, Apr 19, 2020 at 1:58 PM Ranier Vilela  wrote:

> On Sun, Apr 19, 2020 at 7:16 AM Juan José Santamaría Flecha <
> juanjo.santama...@gmail.com> wrote:
>
>> On Sat, Apr 18, 2020 at 1:43 PM Ranier Vilela 
>> wrote:
>>
>>> I have some observations about this patch, related to style, if you will
>>> allow me.
>>>
>> Please find attached a revised version.
>>
> Looks good to me, but, sorry, I think I missed a glitch in the previous
> review.
> If _create_locale fails, there is no need to call _free_locale(loct).
>
> Another point: what is the difference between pg_mbstrlen and wcslen?
> Would it not be better to use only wcslen?
>

pg_mbstrlen() is for multibyte strings and wcslen() is for wide-character
strings; the "pg" equivalent would be pg_wchar_strlen().

> Attached is the patch with these comments.
>

+ } else

This line needs a break, other than that LGTM.

Regards,

Juan José Santamaría Flecha


[PATCH] Small optimization across postgres (remove strlen duplicate usage)

2020-04-19 Thread Ranier Vilela
Hi,
strlen is one of the low-hanging fruits that can be harvested.
What is your opinion?

regards,
Ranier Vilela


remove_strlen.patch
Description: Binary data


Re: WAL usage calculation patch

2020-04-19 Thread Julien Rouhaud
Hi Justin,

Thanks for the review!

On Sat, Apr 18, 2020 at 10:41 PM Justin Pryzby  wrote:
>
> Should we capitalize at least the non-text one?  And maybe the text one for
> consistency?
>
> +   ExplainPropertyInteger("WAL fpw", NULL,

I think we should keep both versions consistent, whether lower or upper
case.  The uppercase version is probably more correct, but it's a little
bit weird to have it be the only uppercase label in all the output, so I
kept it lower case.

> And add the acronym to the docs:
>
> $ git grep 'full page' '*/explain.sgml'
> doc/src/sgml/ref/explain.sgml:  number of records, number of full page 
> writes and amount of WAL bytes
>
> "..full page writes (FPW).."

Indeed!  Fixed (using lowercase to match current output).

> Should we also change vacuumlazy.c for consistency ?
>
> +_("WAL usage: %ld 
> records, %ld full page writes, "
> +  UINT64_FORMAT " 
> bytes"),

I don't think this one should be changed; vacuumlazy's output is already
entirely different and way more verbose, so keeping it as is makes sense
to me.
diff --git a/doc/src/sgml/ref/explain.sgml b/doc/src/sgml/ref/explain.sgml
index aedd70a6ad..e5b50d3790 100644
--- a/doc/src/sgml/ref/explain.sgml
+++ b/doc/src/sgml/ref/explain.sgml
@@ -198,8 +198,8 @@ ROLLBACK;
 
  
   Include information on WAL record generation. Specifically, include the
-  number of records, number of full page writes and amount of WAL bytes
-  generated.  In text format, only non-zero values are printed.  This
+  number of records, number of full page writes (fpw) and amount of WAL
+  bytes generated.  In text format, only non-zero values are printed.  This
   parameter may only be used when ANALYZE is also
   enabled.  It defaults to FALSE.
  
diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 7ae6131676..9cc1b13b76 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -3350,13 +3350,13 @@ show_wal_usage(ExplainState *es, const WalUsage *usage)
 			appendStringInfoString(es->str, "WAL:");
 
 			if (usage->wal_records > 0)
-				appendStringInfo(es->str, "  records=%ld",
+				appendStringInfo(es->str, " records=%ld",
 								 usage->wal_records);
 			if (usage->wal_fpw > 0)
-				appendStringInfo(es->str, "  full page writes=%ld",
+				appendStringInfo(es->str, " fpw=%ld",
 								 usage->wal_fpw);
 			if (usage->wal_bytes > 0)
-				appendStringInfo(es->str, "  bytes=" UINT64_FORMAT,
+				appendStringInfo(es->str, " bytes=" UINT64_FORMAT,
 								 usage->wal_bytes);
 			appendStringInfoChar(es->str, '\n');
 		}
@@ -3365,7 +3365,7 @@ show_wal_usage(ExplainState *es, const WalUsage *usage)
 	{
 		ExplainPropertyInteger("WAL records", NULL,
 			   usage->wal_records, es);
-		ExplainPropertyInteger("WAL full page writes", NULL,
+		ExplainPropertyInteger("WAL fpw", NULL,
 			   usage->wal_fpw, es);
 		ExplainPropertyUInteger("WAL bytes", NULL,
 usage->wal_bytes, es);


Re: [PATCH] Incremental sort (was: PoC: Partial sort)

2020-04-19 Thread James Coleman
On Sat, Apr 18, 2020 at 10:36 PM Justin Pryzby  wrote:
>
> On Tue, Apr 07, 2020 at 10:53:05AM -0500, Justin Pryzby wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > > And, should it use two spaces before "Sort Method", "Memory" and 
> > > > "Pre-sorted
> > ...
> > > I read through that subthread, and the ending seemed to be Peter
> > > wanting things to be unified. Was there a conclusion beyond that?
> >
> > This discussion is ongoing.  I think let's wait until that's settled before
> > addressing this more complex and even newer case.  We can add "explain, two
> > spaces and equals vs colon" to the "Open items" list if need be - I hope the
> > discussion will not delay the release.
>
> The change proposed on the WAL thread is minimal, and makes new explain(WAL)
> output consistent with the that of explain(BUFFERS).
>
> That uses a different format from "Sort", which is what incremental sort 
> should
> follow.  (Hashjoin also uses the Sort's format of two-spaces and colons rather
> than equals).

I think it's not great that buffers/sort are different, but I agree
that we should follow sort.

> So the attached 0001 makes explain output for incremental sort more consistent
> with sort:
>
>  - Two spaces;
>  - colons rather than equals;
>  - Don't use semicolon, which isn't in use anywhere else;
>
> I tested with this:
> template1=# DROP TABLE t; CREATE TABLE t(i int, j int); INSERT INTO t SELECT 
> a-(a%100), a%1000 FROM generate_series(1,9)a; CREATE INDEX ON t(i); 
> VACUUM VERBOSE ANALYZE t;
> template1=# explain analyze SELECT * FROM t a ORDER BY i,j;
> ...
>Full-sort Groups: 1000  Sort Method: quicksort  Average Memory: 28kB  Peak 
> Memory: 28kB  Pre-sorted Groups: 1000  Sort Method: quicksort  Average 
> Memory: 30kB  Peak Memory: 30kB
>
> On Tue, Apr 07, 2020 at 05:34:15PM +0200, Tomas Vondra wrote:
> > On Tue, Apr 07, 2020 at 08:40:30AM -0400, James Coleman wrote:
> > > On Tue, Apr 7, 2020 at 12:25 AM Justin Pryzby  
> > > wrote:
> > > > Should "Pre-sorted Groups:" be on a separate line ?
> > > > | Full-sort Groups: 1 Sort Method: quicksort Memory: avg=28kB peak=28kB 
> > > > Pre-sorted Groups: 1 Sort Method: quicksort Memory: avg=30kB peak=30kB
> > >
> > > I'd originally had that, but Tomas wanted it to be more compact. It's
> > > easy to adjust though if the consensus changes on that.
> >
> > I'm OK with changing the format if there's a consensus. The current
> > format seemed better to me, but I'm not particularly attached to it.
>
> I still think Pre-sorted groups should be on a separate line, as in 0002.
> In addition to looking better (to me), and being easier to read, another 
> reason
> is that there are essentially key=>values here, but the keys are repeated 
> (Sort
> Method, etc).

I collapsed this into 0001 because I think that if we're going to do
away with the key=value style then we effectively have to do this
to avoid the repeated values being confusing (with key=value it kinda
worked, because that made it seem like the avg/peak were clearly a
subset of the Sort Groups info).

I also cleaned up the newline patch a bit in the process (we already
have a way to indent with a flag so don't need to do it directly).

> I also suggested to rename: s/Presorted/Pre-sorted/, but I didn't do that 
> here.

I went ahead and did that too; we already use "Full-sort", so the
proposed change makes both parallel.

> Michael already patched most of the comment typos; the remainder I'm including
> here as a "nearby patch".

Modified slightly.

James
From becd60ba348558fa241db5cc2100a84b757cbdc5 Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Mon, 6 Apr 2020 17:37:31 -0500
Subject: [PATCH v2 2/2] comment typos: Incremental Sort

commit d2d8a229bc58a2014dce1c7a4fcdb6c5ab9fb8da
Author: Tomas Vondra 

Previously reported here:
https://www.postgresql.org/message-id/20200407042521.GH2228%40telsasoft.com
---
 src/backend/commands/explain.c |  4 ++--
 src/backend/executor/nodeIncrementalSort.c | 10 ++
 src/backend/utils/sort/tuplesort.c |  8 
 3 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/src/backend/commands/explain.c b/src/backend/commands/explain.c
index 5f91c569a0..86c10458f4 100644
--- a/src/backend/commands/explain.c
+++ b/src/backend/commands/explain.c
@@ -2865,7 +2865,7 @@ show_incremental_sort_group_info(IncrementalSortGroupInfo *groupInfo,
 }
 
 /*
- * If it's EXPLAIN ANALYZE, show tuplesort stats for a incremental sort node
+ * If it's EXPLAIN ANALYZE, show tuplesort stats for an incremental sort node
  */
 static void
 show_incremental_sort_info(IncrementalSortState *incrsortstate,
@@ -2913,7 +2913,7 @@ show_incremental_sort_info(IncrementalSortState *incrsortstate,
 			&incrsortstate->shared_info->sinfo[n];
 
 			/*
-			 * If a worker hasn't process any sort groups at all, then exclude
+			 * If a worker hasn't processed any sort groups at all, then exclude
 			 * it from output since it either 

Re: HEAPDEBUGALL is broken

2020-04-19 Thread Tom Lane
Peter Eisentraut  writes:
> The HEAPDEBUGALL define has been broken since PG12 due to tableam 
> changes.  Should we just remove this?  It doesn't look very useful. 
> It's been around since Postgres95.
> If we opt for removing: PG12 added an analogous HEAPAMSLOTDEBUGALL 
> (which still compiles correctly).  Would we want to keep that?

+1 for removing both.  There are a lot of such debug "features"
in the code, and few of them are worth anything IME.

regards, tom lane




Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Tom Lane
Peter Eisentraut  writes:
> This scares me in terms of maintainability of both the toolchain and the 
> markup.  Table formatting is already incredibly fragile, and here we 
> just keep poking it until it looks a certain way instead of thinking 
> about semantic markup.

That's a fair criticism, but ...

> A good old definition list of the kind
> synopsis
>  explanation
>  example or two
> would be much easier to maintain on all fronts.  And we could for 
> example link directly to a function, which is currently not really possible.
> If we want to draw a box around this and change the spacing, we can do 
> that with CSS.

... "we can fix it with CSS" is just as much reliance on toolchain.

In any case, I reject the idea that we should just drop the table
markup altogether and use inline variablelists.  In most of these
sections there is a very clear separation between the table contents
(with per-function or per-operator details) and the surrounding
commentary, which deals with more general concerns.  That's a useful
separation for both readers and authors, so we need to preserve it
in some form, but the standard rendering of variablelists won't.
(Our existing major use of variablelists, in the GUC chapter, works
around this basically by not having any "surrounding commentary"
... but that solution doesn't work here.)

There is also value in being able to say things like "see Table m.n
for the available operators for type foo".

If somebody's got an idea how to obtain this painfully-agreed-to
visual appearance from more robust markup, I'm all ears.  This
stuff is a bit outside my skill set, so I don't claim to have
found the best possible implementation.

regards, tom lane




HEAPDEBUGALL is broken

2020-04-19 Thread Peter Eisentraut
The HEAPDEBUGALL define has been broken since PG12 due to tableam 
changes.  Should we just remove this?  It doesn't look very useful. 
It's been around since Postgres95.


If we opt for removing: PG12 added an analogous HEAPAMSLOTDEBUGALL 
(which still compiles correctly).  Would we want to keep that?


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Ranier Vilela
On Sun, Apr 19, 2020 at 7:16 AM Juan José Santamaría Flecha <
juanjo.santama...@gmail.com> wrote:

>
> On Sat, Apr 18, 2020 at 6:07 AM Amit Kapila 
> wrote:
>
>> On Sat, Apr 18, 2020 at 12:14 AM Juan José Santamaría Flecha
>>  wrote:
>> >
>> > We can get a match for those locales in non-ISO format by enumerating
>> available locales with EnumSystemLocalesEx(), and trying to find a match.
>>
>> I have not reviewed or tested the new patch but one thing I would like
>> to see is the impact of setting LC_MESSAGES with different locale
>> information.  Basically, the error messages after setting the locale
>> with _create_locale and with the new method being discussed.  This
>> will help us in ensuring that we didn't break anything which was
>> working with prior versions of MSVC.  Can you or someone try to test
>> and share the results of the same?
>>
>
> I cannot find a single place where all supported locales are listed, but I
> have created a small test program (WindowsNLSLocales.c) based on:
> [<language>_<country>] format locales [1], additional supported language
> strings [2], and additional supported country and region strings [3]. Based
> on the results from this test program, it is possible to do a good job
> with the [<language>_<country>] types using the proposed logic, but the
> two latter cases are Windows specific, and there is no way around a
> lookup table.
>
> The attached results (WindowsNLSLocales.ods) come from Windows 10 (1903)
> and Visual C++ build 1924, 64-bit.
>
> On Sat, Apr 18, 2020 at 1:43 PM Ranier Vilela  wrote:
>
>> I have some observations about this patch, related to style, if you will
>> allow me.
>>
>
> Please find attached a revised version.
>
Looks good to me, but, sorry, I think I missed a glitch in the previous
review.

+#else							/* _WIN32_WINNT < 0x0600 */
+	_locale_t	loct;
+
+	loct = _create_locale(LC_CTYPE, winlocname);
+	if (loct != NULL)
+	{
+		rc = wchar2char(iso_lc_messages, loct->locinfo->locale_name[LC_CTYPE],
+						sizeof(iso_lc_messages), NULL);
 		_free_locale(loct);
+	}
+#endif							/* _WIN32_WINNT >= 0x0600 */

If _create_locale fails, there is no need to call _free_locale(loct).

Another point: what is the difference between pg_mbstrlen and wcslen?
Would it not be better to use only wcslen?

Attached is the patch with these comments.

regards,
Ranier Vilela


0001-PG-compilation-error-with-VS-2015-2017-2019_v08.patch
Description: Binary data


Re: WIP: WAL prefetch (another approach)

2020-04-19 Thread Dmitry Dolgov
> On Thu, Apr 09, 2020 at 09:55:25AM +1200, Thomas Munro wrote:
> Thanks.  Here's a rebase.

Thanks for working on this patch; it seems like a great feature.  I'm
probably a bit late to the party, but I still want to make a couple of
comments.

The patch indeed looks good; I couldn't find any significant issues so
far, and almost all the questions I had while reading it were already
answered in this thread.  I'm still busy with benchmarking, mostly to see
how prefetching would work with different workload distributions and how
much the kernel will actually prefetch.

In the meantime I have a few questions:

> On Wed, Feb 12, 2020 at 07:52:42PM +1300, Thomas Munro wrote:
> > On Fri, Jan 3, 2020 at 7:10 AM Tomas Vondra
> >  wrote:
> > > Could we instead specify the number of blocks to prefetch? We'd probably
> > > need to track additional details needed to determine number of blocks to
> > > prefetch (essentially LSN for all prefetch requests).
>
> Here is a new WIP version of the patch set that does that.  Changes:
>
> 1.  It now uses effective_io_concurrency to control how many
> concurrent prefetches to allow.  It's possible that we should have a
> different GUC to control "maintenance" users of concurrency I/O as
> discussed elsewhere[1], but I'm staying out of that for now; if we
> agree to do that for VACUUM etc, we can change it easily here.  Note
> that the value is percolated through the ComputeIoConcurrency()
> function which I think we should discuss, but again that's off topic,
> I just want to use the standard infrastructure here.

This totally makes sense.  I believe the question of "how much to prefetch"
ultimately depends equally on the type of workload (which correlates with
how far ahead in the WAL to read) and on how many resources are available
for prefetching (which correlates with queue depth).  But in the
documentation it looks like maintenance_io_concurrency is just an
"unimportant" option, and I'm almost sure it will be overlooked by many
readers:

The maximum distance to look ahead in the WAL during recovery, to find
blocks to prefetch.  Prefetching blocks that will soon be needed can
reduce I/O wait times.  The number of concurrent prefetches is limited
by this setting as well as
.  Setting it too high
might be counterproductive, if it means that data falls out of the
kernel cache before it is needed.  If this value is specified without
units, it is taken as bytes.  A setting of -1 disables prefetching
during recovery.

Maybe it also makes sense to emphasize that maintenance_io_concurrency
directly affects resource consumption and is a "primary control"?

> On Wed, Mar 18, 2020 at 06:18:44PM +1300, Thomas Munro wrote:
>
> Here's a new version that changes that part just a bit more, after a
> brief chat with Andres about his async I/O plans.  It seems clear that
> returning an enum isn't very extensible, so I decided to try making
> PrefetchBufferResult a struct whose contents can be extended in the
> future.  In this patch set it's still just used to distinguish 3 cases
> (hit, miss, no file), but it's now expressed as a buffer and a flag to
> indicate whether I/O was initiated.  You could imagine that the second
> thing might be replaced by a pointer to an async I/O handle you can
> wait on or some other magical thing from the future.

I like the idea of an extensible PrefetchBufferResult.  Just one comment:
if I understand correctly, the way it is used together with prefetch_queue
assumes one IO operation at a time.  This limits potential extension of
the underlying code; e.g. one can't implement some sort of buffering of
requests and submit an iovec to a syscall, or prefetch_queue will no
longer correctly represent in-flight IO.  Also, taking into account that
"we don't have any awareness of when I/O really completes", maybe in the
future it makes sense to reconsider having the queue in the prefetcher
itself, and instead ask for this information from the underlying code?

> On Wed, Apr 08, 2020 at 04:24:21AM +1200, Thomas Munro wrote:
> > Is there a way we could have a "historical" version of at least some of
> > these? An average queue depth, or such?
>
> Ok, I added simple online averages for distance and queue depth that
> take a sample every time recovery advances by 256kB.

Maybe it was discussed in the past in other threads.  But if I understand
correctly, this implementation weights all samples equally.  Since at the
moment it depends directly on replay speed (so a lot of IO is involved),
couldn't a single outlier at the beginning skew this value and make it
less useful?  Does it make sense to decay old values?
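
To illustrate the concern with made-up numbers (my own sketch, not from
the patch): a plain online mean carries an early outlier forever, while an
exponentially decayed average lets it fade:

#include <stdio.h>

int
main(void)
{
	/* one large outlier at the start, then steady samples */
	double		samples[] = {900.0, 10.0, 12.0, 11.0, 9.0, 10.0};
	int			n = sizeof(samples) / sizeof(samples[0]);
	double		mean = 0.0;
	double		ema = samples[0];
	double		alpha = 0.5;	/* decay factor, chosen arbitrarily */
	int			i;

	for (i = 0; i < n; i++)
	{
		mean += (samples[i] - mean) / (i + 1);	/* online running mean */
		if (i > 0)
			ema = alpha * samples[i] + (1.0 - alpha) * ema;	/* decays old values */
	}
	printf("running mean = %.1f\n", mean);	/* ~158.7, still skewed */
	printf("decayed avg  = %.1f\n", ema);	/* ~37.8, outlier fading */
	return 0;
}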




Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Peter Eisentraut

On 2020-04-17 02:25, Tom Lane wrote:

I eventually figured out that the approved way to do per-table-entry
customization is to attach "role" properties to the DocBook elements,
and then key off the role names in applying formatting changes in
the customization layer.  So attached is a v3 that handles the desired
formatting changes by applying a hanging indent to table <entry>
contents if the entry is marked with role="functableentry".  It may
well be possible to do this in a cleaner fashion, but this seems
good enough for discussion.


This scares me in terms of maintainability of both the toolchain and the 
markup.  Table formatting is already incredibly fragile, and here we 
just keep poking it until it looks a certain way instead of thinking 
about semantic markup.


A good old definition list of the kind

synopsis

explanation

example or two

would be much easier to maintain on all fronts.  And we could for 
example link directly to a function, which is currently not really possible.


If we want to draw a box around this and change the spacing, we can do 
that with CSS.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Peter Eisentraut

On 2020-04-16 08:26, Pierre Giraud wrote:

The screenshot attached uses a <dl> tag for the description/example block.


I like this better, but then you don't really need the table because you 
can just make the whole thing a definition list.


--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: Poll: are people okay with function/operator table redesign?

2020-04-19 Thread Peter Eisentraut

On 2020-04-13 22:33, Tom Lane wrote:

Maybe we're just trying to shoehorn too much information into a single
table.

Yeah, back at the beginning of this exercise, Alvaro wondered aloud
if we should go to something other than tables altogether.  I dunno
what that'd look like though.


Yeah, after reading all this, my conclusion is also that tables are
probably not the right solution.


A variablelist/definition list would be the next thing to try in my mind.

--
Peter Eisentraut  http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services




Re: PG compilation error with Visual Studio 2015/2017/2019

2020-04-19 Thread Juan José Santamaría Flecha
On Sat, Apr 18, 2020 at 6:07 AM Amit Kapila  wrote:

> On Sat, Apr 18, 2020 at 12:14 AM Juan José Santamaría Flecha
>  wrote:
> >
> > We can get a match for those locales in non-ISO format by enumerating
> available locales with EnumSystemLocalesEx(), and trying to find a match.
>
> I have not reviewed or tested the new patch but one thing I would like
> to see is the impact of setting LC_MESSAGES with different locale
> information.  Basically, the error messages after setting the locale
> with _create_locale and with the new method being discussed.  This
> will help us in ensuring that we didn't break anything which was
> working with prior versions of MSVC.  Can you or someone try to test
> and share the results of the same?
>

I cannot find a single place where all supported locales are listed, but I
have created a small test program (WindowsNLSLocales.c) based on:
[<language>_<country>] format locales [1], additional supported language
strings [2], and additional supported country and region strings [3]. Based
on the results from this test program, it is possible to do a good job
with the [<language>_<country>] types using the proposed logic, but the
two latter cases are Windows specific, and there is no way around a
lookup table.
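
For reference, the enumeration approach can be sketched as a tiny Windows
program (my own sketch, not the patch itself); it just prints the
installed locale names that the proposed matching logic would search
through:

#include <windows.h>
#include <stdio.h>

/* Callback invoked once per installed locale; return TRUE to continue. */
static BOOL CALLBACK
print_locale(LPWSTR name, DWORD flags, LPARAM lparam)
{
	wprintf(L"%ls\n", name);
	return TRUE;
}

int
main(void)
{
	/* Enumerate every locale the system knows about. */
	if (!EnumSystemLocalesEx(print_locale, LOCALE_ALL, 0, NULL))
		fprintf(stderr, "EnumSystemLocalesEx failed: %lu\n", GetLastError());
	return 0;
}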

The attached results (WindowsNLSLocales.ods) come from Windows 10 (1903)
and Visual C++ build 1924, 64-bit.

On Sat, Apr 18, 2020 at 1:43 PM Ranier Vilela  wrote:

> I have some observations about this patch, related to style, if you will
> allow me.
>

Please find attached a revised version.

[1]
https://docs.microsoft.com/en-us/openspecs/windows_protocols/ms-lcid/a9eac961-e77d-41a6-90a5-ce1a8b0cdb9c
[2] https://docs.microsoft.com/en-us/cpp/c-runtime-library/language-strings

[3]
https://docs.microsoft.com/en-us/cpp/c-runtime-library/country-region-strings

#include 
#include 
#include 

#if _MSC_VER >= 1800
/* From VS2012. */
typedef struct localerefcount
{
char* locale;
wchar_t* wlocale;
int* refcount;
int* wrefcount;
} locrefcount;

typedef struct __crt_locale_data
{
int refcount;
unsigned int lc_codepage;
unsigned int lc_collate_cp;
unsigned int lc_time_cp;
locrefcount lc_category[6];
int lc_clike;
int mb_cur_max;
int* lconv_intl_refcount;
int* lconv_num_refcount;
int* lconv_mon_refcount;
struct lconv* lconv;
int* ctype1_refcount;
unsigned short* ctype1;
const unsigned short* pctype;
const unsigned char* pclmap;
const unsigned char* pcumap;
struct __lc_time_data* lc_time_curr;
wchar_t* locale_name[6];
} threadlocinfo;
#endif

static const LPCWSTR AdditionalLocales[] =
{
/* Additional supported language strings */
L"american",
L"american english",
L"american-english",
L"australian",
L"belgian",
L"canadian",
L"chh",
L"chi",
L"chinese",
L"chinese-hongkong",
L"chinese-simplified",
L"chinese-singapore",
L"chinese-traditional",
L"dutch-belgian",
L"english-american",
L"english-aus",
L"english-belize",
L"english-can",
L"english-caribbean",
L"english-ire",
L"english-jamaica",
L"english-nz",
L"english-south africa",
L"english-trinidad y tobago",
L"english-uk",
L"english-us",
L"english-usa",
L"french-belgian",
L"french-canadian",
L"french-luxembourg",
L"french-swiss",
L"german-austrian",
L"german-lichtenstein",
L"german-luxembourg",
L"german-swiss",
L"irish-english",
L"italian-swiss",
L"norwegian",
L"norwegian-bokmal",
L"norwegian-nynorsk",
L"portuguese-brazilian",
L"spanish-argentina",
L"spanish-bolivia",
L"spanish-chile",
L"spanish-colombia",
L"spanish-costa rica",
L"spanish-dominican republic",
L"spanish-ecuador",
L"spanish-el salvador",
L"spanish-guatemala",
L"spanish-honduras",
L"spanish-mexican",
L"spanish-modern",
L"spanish-nicaragua",
L"spanish-panama",
L"spanish-paraguay",
L"spanish-peru",
L"spanish-puerto rico",
L"spanish-uruguay",
L"spanish-venezuela",
L"swedish-finland",
L"swiss",
L"uk",
L"us",
L"usa",
/* Additional supported country and region strings */
L"america",
L"britain",
L"china",
L"czech",
L"england",
L"great britain",
L"holland",
L"hong-kong",
L"new-zealand",
L"nz",
L"pr china",
L"pr-china",
L"puerto-rico",