Re: Add Boolean node

2021-12-27 Thread Peter Eisentraut

On 27.12.21 14:15, Ashutosh Bapat wrote:

That looks like a good change. I wonder what motivates that now? Why
wasn't it added when the usages grew? Are there more Boolean usages
planned?


Mainly, I was looking at Integer/makeInteger() and noticed that most 
uses of those weren't actually integers but booleans.  This change makes 
it clearer which is which.





Re: refactoring basebackup.c

2021-12-27 Thread Jeevan Ladhe
Hi Tushar,

You need to apply Robert's v10 version patches 0002, 0003 and 0004, before
applying the lz4 patch(v8 version).
Please let me know if you still face any issues.

Regards,
Jeevan Ladhe

On Mon, Dec 27, 2021 at 7:01 PM tushar 
wrote:

> On 11/22/21 11:05 PM, Jeevan Ladhe wrote:
> > Please find the lz4 compression patch here that basically has:
> Thanks, Could you please rebase your patch, it is failing at my end -
>
> [edb@centos7tushar pg15_lz]$ git apply /tmp/v8-0001-LZ4-compression.patch
> error: patch failed: doc/src/sgml/ref/pg_basebackup.sgml:230
> error: doc/src/sgml/ref/pg_basebackup.sgml: patch does not apply
> error: patch failed: src/backend/replication/Makefile:19
> error: src/backend/replication/Makefile: patch does not apply
> error: patch failed: src/backend/replication/basebackup.c:64
> error: src/backend/replication/basebackup.c: patch does not apply
> error: patch failed: src/include/replication/basebackup_sink.h:285
> error: src/include/replication/basebackup_sink.h: patch does not apply
>
> --
> regards,tushar
> EnterpriseDB  https://www.enterprisedb.com/
> The Enterprise PostgreSQL Company
>
>


Re: Is it worth adding ReplicationSlot active_pid to ReplicationSlotPersistentData?

2021-12-27 Thread Bharath Rupireddy
On Wed, Dec 15, 2021 at 8:32 AM Kyotaro Horiguchi
 wrote:
> > Here's the patch that adds a LOG message whenever a replication slot
> > becomes active and inactive. These logs will be extremely useful on
> > production servers to debug and analyze inactive replication slot
> > issues.
> >
> > Thoughts?
>
> If I create a replication slot, I saw the following lines in server log.
>
> [22054:client backend] LOG:  replication slot "s1" becomes active
> [22054:client backend] DETAIL:  The process with PID 22054 acquired it.
> [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
> [22054:client backend] LOG:  replication slot "s1" becomes inactive
> [22054:client backend] DETAIL:  The process with PID 22054 released it.
> [22054:client backend] STATEMENT:  select pg_drop_replication_slot('s1');
>
> They are apparently too much as if they were DEBUG3 lines.  The
> process PID shown is of the process the slot operations took place so
> I think it conveys no information.  The STATEMENT lines are also noisy
> for non-ERROR emssages.  Couldn't we hide that line?
>
> That is, how about making the log lines as simple as the follows?
>
> [17156:walsender]  LOG:  acquired replication slot "s1"
> [17156:walsender]  LOG:  released replication slot "s1"

Thanks for taking a look at the patch. Here's what I've come up with:

for drops:
2021-12-28 06:39:34.963 UTC [2541600] LOG:  acquired persistent
physical replication slot "myslot1"
2021-12-28 06:39:34.980 UTC [2541600] LOG:  dropped persistent
physical replication slot "myslot1"
2021-12-28 06:47:39.994 UTC [2544153] LOG:  acquired persistent
logical replication slot "myslot2"
2021-12-28 06:47:40.003 UTC [2544153] LOG:  dropped persistent logical
replication slot "myslot2"

for creates:
2021-12-28 06:39:46.859 UTC [2541600] LOG:  created persistent
physical replication slot "myslot1"
2021-12-28 06:39:46.859 UTC [2541600] LOG:  released persistent
physical replication slot "myslot1"
2021-12-28 06:45:20.037 UTC [2544153] LOG:  created persistent logical
replication slot "myslot2"
2021-12-28 06:45:20.058 UTC [2544153] LOG:  released persistent
logical replication slot "myslot2"

for pg_basebackup:
2021-12-28 06:41:04.601 UTC [2542686] LOG:  created temporary physical
replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.602 UTC [2542686] LOG:  released temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.602 UTC [2542686] LOG:  acquired temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.867 UTC [2542686] LOG:  released temporary
physical replication slot "pg_basebackup_2542686"
2021-12-28 06:41:04.954 UTC [2542686] LOG:  dropped temporary physical
replication slot "pg_basebackup_2542686"

The amount of logs may seem noisy, but they do help a lot given the
fact that the server generates much more noise, for instance, the
server logs the syntax error statements. And, the replication slots
don't get created/dropped every now and then (at max, the
pg_basebackup if at all used and configured to take the backups, say,
every 24hrs or so). With the above set of logs debugging for inactive
replication slots becomes easier. One can find the root cause and
answer questions like "why there was a huge WAL file growth at some
point or when did a replication slot become inactive or how much time
a replication slot was inactive or when did a standby disconnected and
connected again or when did a pg_receivewal or pg_recvlogical
connected and disconnected so on.".

Here's the v2 patch. Please provide your thoughts.

Regards,
Bharath Rupireddy.


v2-0001-Add-LOG-messages-when-replication-slots-become-ac.patch
Description: Binary data


Re: sequences vs. synchronous replication

2021-12-27 Thread Fujii Masao




On 2021/12/24 19:40, Tomas Vondra wrote:

Maybe, but what would such workload look like? Based on the tests I did, such 
workload probably can't generate any WAL. The amount of WAL added by the change 
is tiny, the regression is caused by having to flush WAL.

The only plausible workload I can think of is just calling nextval, and the 
cache pretty much fixes that.


Some users don't want to increase cache setting, do they? Because

- They may expect that setval() affects all subsequent nextval(). But if cache 
is set to greater than one, the value set by setval() doesn't affect other 
backends until they consumed all the cached sequence values.
- They may expect that the value returned from nextval() is basically increased 
monotonically. If cache is set to greater than one, subsequent nextval() can 
easily return smaller value than one returned by previous nextval().
- They may want to avoid "hole" of a sequence as much as possible, e.g., as far as the 
server is running normally. If cache is set to greater than one, such "hole" can happen 
even thought the server doesn't crash yet.



FWIW I plan to explore the idea of looking at sequence page LSN, and flushing 
up to that position.


Sounds great, thanks!

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION




Re: Allow escape in application_name

2021-12-27 Thread Fujii Masao



On 2021/12/28 9:32, Masahiko Sawada wrote:

Doesn't this query return 64? So the expression "substring(str for
(SELECT max_identifier_length FROM pg_control_init()))" returns the
first 64 characters of the given string while the application_name is
truncated to be 63 (NAMEDATALEN - 1) characters. It also seems to be
fine to use current_setting('max_identifier_length') instead of
max_identifier_length of pg_control_init().


Interesting! I agree that current_setting('max_identifier_length') should be 
used instead. Attached is the updated version of the patch. It uses 
current_setting('max_identifier_length').

BTW it seems confusing that pg_control_init() (also pg_controldata) and GUC 
report different values as max_identifier_length. Probably they should return 
the same value such as NAMEDATALEN - 1. But this change might be overkill...

Regards,

--
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATIONdiff --git a/contrib/postgres_fdw/expected/postgres_fdw.out 
b/contrib/postgres_fdw/expected/postgres_fdw.out
index 7720ab9c58..7d6f7d9e3d 100644
--- a/contrib/postgres_fdw/expected/postgres_fdw.out
+++ b/contrib/postgres_fdw/expected/postgres_fdw.out
@@ -10825,3 +10825,54 @@ ERROR:  invalid value for integer option "batch_size": 
100$%$#$#
 ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
 ERROR:  invalid option "nonexistent"
 HINT:  There are no valid options in this context.
+-- ===
+-- test postgres_fdw.application_name GUC
+-- ===
+--- Turn debug_discard_caches off for this test to make sure that
+--- the remote connection is alive when checking its application_name.
+SET debug_discard_caches = 0;
+-- Specify escape sequences in application_name option of a server
+-- object so as to test that they are replaced with status information
+-- expectedly.
+--
+-- Since pg_stat_activity.application_name may be truncated to less than
+-- NAMEDATALEN characters, note that substring() needs to be used
+-- at the condition of test query to make sure that the string consisting
+-- of database name and process ID is also less than that.
+ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
+SELECT 1 FROM ft6 LIMIT 1;
+ ?column? 
+--
+1
+(1 row)
+
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_database() || pg_backend_pid() for
+  current_setting('max_identifier_length')::int);
+ pg_terminate_backend 
+--
+ t
+(1 row)
+
+-- postgres_fdw.application_name overrides application_name option
+-- of a server object if both settings are present.
+SET postgres_fdw.application_name TO 'fdw_%a%u%%';
+SELECT 1 FROM ft6 LIMIT 1;
+ ?column? 
+--
+1
+(1 row)
+
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_setting('application_name') ||
+  CURRENT_USER || '%' for current_setting('max_identifier_length')::int);
+ pg_terminate_backend 
+--
+ t
+(1 row)
+
+--Clean up
+RESET postgres_fdw.application_name;
+RESET debug_discard_caches;
diff --git a/contrib/postgres_fdw/sql/postgres_fdw.sql 
b/contrib/postgres_fdw/sql/postgres_fdw.sql
index beeac8af1e..9eb673e369 100644
--- a/contrib/postgres_fdw/sql/postgres_fdw.sql
+++ b/contrib/postgres_fdw/sql/postgres_fdw.sql
@@ -3452,3 +3452,38 @@ CREATE FOREIGN TABLE inv_bsz (c1 int )
 
 -- No option is allowed to be specified at foreign data wrapper level
 ALTER FOREIGN DATA WRAPPER postgres_fdw OPTIONS (nonexistent 'fdw');
+
+-- ===
+-- test postgres_fdw.application_name GUC
+-- ===
+--- Turn debug_discard_caches off for this test to make sure that
+--- the remote connection is alive when checking its application_name.
+SET debug_discard_caches = 0;
+
+-- Specify escape sequences in application_name option of a server
+-- object so as to test that they are replaced with status information
+-- expectedly.
+--
+-- Since pg_stat_activity.application_name may be truncated to less than
+-- NAMEDATALEN characters, note that substring() needs to be used
+-- at the condition of test query to make sure that the string consisting
+-- of database name and process ID is also less than that.
+ALTER SERVER loopback2 OPTIONS (application_name 'fdw_%d%p');
+SELECT 1 FROM ft6 LIMIT 1;
+SELECT pg_terminate_backend(pid, 18) FROM pg_stat_activity
+  WHERE application_name =
+substring('fdw_' || current_database() || pg_backend_pid() for
+  current_setting('max_identifier_length')::int);
+
+-- postgres_fdw.application_name overrides application_name option
+-- of a server object if both settings are present.
+SET 

Re: Add Boolean node

2021-12-27 Thread Zhihong Yu
Hi,
For buildDefItem():

+   if (strcmp(val, "true") == 0)
+   return makeDefElem(pstrdup(name),
+  (Node *) makeBoolean(true),
+  -1);
+   if (strcmp(val, "false") == 0)

Should 'TRUE' / 'FALSE' be considered above ?

-   issuper = intVal(dissuper->arg) != 0;
+   issuper = boolVal(dissuper->arg) != 0;

Can the above be written as (since issuper is a bool):

+   issuper = boolVal(dissuper->arg);

Cheers


Re: Add checkpoint and redo LSN to LogCheckpointEnd log message

2021-12-27 Thread Bharath Rupireddy
On Fri, Dec 24, 2021 at 5:54 PM Bharath Rupireddy
 wrote:
>
> On Fri, Dec 24, 2021 at 5:42 PM Michael Paquier  wrote:
> >
> > On Fri, Dec 24, 2021 at 02:51:34PM +0900, Kyotaro Horiguchi wrote:
> > > I thougt about something like the following, but your proposal may be
> > > clearer.
> >
> > +"LSN=%X/%X, REDO LSN=%X/%X",
> > This could be rather confusing for the average user, even if I agree
> > that this is some information that only an advanced user could
> > understand.  Could it be possible to define those fields in a more
> > deterministic way?  For one, it is hard to understand the relationship
> > between both fields without looking at the code, particulary if both
> > share the same value.  It is at least rather..  Well, mostly, easy to
> > guess what each other field means in this context, which is not the
> > case of what you are proposing here.  One idea could be use also
> > "start point" for REDO, for example.
>
> How about "location=%X/%X, REDO start location=%X/%X"? The entire log
> message can look like below:
>
> 2021-12-24 12:20:19.140 UTC [1977834] LOG:  checkpoint
> complete:location=%X/%X, REDO start location=%X/%X; wrote 7 buffers
> (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s,
> sync=0.007 s, total=0.192 s; sync files=5, longest=0.005 s,
> average=0.002 s; distance=293 kB, estimate=56584 kB
>
> Another variant:
> 2021-12-24 12:20:19.140 UTC [1977834] LOG:  checkpoint completed at
> location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%);
> 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007
> s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s;
> distance=293 kB, estimate=56584 kB
> 2021-12-24 12:20:19.140 UTC [1977834] LOG:  restartpoint completed at
> location=%X/%X with REDO start location=%X/%X: wrote 7 buffers (0.0%);
> 0 WAL file(s) added, 0 removed, 0 recycled; write=0.005 s, sync=0.007
> s, total=0.192 s; sync files=5, longest=0.005 s, average=0.002 s;
> distance=293 kB, estimate=56584 kB

Here are the 2 patches.

one(v2-1-XXX.patch) adding the info as:
2021-12-28 02:44:34.870 UTC [2384386] LOG:  checkpoint complete:
location=0/1B03040, REDO start location=0/1B03008; wrote 466 buffers
(2.8%); 0 WAL file(s) added, 0 removed, 0 recycled; write=0.014 s,
sync=0.038 s, total=0.072 s; sync files=21, longest=0.022 s,
average=0.002 s; distance=6346 kB, estimate=6346 kB

another(v2-2-XXX.patch) adding the info as:
2021-12-28 02:52:24.464 UTC [2394396] LOG:  checkpoint completed at
location=0/212FFC8 with REDO start location=0/212FF90: wrote 451
buffers (2.8%); 0 WAL file(s) added, 0 removed, 1 recycled;
write=0.012 s, sync=0.032 s, total=0.071 s; sync files=6,
longest=0.022 s, average=0.006 s; distance=6272 kB, estimate=6272 kB

attaching v1-0001-XXX from the initial mail again just for the sake of
completion:
2021-12-23 14:58:54.714 UTC [1965649] LOG:  checkpoint complete: wrote
0 buffers (0.0%); 0 WAL file(s) added, 0 removed, 0 recycled;
write=0.001 s, sync=0.001 s, total=0.025 s; sync files=0,
longest=0.000 s, average=0.000 s; distance=0 kB, estimate=0 kB;
LSN=0/14D0AD0, REDO LSN=0/14D0AD0

Thoughts?

Regards,
Bharath Rupireddy.


v2-1-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch
Description: Binary data


v2-2-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch
Description: Binary data


v1-0001-Add-checkpoint-and-redo-LSN-to-LogCheckpointEnd-l.patch
Description: Binary data


RE: Optionally automatically disable logical replication subscriptions on error

2021-12-27 Thread wangw.f...@fujitsu.com
On Thursday, December 16, 2021 8:51 PM osumi.takami...@fujitsu.com 
 wrote:
> Attached the updated patch v14.

A comment to the timing of printing a log:
After the log[1] was printed, I altered subscription's option
(DISABLE_ON_ERROR) from true to false before invoking DisableSubscriptionOnError
to disable subscription. Subscription was not disabled.
[1] "LOG:  logical replication subscription "sub1" will be disabled due to an 
error"

I found this log is printed in function WorkerErrorRecovery:
+   ereport(LOG,
+   errmsg("logical replication subscription \"%s\" will be 
disabled due to an error",
+  MySubscription->name));
This log is printed here, but in DisableSubscriptionOnError, there is a check to
confirm subscription's disableonerr field. If disableonerr is found changed from
true to false in DisableSubscriptionOnError, subscription will not be disabled.

In this case, "disable subscription" is printed, but subscription will not be
disabled actually.
I think it is a little confused to user, so what about moving this message after
the check which is mentioned above in DisableSubscriptionOnError?


Regards,
Wang wei


Can there ever be out of sequence WAL files?

2021-12-27 Thread Bharath Rupireddy
Hi,

Can the postgres server ever have/generate out of sequence WAL files?
For instance, 0001020C00A2, 0001020C00A3,
0001020C00A5 and so on, missing 0001020C00A4.
Manual/Accidental deletion of the WAL files can happes, but are there
any other extreme situations (like recycling, removing old WAL files
etc.) caused by the postgres server leading to missing WAL files?

What happens when postgres server finds missing WAL file during
crash/standby recovery?

Thoughts?

Regards,
Bharath Rupireddy.




Re: sequences vs. synchronous replication

2021-12-27 Thread Tomas Vondra

On 12/27/21 21:24, Peter Eisentraut wrote:

On 24.12.21 09:04, Kyotaro Horiguchi wrote:

Still, as Fujii-san concerns, I'm afraid that some people may suffer
the degradation the patch causes.  I wonder it is acceptable to get
back the previous behavior by exposing SEQ_LOG_VALS itself or a
boolean to do that, as a 'not-recommended-to-use' variable.


There is also the possibility of unlogged sequences if you want to avoid 
the WAL logging and get higher performance.


But unlogged sequences are not supported:

  test=# create unlogged sequence s;
  ERROR:  unlogged sequences are not supported

And even if we did, what would be the behavior after crash? For tables 
we discard the contents, so for sequences we'd probably discard it too 
and start from scratch? That doesn't seem particularly useful.


We could also write / fsync the sequence buffer, but that has other 
downsides. But that's not implemented either, and it's certainly out of 
scope for this patch.



regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company




Re: Throttling WAL inserts when the standby falls behind more than the configured replica_lag_in_bytes

2021-12-27 Thread SATYANARAYANA NARLAPURAM
On Sat, Dec 25, 2021 at 9:25 PM Dilip Kumar  wrote:

> On Sun, Dec 26, 2021 at 10:36 AM SATYANARAYANA NARLAPURAM <
> satyanarlapu...@gmail.com> wrote:
>
>>
>>> Actually all the WAL insertions are done under a critical section
>>> (except few exceptions), that means if you see all the references of
>>> XLogInsert(), it is always called under the critical section and that is my
>>> main worry about hooking at XLogInsert level.
>>>
>>
>> Got it, understood the concern. But can we document the limitations of
>> the hook and let the hook take care of it? I don't expect an error to be
>> thrown here since we are not planning to allocate memory or make file
>> system calls but instead look at the shared memory state and add delays
>> when required.
>>
>>
> Yet another problem is that if we are in XlogInsert() that means we are
> holding the buffer locks on all the pages we have modified, so if we add a
> hook at that level which can make it wait then we would also block any of
> the read operations needed to read from those buffers.  I haven't thought
> what could be better way to do this but this is certainly not good.
>

Yes, this is a problem. The other approach is adding a hook at
XLogWrite/XLogFlush? All the other backends will be waiting behind the
WALWriteLock. The process that is performing the write enters into a busy
loop with small delays until the criteria are met. Inability to process the
interrupts inside the critical section is a challenge in both approaches.
Any other thoughts?


>
>
> --
> Regards,
> Dilip Kumar
> EnterpriseDB: http://www.enterprisedb.com
>


Re: Allow escape in application_name

2021-12-27 Thread Masahiko Sawada
On Tue, Dec 28, 2021 at 8:57 AM kuroda.hay...@fujitsu.com
 wrote:
>
> Dear Sawada-san,
>
> > If so, we need to do substring(... for
> > 63) instead.

Just to be clear, I meant substring(... for NAMEDATALEN - 1).

>
> Yeah, the parameter will be truncated as one less than NAMEDATALEN:
>
> ```
> max_identifier_length (integer)
> Reports the maximum identifier length. It is determined as one less than the 
> value of NAMEDATALEN when building the server.
> The default value of NAMEDATALEN is 64; therefore the default 
> max_identifier_length is 63 bytes,
> which can be less than 63 characters when using multibyte encodings.
> ```

I think this is the description of the max_identifier_length GUC parameter.

>
> But in Fujii-san's patch length is picked up by the following SQL, so I think 
> it works well.
>
> ```
> SELECT max_identifier_length FROM pg_control_init()
> ```

Doesn't this query return 64? So the expression "substring(str for
(SELECT max_identifier_length FROM pg_control_init()))" returns the
first 64 characters of the given string while the application_name is
truncated to be 63 (NAMEDATALEN - 1) characters. It also seems to be
fine to use current_setting('max_identifier_length') instead of
max_identifier_length of pg_control_init().

Regards,

-- 
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add index scan progress to pg_stat_progress_vacuum

2021-12-27 Thread Imseih (AWS), Sami
I do agree that tracking progress by # of blocks scanned is not deterministic 
for all index types.

Based on this feedback, I went back to the drawing board on this. 

Something like below may make more sense.

In pg_stat_progress_vacuum, introduce 2 new columns:

1. total_index_vacuum   - total # of indexes to vacuum
2. max_cycle_time - the time in seconds of the longest index cycle. 

Introduce another view called pg_stat_progress_vacuum_index_cycle:

postgres=# \d pg_stat_progress_vacuum_index_cycle
   View "public.pg_stat_progress_vacuum_worker"
 Column |  Type   | Collation | Nullable | Default
+-+---+--+-
pid| integer |   |  |   
<<<-- the PID of the vacuum worker ( or leader if it's doing index vacuuming )
leader_pid | bigint  |   |  |   
<<<-- the leader PID to allow this view to be joined back to 
pg_stat_progress_vacuum
indrelid   | bigint  |   |  |   
<<<- the index relid of the index being vacuumed
ordinal_position | bigint |   |  |  
<<<- the processing position, which will give an idea of the processing 
position of the index being vacuumed. 
dead_tuples_removed | bigint | |<<<- the number 
of dead rows removed in the current cycle for the index.

Having this information, one can

1. Determine which index is being vacuumed. For monitoring tools, this can help 
identify the index that accounts for most of the index vacuuming time.
2. Having the processing order of the current index will allow the user to 
determine how many of the total indexes has been completed in the current cycle.
3. dead_tuples_removed will show progress on the index vacuum in the current 
cycle.
4. the max_cycle_time will give an idea on how long the longest index cycle 
took for the current vacuum operation.


On 12/23/21, 2:46 AM, "Masahiko Sawada"  wrote:

CAUTION: This email originated from outside of the organization. Do not 
click links or open attachments unless you can confirm the sender and know the 
content is safe.



On Tue, Dec 21, 2021 at 3:37 AM Peter Geoghegan  wrote:
>
> On Wed, Dec 15, 2021 at 2:10 PM Bossart, Nathan  
wrote:
> > nitpick: Shouldn't index_blks_scanned be index_blks_vacuumed?  IMO it
> > is more analogous to heap_blks_vacuumed.
>
> +1.
>
> > This will tell us which indexes are currently being vacuumed and the
> > current progress of those operations, but it doesn't tell us which
> > indexes have already been vacuumed or which ones are pending vacuum.
>
> VACUUM will process a table's indexes in pg_class OID order (outside
> of parallel VACUUM, I suppose). See comments about sort order above
> RelationGetIndexList().

Right.

>
> Anyway, it might be useful to add ordinal numbers to each index, that
> line up with this processing/OID order. It would also be reasonable to
> display the same number in log_autovacuum* (and VACUUM VERBOSE)
> per-index output, to reinforce the idea. Note that we don't
> necessarily display a distinct line for each distinct index in this
> log output, which is why including the ordinal number there makes
> sense.

An alternative idea would be to show the number of indexes on the
table and the number of indexes that have been processed in the
leader's entry of pg_stat_progress_vacuum. Even in parallel vacuum
cases, since we have index vacuum status for each index it would not
be hard for the leader process to count how many indexes have been
processed.

Regarding the details of the progress of index vacuum, I'm not sure
this progress information can fit for pg_stat_progress_vacuum. As
Peter already mentioned, the behavior quite varies depending on index
AM.

Regards,


--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/



RE: Allow escape in application_name

2021-12-27 Thread kuroda.hay...@fujitsu.com
Dear Sawada-san,

> Good idea. But the application_name is actually truncated to 63
> characters (NAMEDATALEN - 1)? If so, we need to do substring(... for
> 63) instead.

Yeah, the parameter will be truncated as one less than NAMEDATALEN:

```
max_identifier_length (integer)
Reports the maximum identifier length. It is determined as one less than the 
value of NAMEDATALEN when building the server.
The default value of NAMEDATALEN is 64; therefore the default 
max_identifier_length is 63 bytes,
which can be less than 63 characters when using multibyte encodings.
```

But in Fujii-san's patch length is picked up by the following SQL, so I think 
it works well.

```
SELECT max_identifier_length FROM pg_control_init()
```

Best Regards,
Hayato Kuroda
FUJITSU LIMITED



Re: Emit a warning if the extension's GUC is set incorrectly

2021-12-27 Thread Tom Lane
I wrote:
> Concretely, I think we should do the attached, which removes much of
> 75d22069e and does the check at the point of placeholder creation.

I pushed that, and along the way moved the test case to be beside
the existing tests concerning custom GUC names, rather than appended
at the end of guc.sql as it had been.  That turns out to make the
buildfarm very unhappy.  I had not thought to test that change with
"force_parallel_mode = regress"; but with that on, we can see that
the warning (or now error) is reported each time a parallel worker
is launched, if the parent process contains a bogus placeholder.
So that accidentally unveiled another deficiency in the design of
the original patch --- we surely don't want that to happen.

As a stopgap to turn the farm green again, I am going to revert
75d22069e as well as my followup patches.  If we don't want to
give up on that idea altogether, we have to find some way to
suppress the chatter from parallel workers.  I wonder whether
it would be appropriate to go further than we have, and actively
delete placeholders that turn out to be within an extension's
reserved namespace.  The core issue here is that workers don't
necessarily set GUCs and load extensions in the same order that
their parent did, so if we leave any invalid placeholders behind
after reserving an extension's prefix, we're risking issues
during worker start.

regards, tom lane




Re: sequences vs. synchronous replication

2021-12-27 Thread Peter Eisentraut

On 24.12.21 09:04, Kyotaro Horiguchi wrote:

Still, as Fujii-san concerns, I'm afraid that some people may suffer
the degradation the patch causes.  I wonder it is acceptable to get
back the previous behavior by exposing SEQ_LOG_VALS itself or a
boolean to do that, as a 'not-recommended-to-use' variable.


There is also the possibility of unlogged sequences if you want to avoid 
the WAL logging and get higher performance.





Re: Column Filtering in Logical Replication

2021-12-27 Thread Alvaro Herrera
On 2021-Dec-27, Tom Lane wrote:

> Alvaro Herrera  writes:
> > Determining that an array has a NULL element seems convoluted.  I ended
> > up with this query, where comparing the result of array_positions() with
> > an empty array does that.  If anybody knows of a simpler way, or any
> > situations in which this fails, I'm all ears.
> 
> Maybe better to rethink why we allow elements of prattrs to be null?

What I'm doing is an unnest of all arrays and then aggregating them
back into a single array.  If one array is null, the resulting aggregate
contains a null element.

Hmm, maybe I can in parallel do a bool_or() aggregate of "array is null" to
avoid that.  ... ah yes, that works:

with published_cols as (
select pg_catalog.bool_or(pr.prattrs is null) as all_columns,
pg_catalog.array_agg(distinct unnest order by unnest) AS attrs
from pg_catalog.pg_publication p join
pg_catalog.pg_publication_rel pr on (p.oid = pr.prpubid) left 
join
unnest(prattrs) on (true)
where prrelid = :table and p.pubname in ('pub1', 'pub2')
)
SELECT a.attname,
   a.atttypid,
   a.attnum = ANY(i.indkey)
  FROM pg_catalog.pg_attribute a
  LEFT JOIN pg_catalog.pg_index i
   ON (i.indexrelid = pg_get_replica_identity_index(:table)),
 published_cols
 WHERE a.attnum > 0::pg_catalog.int2
   AND NOT a.attisdropped and a.attgenerated = ''
   AND a.attrelid = :table
   AND (all_columns OR attnum = ANY(published_cols.attrs))
 ORDER BY a.attnum ;

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/




Re: Foreign key joins revisited

2021-12-27 Thread Tom Lane
Isaac Morland  writes:
> On Mon, 27 Dec 2021 at 03:22, Joel Jacobson  wrote:
>> However, I see one problem with leaving out the key columns:
>> First, there is only one FK in permission pointing to role, and we write a
>> query leaving out the key columns.
>> Then, another different FK in permission pointing to role is later added,
>> and our old query is suddenly in trouble.

> I thought the proposal was to give the FK constraint name. However, if the
> idea now is to allow leaving that out also if there is only one FK, then
> that's also OK as long as people understand it can break in the same way
> NATURAL JOIN can break when columns are added later.

NATURAL JOIN is widely regarded as a foot-gun that the SQL committee
should never have invented.  Why would we want to create another one?

(I suspect that making the constraint name optional would be problematic
for reasons of syntax ambiguity, anyway.)

regards, tom lane




Re: Add index scan progress to pg_stat_progress_vacuum

2021-12-27 Thread Justin Pryzby
Please send your patches as *.diff or *.patch, so they're processed by the
patch tester.  Preferably with commit messages; git format-patch is the usual
tool for this.
http://cfbot.cputube.org/sami-imseih.html

(Occasionally, it's also useful to send a *.txt to avoid the cfbot processing
the wrong thing, in case one sends an unrelated, secondary patch, or sends
fixes to a patch as a "relative patch" which doesn't include the main patch.)

I'm including a patch rebased on 8e1fae193.
>From 1d034ff6317919e52c70ff8a4f3af9ac1c101368 Mon Sep 17 00:00:00 2001
From: "Imseih (AWS), Sami" 
Date: Mon, 20 Dec 2021 17:55:03 +
Subject: [PATCH] Add index scan progress to pg_stat_progress_vacuum

Here is a V2 attempt of the patch to include a new view called pg_stat_progress_vacuum_worker. Also, scans for index cleanups will also have an entry in the new view.

Re: Add index scan progress to pg_stat_progress_vacuum
---
 src/backend/access/brin/brin.c| 27 +++--
 src/backend/access/gin/ginvacuum.c| 55 +++
 src/backend/access/gist/gistvacuum.c  | 24 
 src/backend/access/hash/hash.c| 46 +-
 src/backend/access/hash/hashpage.c|  4 +-
 src/backend/access/heap/vacuumlazy.c  | 36 +-
 src/backend/access/nbtree/nbtree.c| 14 ++-
 src/backend/access/spgist/spgvacuum.c | 25 
 src/backend/catalog/system_views.sql  | 11 ++
 src/backend/commands/vacuumparallel.c | 11 ++
 src/include/access/hash.h |  3 +-
 src/include/commands/progress.h   |  2 +
 12 files changed, 247 insertions(+), 11 deletions(-)

diff --git a/src/backend/access/brin/brin.c b/src/backend/access/brin/brin.c
index f521bb96356..97b2f8bc13a 100644
--- a/src/backend/access/brin/brin.c
+++ b/src/backend/access/brin/brin.c
@@ -39,6 +39,8 @@
 #include "utils/index_selfuncs.h"
 #include "utils/memutils.h"
 #include "utils/rel.h"
+#include "commands/progress.h"
+#include "pgstat.h"
 
 
 /*
@@ -77,7 +79,7 @@ static void brinsummarize(Relation index, Relation heapRel, BlockNumber pageRang
 static void form_and_insert_tuple(BrinBuildState *state);
 static void union_tuples(BrinDesc *bdesc, BrinMemTuple *a,
 		 BrinTuple *b);
-static void brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy);
+static void brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy, bool report_progress);
 static bool add_values_to_range(Relation idxRel, BrinDesc *bdesc,
 BrinMemTuple *dtup, Datum *values, bool *nulls);
 static bool check_null_keys(BrinValues *bval, ScanKey *nullkeys, int nnullkeys);
@@ -953,7 +955,7 @@ brinvacuumcleanup(IndexVacuumInfo *info, IndexBulkDeleteResult *stats)
 	heapRel = table_open(IndexGetRelation(RelationGetRelid(info->index), false),
 		 AccessShareLock);
 
-	brin_vacuum_scan(info->index, info->strategy);
+	brin_vacuum_scan(info->index, info->strategy, info->report_progress);
 
 	brinsummarize(info->index, heapRel, BRIN_ALL_BLOCKRANGES, false,
   >num_index_tuples, >num_index_tuples);
@@ -1635,16 +1637,24 @@ union_tuples(BrinDesc *bdesc, BrinMemTuple *a, BrinTuple *b)
  * and such.
  */
 static void
-brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
+brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy, bool report_progress)
 {
 	BlockNumber nblocks;
 	BlockNumber blkno;
+	const intinitprog_index[] = {
+			PROGRESS_SCAN_BLOCKS_DONE,
+			PROGRESS_SCAN_BLOCKS_TOTAL
+	};
+	int64initprog_val[2];
 
 	/*
 	 * Scan the index in physical order, and clean up any possible mess in
 	 * each page.
 	 */
 	nblocks = RelationGetNumberOfBlocks(idxrel);
+	if (report_progress)
+		pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_TOTAL,
+	 nblocks);
 	for (blkno = 0; blkno < nblocks; blkno++)
 	{
 		Buffer		buf;
@@ -1656,9 +1666,20 @@ brin_vacuum_scan(Relation idxrel, BufferAccessStrategy strategy)
 
 		brin_page_cleanup(idxrel, buf);
 
+		if (report_progress)
+			pgstat_progress_update_param(PROGRESS_SCAN_BLOCKS_DONE,
+		 blkno + 1);
+
 		ReleaseBuffer(buf);
 	}
 
+	if (report_progress)
+	{
+		initprog_val[0] = 0;
+		initprog_val[1] = 0;
+		pgstat_progress_update_multi_param(2, initprog_index, initprog_val);
+	}
+
 	/*
 	 * Update all upper pages in the index's FSM, as well.  This ensures not
 	 * only that we propagate leaf-page FSM updates made by brin_page_cleanup,
diff --git a/src/backend/access/gin/ginvacuum.c b/src/backend/access/gin/ginvacuum.c
index a276eb020b5..714586040aa 100644
--- a/src/backend/access/gin/ginvacuum.c
+++ b/src/backend/access/gin/ginvacuum.c
@@ -24,6 +24,8 @@
 #include "storage/lmgr.h"
 #include "storage/predicate.h"
 #include "utils/memutils.h"
+#include "commands/progress.h"
+#include "pgstat.h"
 
 struct GinVacuumState
 {
@@ -571,6 +573,14 @@ ginbulkdelete(IndexVacuumInfo *info, IndexBulkDeleteResult *stats,
 	Buffer		buffer;
 	BlockNumber rootOfPostingTree[BLCKSZ / (sizeof(IndexTupleData) + sizeof(ItemId))];
 	

Re: Column Filtering in Logical Replication

2021-12-27 Thread Tom Lane
Alvaro Herrera  writes:
> Determining that an array has a NULL element seems convoluted.  I ended
> up with this query, where comparing the result of array_positions() with
> an empty array does that.  If anybody knows of a simpler way, or any
> situations in which this fails, I'm all ears.

Maybe better to rethink why we allow elements of prattrs to be null?

regards, tom lane




Re: Inconsistent ellipsis in regression test error message?

2021-12-27 Thread Tom Lane
Peter Smith  writes:
> The most recent cfbot run for a patch I am interested in has failed a
> newly added regression test.
> Please see http://cfbot.cputube.org/ for 36/2906
> The failure logs [2] are very curious because the error message is
> what was expected but it has a different position of the ellipsis

That "expected" output is clearly completely insane; it's pointing
the cursor in the middle of the "TABLE" keyword, not at the offending
constant.  I can reproduce that when the database encoding is UTF8,
but if it's SQL_ASCII or a single-byte encoding then I get a saner result:

regression=# ALTER PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (1234);
ERROR:  argument of PUBLICATION WHERE must be type boolean, not type integer
LINE 1: ...PUBLICATION testpub5 SET TABLE testpub_rf_tbl3 WHERE (1234);
 ^

This is not a client-side problem: the error position being reported
by the server is different, as you can easily see in the server's log:

2021-12-27 12:05:15.395 EST [1510837] ERROR:  argument of PUBLICATION WHERE 
must be type boolean, not type integer at character 33
2021-12-27 12:05:15.395 EST [1510837] STATEMENT:  ALTER PUBLICATION testpub5 
SET TABLE testpub_rf_tbl3 WHERE (1234);

(it says "at character 61" in the sane case).

I traced this as far as finding that the pstate being passed to
coerce_to_boolean has a totally wrong p_sourcetext:

(gdb) p *pstate
$3 = {parentParseState = 0x0, 
  p_sourcetext = 0x1fba9e8 "{A_CONST :val 1234 :location 60}", 
  p_rtable = 0x2063ce0, p_joinexprs = 0x0, p_joinlist = 0x0, 
  p_namespace = 0x2063dc8, p_lateral_active = false, p_ctenamespace = 0x0, 
  p_future_ctes = 0x0, p_parent_cte = 0x0, p_target_relation = 0x0, 
  p_target_nsitem = 0x0, p_is_insert = false, p_windowdefs = 0x0, 
  p_expr_kind = EXPR_KIND_NONE, p_next_resno = 1, p_multiassign_exprs = 0x0, 
  p_locking_clause = 0x0, p_locked_from_parent = false, 
  p_resolve_unknowns = true, p_queryEnv = 0x0, p_hasAggs = false, 
  p_hasWindowFuncs = false, p_hasTargetSRFs = false, p_hasSubLinks = false, 
  p_hasModifyingCTE = false, p_last_srf = 0x0, p_pre_columnref_hook = 0x0, 
  p_post_columnref_hook = 0x0, p_paramref_hook = 0x0, 
  p_coerce_param_hook = 0x0, p_ref_hook_state = 0x0}

In short, GetTransformedWhereClause is inserting completely faulty data in
p_sourcetext.  This code needs to be revised to pass down the original
command string, or maybe better pass down the whole ParseState that was
available to AlterPublication, instead of inventing a bogus one.

The reason why the behavior depends on DB encoding is left as an
exercise for the student.

regards, tom lane




Re: default to to ON_ERROR_STOP=on (Re: psql: exit status with multiple -c and -f)

2021-12-27 Thread Tom Lane
Justin Pryzby  writes:
> I think the current behavior of the regression test SQL scripts is exactly the
> opposite of what's desirable for almost all other scripts.  The attached makes
> ON_ERROR_STOP the default, and runs the regression tests with ON_ERROR_STOP=0.

> Is it viable to consider changing this ?

I don't think so.  The number of scripts you will break is far greater
than the number whose behavior will be improved, because people who
wanted this behavior will already be selecting it.  Maybe this wasn't
the greatest choice of default, but it's about twenty years too late
to change it.

I'd also note that I see a fairly direct parallel to "set -e" in
shell scripts, which is likewise not the default.

We could consider documentation changes to make this issue
more visible, perhaps.  Not sure what would be a good place.

regards, tom lane




Re: Foreign key joins revisited

2021-12-27 Thread Corey Huinker
>
>
> First, there is only one FK in permission pointing to role, and we write a
> query leaving out the key columns.
> Then, another different FK in permission pointing to role is later added,
> and our old query is suddenly in trouble.
>
>
We already have that problem with cases where two tables have a common x
column:

SELECT x FROM a, b


so this would be on-brand for the standards body. And worst case scenario
you're just back to the situation you have now.


Re: Column Filtering in Logical Replication

2021-12-27 Thread Alvaro Herrera
Determining that an array has a NULL element seems convoluted.  I ended
up with this query, where comparing the result of array_positions() with
an empty array does that.  If anybody knows of a simpler way, or any
situations in which this fails, I'm all ears.

with published_cols as (
select case when
pg_catalog.array_positions(pg_catalog.array_agg(unnest), null) 
<> '{}' then null else
pg_catalog.array_agg(distinct unnest order by unnest) end AS 
attrs
from pg_catalog.pg_publication p join
pg_catalog.pg_publication_rel pr on (p.oid = pr.prpubid) left 
join
unnest(prattrs) on (true)
where prrelid = 38168 and p.pubname in ('pub1', 'pub2')
)
SELECT a.attname,
   a.atttypid,
   a.attnum = ANY(i.indkey)
  FROM pg_catalog.pg_attribute a
  LEFT JOIN pg_catalog.pg_index i
   ON (i.indexrelid = pg_get_replica_identity_index(38168)),
 published_cols
 WHERE a.attnum > 0::pg_catalog.int2
   AND NOT a.attisdropped and a.attgenerated = ''
   AND a.attrelid = 38168
   AND (published_cols.attrs IS NULL OR attnum = ANY(published_cols.attrs))
 ORDER BY a.attnum;

This returns all columns if at least one publication has a NULL prattrs,
or only the union of columns listed in all publications, if all
publications have a list of columns.

(I was worried about obtaining the list of publications, but it turns
out that it's already as a convenient list of OIDs in the MySubscription
struct.)

With this, we can remove the second query added by Rahila's original patch to
filter out nonpublished columns.

I still need to add pg_partition_tree() in order to search for
publications containing a partition ancestor.  I'm not yet sure what
happens (and what *should* happen) if an ancestor is part of a
publication and the partition is also part of a publication, and the
column lists differ.

-- 
Álvaro Herrera  Valdivia, Chile  —  https://www.EnterpriseDB.com/
Al principio era UNIX, y UNIX habló y dijo: "Hello world\n".
No dijo "Hello New Jersey\n", ni "Hello USA\n".




Re: Foreign key joins revisited

2021-12-27 Thread Joel Jacobson
On Mon, Dec 27, 2021, at 17:03, Isaac Morland wrote:
> On Mon, 27 Dec 2021 at 10:20, Joel Jacobson  wrote:
> 
> Foreign key constraint names have been given the same names as the referenced 
> tables.
>
> While I agree this could be a simple approach in many real cases for having 
> easy to understand FK constraint names, I 
> wonder if for illustration and explaining the feature if it might work better 
> to use names that are completely unique so that 
> it's crystal clear that the names are constraint names, not table names.

Good point, I agree. New version below:

FROM permission p
LEFT JOIN KEY p.permission_role_id_fkey r
LEFT JOIN team_role tr KEY team_role_role_id_fkey REF r
LEFT JOIN KEY tr.team_role_team_id_fkey t
LEFT JOIN user_role ur KEY user_role_role_id_fkey REF r
LEFT JOIN KEY ur.user_role_user_id_fkey u
WHERE p.id = 1;

Thoughts?

/Joel

Re: default to to ON_ERROR_STOP=on (Re: psql: exit status with multiple -c and -f)

2021-12-27 Thread Pavel Stehule
po 27. 12. 2021 v 17:10 odesílatel Justin Pryzby 
napsal:

> On Mon, Dec 06, 2021 at 09:08:56AM -0600, Justin Pryzby wrote:
> > I raised this issue a few years ago.
> >
> https://www.postgresql.org/message-id/20181217175841.GS13019%40telsasoft.com
> >
> > |[pryzbyj@database ~]$ psql -v VERBOSITY=terse ts -xtc 'ONE' -c "SELECT
> 'TWO'"; echo "exit status $?"
> > |ERROR:  syntax error at or near "ONE" at character 1
> > |?column? | TWO
> > |
> > |exit status 0
> >
> > The documentation doen't say what the exit status should be in this case:
> > | psql returns 0 to the shell if it finished normally, 1 if a fatal
> error of its own occurs (e.g., out of memory, file not found), 2 if the
> connection to the server went bad and the session was not interactive, and
> 3 if an error occurred in a script and the variable ON_ERROR_STOP was set.
> >
> > It returns 1 if the final command fails, even though it's not a "fatal
> error"
> > (it would've happily kept running more commands).
> >
> > | x=`some_command_that_fails`
> > | rm -fr "$x/$y # removes all your data
> >
> > | psql -c "begin; C REATE TABLE newtable(LIKE oldtable) INSERT INTO
> newtable SELECT * FROM oldtable; commit" -c "DROP TABLE oldtable
> > | psql -v VERBOSITY=terse ts -xtc '0CREATE TABLE newtbl(i int)' -c
> 'INSERT INTO newtbl SELECT * FROM tbl' -c 'DROP TABLE IF EXISTS tbl' -c
> 'ALTER TABLE newtbl RENAME TO tbl'; echo ret=$?
> >
> > David J suggested to change the default value of ON_ERROR_STOP.  The exit
> > status in the non-default case would have to be documented.  That's one
> > solution, and allows the old behavior if anybody wants it.  That
> probably does
> > what most people want, too.  This is more likely to expose a real
> problem that
> > someone would have missed than to break a legitimate use.  That doesn't
> appear
> > to break regression tests at all.
>
> I was wrong - the regression tests specifically exercise failure cases, so
> the
> scripts must continue after errors.
>
> I think the current behavior of the regression test SQL scripts is exactly
> the
> opposite of what's desirable for almost all other scripts.  The attached
> makes
> ON_ERROR_STOP the default, and runs the regression tests with
> ON_ERROR_STOP=0.
>
> Is it viable to consider changing this ?
>

I have not any problems with the proposed feature, and I agree so proposed
behavior is more practical for users. Unfortunately, it breaks a lot of
applications' regress tests, but it can be fixed easily, and it is better
to do it quickly without more complications.

Regards

Pavel


default to to ON_ERROR_STOP=on (Re: psql: exit status with multiple -c and -f)

2021-12-27 Thread Justin Pryzby
On Mon, Dec 06, 2021 at 09:08:56AM -0600, Justin Pryzby wrote:
> I raised this issue a few years ago.
> https://www.postgresql.org/message-id/20181217175841.GS13019%40telsasoft.com
> 
> |[pryzbyj@database ~]$ psql -v VERBOSITY=terse ts -xtc 'ONE' -c "SELECT 
> 'TWO'"; echo "exit status $?"
> |ERROR:  syntax error at or near "ONE" at character 1
> |?column? | TWO
> |
> |exit status 0
> 
> The documentation doen't say what the exit status should be in this case:
> | psql returns 0 to the shell if it finished normally, 1 if a fatal error of 
> its own occurs (e.g., out of memory, file not found), 2 if the connection to 
> the server went bad and the session was not interactive, and 3 if an error 
> occurred in a script and the variable ON_ERROR_STOP was set.
> 
> It returns 1 if the final command fails, even though it's not a "fatal error"
> (it would've happily kept running more commands).
> 
> | x=`some_command_that_fails`
> | rm -fr "$x/$y # removes all your data
> 
> | psql -c "begin; C REATE TABLE newtable(LIKE oldtable) INSERT INTO newtable 
> SELECT * FROM oldtable; commit" -c "DROP TABLE oldtable
> | psql -v VERBOSITY=terse ts -xtc '0CREATE TABLE newtbl(i int)' -c 'INSERT 
> INTO newtbl SELECT * FROM tbl' -c 'DROP TABLE IF EXISTS tbl' -c 'ALTER TABLE 
> newtbl RENAME TO tbl'; echo ret=$?
> 
> David J suggested to change the default value of ON_ERROR_STOP.  The exit
> status in the non-default case would have to be documented.  That's one
> solution, and allows the old behavior if anybody wants it.  That probably does
> what most people want, too.  This is more likely to expose a real problem that
> someone would have missed than to break a legitimate use.  That doesn't appear
> to break regression tests at all.

I was wrong - the regression tests specifically exercise failure cases, so the
scripts must continue after errors.

I think the current behavior of the regression test SQL scripts is exactly the
opposite of what's desirable for almost all other scripts.  The attached makes
ON_ERROR_STOP the default, and runs the regression tests with ON_ERROR_STOP=0.

Is it viable to consider changing this ?
>From 5a54d265bff7565bd311f5e0d4115efb615f172f Mon Sep 17 00:00:00 2001
From: Justin Pryzby 
Date: Wed, 8 Dec 2021 22:30:21 -0600
Subject: [PATCH] psql: default to ON_ERROR_STOP

---
 src/bin/psql/startup.c | 1 +
 src/test/regress/pg_regress_main.c | 2 +-
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/startup.c b/src/bin/psql/startup.c
index f7ea4ce3d46..47cc37e6bc1 100644
--- a/src/bin/psql/startup.c
+++ b/src/bin/psql/startup.c
@@ -200,6 +200,7 @@ main(int argc, char *argv[])
 
 	/* Default values for variables (that don't match the result of \unset) */
 	SetVariableBool(pset.vars, "AUTOCOMMIT");
+	SetVariableBool(pset.vars, "ON_ERROR_STOP");
 	SetVariable(pset.vars, "PROMPT1", DEFAULT_PROMPT1);
 	SetVariable(pset.vars, "PROMPT2", DEFAULT_PROMPT2);
 	SetVariable(pset.vars, "PROMPT3", DEFAULT_PROMPT3);
diff --git a/src/test/regress/pg_regress_main.c b/src/test/regress/pg_regress_main.c
index 1524676f3b3..8b1f456a674 100644
--- a/src/test/regress/pg_regress_main.c
+++ b/src/test/regress/pg_regress_main.c
@@ -82,7 +82,7 @@ psql_start_test(const char *testname,
 	   bindir ? bindir : "",
 	   bindir ? "/" : "",
 	   dblist->str,
-	   "-v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on",
+	   "-v HIDE_TABLEAM=on -v HIDE_TOAST_COMPRESSION=on -v ON_ERROR_STOP=no",
 	   infile,
 	   outfile);
 	if (offset >= sizeof(psql_cmd))
-- 
2.17.0



Re: Foreign key joins revisited

2021-12-27 Thread Isaac Morland
On Mon, 27 Dec 2021 at 10:20, Joel Jacobson  wrote:


> Foreign key constraint names have been given the same names as the
> referenced tables.
>

While I agree this could be a simple approach in many real cases for having
easy to understand FK constraint names, I wonder if for illustration and
explaining the feature if it might work better to use names that are
completely unique so that it's crystal clear that the names are constraint
names, not table names.


Re: why does reindex invalidate relcache without modifying system tables

2021-12-27 Thread Tom Lane
wenjing zeng  writes:
> I found that in the index_update_stats function, i.e. the CREATE 
> INDEX/REINDEX/Truncate INDEX process,
> relchche is invalidated whether the index information is updated. I want to 
> know why you're did this

Did you read the function's header comment?  It says

 * NOTE: an important side-effect of this operation is that an SI invalidation
 * message is sent out to all backends --- including me --- causing relcache
 * entries to be flushed or updated with the new data.  This must happen even
 * if we find that no change is needed in the pg_class row.  When updating
 * a heap entry, this ensures that other backends find out about the new
 * index.  When updating an index, it's important because some index AMs
 * expect a relcache flush to occur after REINDEX.

That is, what we need to force an update of is either the relcache's
rd_indexlist list (for a table) or rd_amcache (for an index).

In the REINDEX case, we could conceivably skip the flush on the table,
but not on the index.  I don't think it's worth worrying about though,
because REINDEX will very probably have an update for the table's
physical size data (relpages and/or reltuples), so that it's unlikely
that the no-change path would be taken anyway.

regards, tom lane




Re: Foreign key joins revisited

2021-12-27 Thread Sascha Kuhl
Joel Jacobson  schrieb am Mo., 27. Dez. 2021, 16:21:

> >On Mon, Dec 27, 2021, at 15:48, Isaac Morland wrote:
> >I thought the proposal was to give the FK constraint name.
> >However, if the idea now is to allow leaving that out also if there
> >is only one FK, then that's also OK as long as people understand it can
> break in the same way NATURAL JOIN can break
> >when columns are added later. For that matter, a join mentioning column
> names can break if the columns are changed. But
> >breakage where the query no longer compiles are better than ones where it
> suddenly means something very different so
> >overall I wouldn't worry about this too much.
>
> Yes, my proposal was indeed to give the FK constraint name.
> I just commented on Corey's different proposal that instead specified FK
> columns.
> I agree with your reasoning regarding the trade-offs and problems with
> such a proposal.
>
> I still see more benefits in using the FK constraint name though.
>
> I have made some new progress on the idea since last proposal:
>
> SYNTAX
>
> join_type JOIN KEY referencing_alias.fk_name [ [ AS ] alias ]
>
> join_type table_name [ [ AS ] alias ] KEY fk_name REF referenced_alias
>
> EXAMPLE
>
> FROM permission p
> LEFT JOIN KEY p.role r
> LEFT JOIN team_role tr KEY role REF r
> LEFT JOIN KEY tr.team t
> LEFT JOIN user_role ur KEY role REF r
> LEFT JOIN KEY ur.user u
> WHERE p.id = 1;
>


Ref = in and to, great

>
> Foreign key constraint names have been given the same names as the
> referenced tables.
>
> Thoughts?
>
> /Joel
>


Re: Foreign key joins revisited

2021-12-27 Thread Joel Jacobson
>On Mon, Dec 27, 2021, at 15:48, Isaac Morland wrote:
>I thought the proposal was to give the FK constraint name.
>However, if the idea now is to allow leaving that out also if there 
>is only one FK, then that's also OK as long as people understand it can break 
>in the same way NATURAL JOIN can break 
>when columns are added later. For that matter, a join mentioning column names 
>can break if the columns are changed. But 
>breakage where the query no longer compiles are better than ones where it 
>suddenly means something very different so 
>overall I wouldn't worry about this too much.

Yes, my proposal was indeed to give the FK constraint name.
I just commented on Corey's different proposal that instead specified FK 
columns.
I agree with your reasoning regarding the trade-offs and problems with such a 
proposal.
 
I still see more benefits in using the FK constraint name though.

I have made some new progress on the idea since last proposal:

SYNTAX

join_type JOIN KEY referencing_alias.fk_name [ [ AS ] alias ]

join_type table_name [ [ AS ] alias ] KEY fk_name REF referenced_alias

EXAMPLE

FROM permission p
LEFT JOIN KEY p.role r
LEFT JOIN team_role tr KEY role REF r
LEFT JOIN KEY tr.team t
LEFT JOIN user_role ur KEY role REF r
LEFT JOIN KEY ur.user u
WHERE p.id = 1;

Foreign key constraint names have been given the same names as the referenced 
tables.

Thoughts?

/Joel

Re: Add Boolean node

2021-12-27 Thread Alvaro Herrera
On 2021-Dec-27, Peter Eisentraut wrote:

> This patch adds a new node type Boolean, to go alongside the "value" nodes
> Integer, Float, String, etc.  This seems appropriate given that Boolean
> values are a fundamental part of the system and are used a lot.

I like the idea.  I'm surprised that there is no notational savings in
the patch, however.

> diff --git a/src/test/regress/expected/create_function_3.out 
> b/src/test/regress/expected/create_function_3.out
> index 3a4fd45147..e0c4bee893 100644
> --- a/src/test/regress/expected/create_function_3.out
> +++ b/src/test/regress/expected/create_function_3.out
> @@ -403,7 +403,7 @@ SELECT pg_get_functiondef('functest_S_13'::regproc);
>LANGUAGE sql+
>   BEGIN ATOMIC +
>SELECT 1;   +
> -  SELECT false AS bool;   +
> +  SELECT false;   +
>   END  +

Hmm, interesting side-effect: we no longer assign a column name in this
case so it remains "?column?", just like it happens for other datatypes.
This seems okay to me.  (This is also what causes the changes in the
isolationtester expected output.)

-- 
Álvaro Herrera   39°49'30"S 73°17'W  —  https://www.EnterpriseDB.com/
"Ni aún el genio muy grande llegaría muy lejos
si tuviera que sacarlo todo de su propio interior" (Goethe)




Re: Add Boolean node

2021-12-27 Thread Tom Lane
Ashutosh Bapat  writes:
> That looks like a good change. I wonder what motivates that now? Why
> wasn't it added when the usages grew?

You'd have to find some of the original Berkeley people to get an
answer for that.  Possibly it's got something to do with the fact
that C didn't have a separate bool type back then ... or, going
even further back, that LISP didn't either.  In any case, it seems
like a plausible improvement now.

Didn't really read the patch in any detail, but I did have one idea:
I think that the different things-that-used-to-be-Value-nodes ought to
use different field names, say ival, rval, bval, sval not just "val".
That makes it more likely that you'd catch any code that is doing the
wrong thing and not going through one of the access macros.

regards, tom lane




Re: Foreign key joins revisited

2021-12-27 Thread Isaac Morland
On Mon, 27 Dec 2021 at 03:22, Joel Jacobson  wrote:


> However, I see one problem with leaving out the key columns:
> First, there is only one FK in permission pointing to role, and we write a
> query leaving out the key columns.
> Then, another different FK in permission pointing to role is later added,
> and our old query is suddenly in trouble.
>

I thought the proposal was to give the FK constraint name. However, if the
idea now is to allow leaving that out also if there is only one FK, then
that's also OK as long as people understand it can break in the same way
NATURAL JOIN can break when columns are added later. For that matter, a
join mentioning column names can break if the columns are changed. But
breakage where the query no longer compiles are better than ones where it
suddenly means something very different so overall I wouldn't worry about
this too much.


Re: Is there a way (except from server logs) to know the kind of on-going/last checkpoint?

2021-12-27 Thread Bharath Rupireddy
On Thu, Dec 9, 2021 at 12:08 PM Bharath Rupireddy
 wrote:
>
> On Wed, Dec 8, 2021 at 7:43 AM Bharath Rupireddy
>  wrote:
> >
> > On Wed, Dec 8, 2021 at 7:34 AM Tomas Vondra
> >  wrote:
> > > >> I agree it might be useful to provide information about the nature of
> > > >> the checkpoint, and perhaps even PID of the backend that triggered it
> > > >> (although that may be tricky, if the backend terminates).
> > >
> > > >> I'm not sure about adding it to control data, though. That doesn't seem
> > > >> like a very good match for something that's mostly for monitoring.
> > > >
> > > > Having it in the control data file (along with the existing checkpoint
> > > > information) will be helpful to know what was the last checkpoint
> > > > information and we can use the existing pg_control_checkpoint function
> > > > or the tool to emit that info. I plan to add an int16 flag as
> > > > suggested by Justin in this thread and come up with a patch.
> > > >
> > > OK, although I'm not sure it's all that useful (if we have that in some
> > > sort of system view).
> >
> > If the server is down, the control file will help. Since we already
> > have the other checkpoint info in the control file, it's much more
> > useful and sensible to have this extra piece of missing information
> > (checkpoint kind) there.
>
> Here's the patch that adds the last checkpoint kind to the control
> file, displayed as an output in the pg_controldata tool and also
> exposes it via the pg_control_checkpoint function.

Here's v2, rebased onto the latest master.

Regards,
Bharath Rupireddy.


v2-0001-add-last-checkpoint-kind-to-pg_control-file.patch
Description: Binary data


Re: refactoring basebackup.c

2021-12-27 Thread tushar

On 11/22/21 11:05 PM, Jeevan Ladhe wrote:

Please find the lz4 compression patch here that basically has:

Thanks, Could you please rebase your patch, it is failing at my end -

[edb@centos7tushar pg15_lz]$ git apply /tmp/v8-0001-LZ4-compression.patch
error: patch failed: doc/src/sgml/ref/pg_basebackup.sgml:230
error: doc/src/sgml/ref/pg_basebackup.sgml: patch does not apply
error: patch failed: src/backend/replication/Makefile:19
error: src/backend/replication/Makefile: patch does not apply
error: patch failed: src/backend/replication/basebackup.c:64
error: src/backend/replication/basebackup.c: patch does not apply
error: patch failed: src/include/replication/basebackup_sink.h:285
error: src/include/replication/basebackup_sink.h: patch does not apply

--
regards,tushar
EnterpriseDB  https://www.enterprisedb.com/
The Enterprise PostgreSQL Company





RE: row filtering for logical replication

2021-12-27 Thread houzj.f...@fujitsu.com
On Mon, Dec 27, 2021 9:16 PM houzj.f...@fujitsu.com  
wrote:
> On Thur, Dec 23, 2021 4:28 PM Peter Smith  wrote:
> > Here is the v54* patch set:
> 
> Attach the v55 patch set which add the following testcases in 0003 patch.
Sorry for the typo here, I mean the tests are added 0002 patch.

Best regards,
Hou zj


Re: Add Boolean node

2021-12-27 Thread Ashutosh Bapat
That looks like a good change. I wonder what motivates that now? Why
wasn't it added when the usages grew? Are there more Boolean usages
planned?

I ask because this code change will affect ability to automatically
cherry-pick some of the patches.

defGetBoolean() - please update the comment in the default to case to
mention defGetString along with opt_boolean_or_string production.
Reading the existing code in that function, one would wonder why to
use true and false over say on and off. But true/false seems a natural
choice. So that's fine.

defGetBoolean() and nodeRead() could use a common function to parse a
boolean string. The code in nodeRead() seems to assume that any string
starting with "t" will represent value true. Is that right?

We are using literal constants "true"/"false"  at many places. This
patch adds another one. I am wondering whether it makes sense to add
#define TRUE_STR, FALSE_STR and use it everywhere for consistency and
correctness.

For the sake of consistency (again :)), we should have a function to
return string representation of a Boolean node and use it in both
defGetString and _outBoolean().

Are the expected output changes like below necessary? Might affect
backward compatibility for applications.
-bool
-
-t
+?column?
+
+t

On Mon, Dec 27, 2021 at 2:32 PM Peter Eisentraut
 wrote:
>
>
> This patch adds a new node type Boolean, to go alongside the "value"
> nodes Integer, Float, String, etc.  This seems appropriate given that
> Boolean values are a fundamental part of the system and are used a lot.
>
> Before, SQL-level Boolean constants were represented by a string with
> a cast, and internal Boolean values in DDL commands were usually
> represented by Integer nodes.  This takes the place of both of these
> uses, making the intent clearer and having some amount of type safety.



-- 
Best Wishes,
Ashutosh Bapat




Re: psql - add SHOW_ALL_RESULTS option

2021-12-27 Thread Peter Eisentraut

On 23.12.21 12:40, Fabien COELHO wrote:
In [0], it was reported that certain replication commands result in 
infinite loops because of faulty error handling.  This still happens.  
I wrote a test for it, attached here.  (I threw in a few more basic 
tests, just to have some more coverage that was lacking, and to have a 
file to put the new test in.)


Hmmm… For some unclear reason on errors on a PGRES_COPY_* state 
PQgetResult keeps on returning an empty result. PQexec manually ignores 
it, so I did the same with a comment, but for me the real bug is somehow 
in PQgetResult behavior…


In [1], it was reported that server crashes produce duplicate error 
messages. This also still happens.  I didn't write a test for it, 
maybe you have an idea.  (Obviously, we could check whether the error 
message is literally there twice in the output, but that doesn't seem 
very general.)  But it's easy to test manually: just have psql 
connect, shut down the server, then run a query.


This is also a feature/bug of libpq which happens to be hidden by 
PQexec: when one command crashes PQgetResult actually returns *2* 
results. First one with the FATAL message, second one when libpq figures 
out that the connection was lost with the second message appended to the 
first. PQexec just happen to silently ignore the first result. I added a 
manual reset of the error message when first shown so that it is not 
shown twice. It is unclear to me whether the reset should be somewhere 
in libpq instead. I added a voluntary crash at the end of the psql test.


I agree that these two behaviors in libpq are dubious, especially the 
second one.  I want to spend some time analyzing this more and see if 
fixes in libpq might be appropriate.





Re: Add Boolean node

2021-12-27 Thread Pavel Stehule
po 27. 12. 2021 v 13:05 odesílatel Sascha Kuhl 
napsal:

>
>
> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
> 12:28:
>
>>
>>
>> po 27. 12. 2021 v 12:23 odesílatel Sascha Kuhl 
>> napsal:
>>
>>>
>>>
>>> Sascha Kuhl  schrieb am Mo., 27. Dez. 2021,
>>> 12:13:
>>>


 Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
 11:49:

> Hi
>
> po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
> napsal:
>
>> You think, all values are valid. Is a higher german order valid for
>> Turkey, that only know baskets, as a Form of order. For me not all forms 
>> of
>> all are valid for all. You cannot Export or Import food that You dislike,
>> because it would hurt you. Do you have dishes that you dislike? Is all
>> valid for you and your culture.
>>
>> It is ok that this is an internal feature, that is not cultural
>> dependent. Iwanted to give you my Interpretation of this Feature. It is 
>> ok
>> It doesn't fit 
>>
>
> Please, don't use top posting mode in this mailing list
> https://en.wikipedia.org/wiki/Posting_style#Top-posting
>

 I will read and learn on that. Thanks for the hint.


> This is an internal feature - Node structures are not visible from SQL
> level. And internal features will be faster and less complex, if we don't
> need to implement cultural dependency there. So False is just only false,
> and not "false" or "lez" or "nepravda" or "Marchen" any other.
>
> On a custom level it is a different situation. Although I am not sure
> if it is a good idea to implement local dependency for boolean type. In
> Czech language we have two related words for "false" - "lez" and
> "nepravda". And nothing is used in IT. But we use Czech (German) format
> date (and everywhere in code ISO format shou lld be preferred), and we use
> czech sorting. In internal things less complexity is better (higher
> complexity means lower safety) . On a custom level, anybody can do what
> they like.
>

>>> If you See databases as a tree, buche like books, the stem is internal,
>>> less complexity, strong and safe. The custom level are the bows and leafs.
>>> Ever leaf gets the ingredients it likes, but all are of the same type.
>>>
>>
>> again - Node type is not equal to data type.
>>
>
> Did you know that different culture have different trees. You read that.
> The Chinese Bonsai reflects Chinese Société, as well as the german buche
> reflects Verwaltung
>
> Thanks for the separation of node and data. If you consider keys, ie.
> Indexes trees, keys and nodes can be easily the same, in a simulation.
> Thanks for your view.
>

look at Postgres source code , please.
https://github.com/postgres/postgres/tree/master/src/backend/nodes. In this
case nodes have no relation to the index's tree.

Regards

Pavel



>> Regards
>>
>> Pavel
>>
>>


Re: Add Boolean node

2021-12-27 Thread Sascha Kuhl
Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
12:28:

>
>
> po 27. 12. 2021 v 12:23 odesílatel Sascha Kuhl 
> napsal:
>
>>
>>
>> Sascha Kuhl  schrieb am Mo., 27. Dez. 2021, 12:13:
>>
>>>
>>>
>>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>>> 11:49:
>>>
 Hi

 po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
 napsal:

> You think, all values are valid. Is a higher german order valid for
> Turkey, that only know baskets, as a Form of order. For me not all forms 
> of
> all are valid for all. You cannot Export or Import food that You dislike,
> because it would hurt you. Do you have dishes that you dislike? Is all
> valid for you and your culture.
>
> It is ok that this is an internal feature, that is not cultural
> dependent. Iwanted to give you my Interpretation of this Feature. It is ok
> It doesn't fit 
>

 Please, don't use top posting mode in this mailing list
 https://en.wikipedia.org/wiki/Posting_style#Top-posting

>>>
>>> I will read and learn on that. Thanks for the hint.
>>>
>>>
 This is an internal feature - Node structures are not visible from SQL
 level. And internal features will be faster and less complex, if we don't
 need to implement cultural dependency there. So False is just only false,
 and not "false" or "lez" or "nepravda" or "Marchen" any other.

 On a custom level it is a different situation. Although I am not sure
 if it is a good idea to implement local dependency for boolean type. In
 Czech language we have two related words for "false" - "lez" and
 "nepravda". And nothing is used in IT. But we use Czech (German) format
 date (and everywhere in code ISO format shou lld be preferred), and we use
 czech sorting. In internal things less complexity is better (higher
 complexity means lower safety) . On a custom level, anybody can do what
 they like.

>>>
>> If you See databases as a tree, buche like books, the stem is internal,
>> less complexity, strong and safe. The custom level are the bows and leafs.
>> Ever leaf gets the ingredients it likes, but all are of the same type.
>>
>
> again - Node type is not equal to data type.
>

Did you know that different culture have different trees. You read that.
The Chinese Bonsai reflects Chinese Société, as well as the german buche
reflects Verwaltung

Thanks for the separation of node and data. If you consider keys, ie.
Indexes trees, keys and nodes can be easily the same, in a simulation.
Thanks for your view.

>
> Regards
>
> Pavel
>
>


Re: Add Boolean node

2021-12-27 Thread Pavel Stehule
po 27. 12. 2021 v 12:23 odesílatel Sascha Kuhl 
napsal:

>
>
> Sascha Kuhl  schrieb am Mo., 27. Dez. 2021, 12:13:
>
>>
>>
>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>> 11:49:
>>
>>> Hi
>>>
>>> po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
>>> napsal:
>>>
 You think, all values are valid. Is a higher german order valid for
 Turkey, that only know baskets, as a Form of order. For me not all forms of
 all are valid for all. You cannot Export or Import food that You dislike,
 because it would hurt you. Do you have dishes that you dislike? Is all
 valid for you and your culture.

 It is ok that this is an internal feature, that is not cultural
 dependent. Iwanted to give you my Interpretation of this Feature. It is ok
 It doesn't fit 

>>>
>>> Please, don't use top posting mode in this mailing list
>>> https://en.wikipedia.org/wiki/Posting_style#Top-posting
>>>
>>
>> I will read and learn on that. Thanks for the hint.
>>
>>
>>> This is an internal feature - Node structures are not visible from SQL
>>> level. And internal features will be faster and less complex, if we don't
>>> need to implement cultural dependency there. So False is just only false,
>>> and not "false" or "lez" or "nepravda" or "Marchen" any other.
>>>
>>> On a custom level it is a different situation. Although I am not sure if
>>> it is a good idea to implement local dependency for boolean type. In Czech
>>> language we have two related words for "false" - "lez" and "nepravda". And
>>> nothing is used in IT. But we use Czech (German) format date (and
>>> everywhere in code ISO format shou lld be preferred), and we use czech
>>> sorting. In internal things less complexity is better (higher complexity
>>> means lower safety) . On a custom level, anybody can do what they like.
>>>
>>
> If you See databases as a tree, buche like books, the stem is internal,
> less complexity, strong and safe. The custom level are the bows and leafs.
> Ever leaf gets the ingredients it likes, but all are of the same type.
>

again - Node type is not equal to data type.

Regards

Pavel


Re: Add Boolean node

2021-12-27 Thread Sascha Kuhl
Sascha Kuhl  schrieb am Mo., 27. Dez. 2021, 12:13:

>
>
> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
> 11:49:
>
>> Hi
>>
>> po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
>> napsal:
>>
>>> You think, all values are valid. Is a higher german order valid for
>>> Turkey, that only know baskets, as a Form of order. For me not all forms of
>>> all are valid for all. You cannot Export or Import food that You dislike,
>>> because it would hurt you. Do you have dishes that you dislike? Is all
>>> valid for you and your culture.
>>>
>>> It is ok that this is an internal feature, that is not cultural
>>> dependent. Iwanted to give you my Interpretation of this Feature. It is ok
>>> It doesn't fit 
>>>
>>
>> Please, don't use top posting mode in this mailing list
>> https://en.wikipedia.org/wiki/Posting_style#Top-posting
>>
>
> I will read and learn on that. Thanks for the hint.
>
>
>> This is an internal feature - Node structures are not visible from SQL
>> level. And internal features will be faster and less complex, if we don't
>> need to implement cultural dependency there. So False is just only false,
>> and not "false" or "lez" or "nepravda" or "Marchen" any other.
>>
>> On a custom level it is a different situation. Although I am not sure if
>> it is a good idea to implement local dependency for boolean type. In Czech
>> language we have two related words for "false" - "lez" and "nepravda". And
>> nothing is used in IT. But we use Czech (German) format date (and
>> everywhere in code ISO format shou lld be preferred), and we use czech
>> sorting. In internal things less complexity is better (higher complexity
>> means lower safety) . On a custom level, anybody can do what they like.
>>
>
If you See databases as a tree, buche like books, the stem is internal,
less complexity, strong and safe. The custom level are the bows and leafs.
Ever leaf gets the ingredients it likes, but all are of the same type.


>>
>
> I agree on that from a german point of view. This is great structure on a
> first guess.
>
>
>> Regards
>>
>> Pavel
>>
>>
>>>
>>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>>> 11:15:
>>>


 po 27. 12. 2021 v 11:08 odesílatel Sascha Kuhl 
 napsal:

> Can that boolean node be cultural dependent validation for the value?
> By the developer? By all?
>

 why?

 The boolean node is not a boolean type.

 This is an internal feature. There should not be any cultural dependency

 Regards

 Pavel


> Pavel Stehule  schrieb am Mo., 27. Dez.
> 2021, 10:09:
>
>>
>>
>> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
>> peter.eisentr...@enterprisedb.com> napsal:
>>
>>>
>>> This patch adds a new node type Boolean, to go alongside the "value"
>>> nodes Integer, Float, String, etc.  This seems appropriate given
>>> that
>>> Boolean values are a fundamental part of the system and are used a
>>> lot.
>>>
>>> Before, SQL-level Boolean constants were represented by a string with
>>> a cast, and internal Boolean values in DDL commands were usually
>>> represented by Integer nodes.  This takes the place of both of these
>>> uses, making the intent clearer and having some amount of type
>>> safety.
>>
>>
>> +1
>>
>> Regards
>>
>> Pavel
>>
>>


Re: Allow escape in application_name

2021-12-27 Thread Masahiko Sawada
On Mon, Dec 27, 2021 at 1:40 PM Fujii Masao  wrote:
>
>
>
> On 2021/12/27 10:40, kuroda.hay...@fujitsu.com wrote:
> > Dear Fujii-san, Horiguchi-san,
> >
> > I confirmed that the feature was committed but reverted the test.
> > Now I'm checking buildfarm.
>
> Attached is the patch that adds the regression test for 
> postgres_fdw.application_name. Could you review this?
>
> As Horiguchi-san suggested at [1], I split the test into two, and then 
> tweaked them as follows.
>
> 1. Set application_name option of a server option to 'fdw_%d%p'
> 2. Set postgres_fdw.application_name to 'fdw_%a%u%%'
>
> 'fdw_%d%p' and 'fdw_%a%u%%' still may be larger than NAMEDATALEN depending on 
> the regression test environment. To make the test stable even in that case, 
> the patch uses substring() is truncate application_name string in the test 
> query's condition to less than NAMEDATALEN.

Good idea. But the application_name is actually truncated to 63
characters (NAMEDATALEN - 1)? If so, we need to do substring(... for
63) instead.

Regards,

--
Masahiko Sawada
EDB:  https://www.enterprisedb.com/




Re: Add Boolean node

2021-12-27 Thread Sascha Kuhl
Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
11:49:

> Hi
>
> po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
> napsal:
>
>> You think, all values are valid. Is a higher german order valid for
>> Turkey, that only know baskets, as a Form of order. For me not all forms of
>> all are valid for all. You cannot Export or Import food that You dislike,
>> because it would hurt you. Do you have dishes that you dislike? Is all
>> valid for you and your culture.
>>
>> It is ok that this is an internal feature, that is not cultural
>> dependent. Iwanted to give you my Interpretation of this Feature. It is ok
>> It doesn't fit 
>>
>
> Please, don't use top posting mode in this mailing list
> https://en.wikipedia.org/wiki/Posting_style#Top-posting
>

I will read and learn on that. Thanks for the hint.


> This is an internal feature - Node structures are not visible from SQL
> level. And internal features will be faster and less complex, if we don't
> need to implement cultural dependency there. So False is just only false,
> and not "false" or "lez" or "nepravda" or "Marchen" any other.
>
> On a custom level it is a different situation. Although I am not sure if
> it is a good idea to implement local dependency for boolean type. In Czech
> language we have two related words for "false" - "lez" and "nepravda". And
> nothing is used in IT. But we use Czech (German) format date (and
> everywhere in code ISO format should be preferred), and we use czech
> sorting. In internal things less complexity is better (higher complexity
> means lower safety) . On a custom level, anybody can do what they like.
>

I agree on that from a german point of view. This is great structure on a
first guess.


> Regards
>
> Pavel
>
>
>>
>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>> 11:15:
>>
>>>
>>>
>>> po 27. 12. 2021 v 11:08 odesílatel Sascha Kuhl 
>>> napsal:
>>>
 Can that boolean node be cultural dependent validation for the value?
 By the developer? By all?

>>>
>>> why?
>>>
>>> The boolean node is not a boolean type.
>>>
>>> This is an internal feature. There should not be any cultural dependency
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>
 Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
 10:09:

>
>
> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
> peter.eisentr...@enterprisedb.com> napsal:
>
>>
>> This patch adds a new node type Boolean, to go alongside the "value"
>> nodes Integer, Float, String, etc.  This seems appropriate given that
>> Boolean values are a fundamental part of the system and are used a
>> lot.
>>
>> Before, SQL-level Boolean constants were represented by a string with
>> a cast, and internal Boolean values in DDL commands were usually
>> represented by Integer nodes.  This takes the place of both of these
>> uses, making the intent clearer and having some amount of type safety.
>
>
> +1
>
> Regards
>
> Pavel
>
>


Re: Add Boolean node

2021-12-27 Thread Pavel Stehule
Hi

po 27. 12. 2021 v 11:24 odesílatel Sascha Kuhl 
napsal:

> You think, all values are valid. Is a higher german order valid for
> Turkey, that only know baskets, as a Form of order. For me not all forms of
> all are valid for all. You cannot Export or Import food that You dislike,
> because it would hurt you. Do you have dishes that you dislike? Is all
> valid for you and your culture.
>
> It is ok that this is an internal feature, that is not cultural dependent.
> Iwanted to give you my Interpretation of this Feature. It is ok It doesn't
> fit 
>

Please, don't use top posting mode in this mailing list
https://en.wikipedia.org/wiki/Posting_style#Top-posting

This is an internal feature - Node structures are not visible from SQL
level. And internal features will be faster and less complex, if we don't
need to implement cultural dependency there. So False is just only false,
and not "false" or "lez" or "nepravda" or "Marchen" any other.

On a custom level it is a different situation. Although I am not sure if it
is a good idea to implement local dependency for boolean type. In Czech
language we have two related words for "false" - "lez" and "nepravda". And
nothing is used in IT. But we use Czech (German) format date (and
everywhere in code ISO format should be preferred), and we use czech
sorting. In internal things less complexity is better (higher complexity
means lower safety) . On a custom level, anybody can do what they like.

Regards

Pavel


>
> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
> 11:15:
>
>>
>>
>> po 27. 12. 2021 v 11:08 odesílatel Sascha Kuhl 
>> napsal:
>>
>>> Can that boolean node be cultural dependent validation for the value? By
>>> the developer? By all?
>>>
>>
>> why?
>>
>> The boolean node is not a boolean type.
>>
>> This is an internal feature. There should not be any cultural dependency
>>
>> Regards
>>
>> Pavel
>>
>>
>>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>>> 10:09:
>>>


 po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
 peter.eisentr...@enterprisedb.com> napsal:

>
> This patch adds a new node type Boolean, to go alongside the "value"
> nodes Integer, Float, String, etc.  This seems appropriate given that
> Boolean values are a fundamental part of the system and are used a lot.
>
> Before, SQL-level Boolean constants were represented by a string with
> a cast, and internal Boolean values in DDL commands were usually
> represented by Integer nodes.  This takes the place of both of these
> uses, making the intent clearer and having some amount of type safety.


 +1

 Regards

 Pavel




Re: Add Boolean node

2021-12-27 Thread Sascha Kuhl
You think, all values are valid. Is a higher german order valid for Turkey,
that only know baskets, as a Form of order. For me not all forms of all are
valid for all. You cannot Export or Import food that You dislike, because
it would hurt you. Do you have dishes that you dislike? Is all valid for
you and your culture.

It is ok that this is an internal feature, that is not cultural dependent.
Iwanted to give you my Interpretation of this Feature. It is ok It doesn't
fit 

Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
11:15:

>
>
> po 27. 12. 2021 v 11:08 odesílatel Sascha Kuhl 
> napsal:
>
>> Can that boolean node be cultural dependent validation for the value? By
>> the developer? By all?
>>
>
> why?
>
> The boolean node is not a boolean type.
>
> This is an internal feature. There should not be any cultural dependency
>
> Regards
>
> Pavel
>
>
>> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
>> 10:09:
>>
>>>
>>>
>>> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
>>> peter.eisentr...@enterprisedb.com> napsal:
>>>

 This patch adds a new node type Boolean, to go alongside the "value"
 nodes Integer, Float, String, etc.  This seems appropriate given that
 Boolean values are a fundamental part of the system and are used a lot.

 Before, SQL-level Boolean constants were represented by a string with
 a cast, and internal Boolean values in DDL commands were usually
 represented by Integer nodes.  This takes the place of both of these
 uses, making the intent clearer and having some amount of type safety.
>>>
>>>
>>> +1
>>>
>>> Regards
>>>
>>> Pavel
>>>
>>>


Re: Add Boolean node

2021-12-27 Thread Julien Rouhaud
On Mon, Dec 27, 2021 at 5:09 PM Pavel Stehule  wrote:
>
> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut 
>  napsal:
>>
>> This patch adds a new node type Boolean, to go alongside the "value"
>> nodes Integer, Float, String, etc.  This seems appropriate given that
>> Boolean values are a fundamental part of the system and are used a lot.
>>
>> Before, SQL-level Boolean constants were represented by a string with
>> a cast, and internal Boolean values in DDL commands were usually
>> represented by Integer nodes.  This takes the place of both of these
>> uses, making the intent clearer and having some amount of type safety.
>
> +1

+1 too, looks like a good improvement.  The patch looks good to me,
although it's missing comment updates for at least nodeTokenType() and
nodeRead().




Re: Add Boolean node

2021-12-27 Thread Pavel Stehule
po 27. 12. 2021 v 11:08 odesílatel Sascha Kuhl 
napsal:

> Can that boolean node be cultural dependent validation for the value? By
> the developer? By all?
>

why?

The boolean node is not a boolean type.

This is an internal feature. There should not be any cultural dependency

Regards

Pavel


> Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
> 10:09:
>
>>
>>
>> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
>> peter.eisentr...@enterprisedb.com> napsal:
>>
>>>
>>> This patch adds a new node type Boolean, to go alongside the "value"
>>> nodes Integer, Float, String, etc.  This seems appropriate given that
>>> Boolean values are a fundamental part of the system and are used a lot.
>>>
>>> Before, SQL-level Boolean constants were represented by a string with
>>> a cast, and internal Boolean values in DDL commands were usually
>>> represented by Integer nodes.  This takes the place of both of these
>>> uses, making the intent clearer and having some amount of type safety.
>>
>>
>> +1
>>
>> Regards
>>
>> Pavel
>>
>>


Re: Add Boolean node

2021-12-27 Thread Sascha Kuhl
Can that boolean node be cultural dependent validation for the value? By
the developer? By all?

Pavel Stehule  schrieb am Mo., 27. Dez. 2021,
10:09:

>
>
> po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
> peter.eisentr...@enterprisedb.com> napsal:
>
>>
>> This patch adds a new node type Boolean, to go alongside the "value"
>> nodes Integer, Float, String, etc.  This seems appropriate given that
>> Boolean values are a fundamental part of the system and are used a lot.
>>
>> Before, SQL-level Boolean constants were represented by a string with
>> a cast, and internal Boolean values in DDL commands were usually
>> represented by Integer nodes.  This takes the place of both of these
>> uses, making the intent clearer and having some amount of type safety.
>
>
> +1
>
> Regards
>
> Pavel
>
>


why does reindex invalidate relcache without modifying system tables

2021-12-27 Thread wenjing zeng
Hi Tom

I would like to ask you about the details of index build.
I found that in the index_update_stats function, i.e. the CREATE 
INDEX/REINDEX/Truncate INDEX process,
relchche is invalidated whether the index information is updated. I want to 
know why you're did this
The code is:
if (dirty)
 {
heap_inplace_update(pg_class, tuple);
/* the above sends a cache inval message */ } 
else 
{
 /* no need to change tuple, but force relcache inval 
anyway */ 
 CacheInvalidateRelcacheByTuple(tuple); 
}

There's a special line of comment here, and I think you wrote that part for 
some reason.

The reason I ask this question is that 
1 similar places like the vac_update_relstats /vac_update_datfrozenxid function 
don't do this.
2 Local Temp table with ON COMMIT DELETE ROWS builds index for each transaction 
commit.
This causes relcache of the temp table to be rebuilt over and over again.

Looking forward to your reply.

Thanks


Wenjing






Re: Add Boolean node

2021-12-27 Thread Pavel Stehule
po 27. 12. 2021 v 10:02 odesílatel Peter Eisentraut <
peter.eisentr...@enterprisedb.com> napsal:

>
> This patch adds a new node type Boolean, to go alongside the "value"
> nodes Integer, Float, String, etc.  This seems appropriate given that
> Boolean values are a fundamental part of the system and are used a lot.
>
> Before, SQL-level Boolean constants were represented by a string with
> a cast, and internal Boolean values in DDL commands were usually
> represented by Integer nodes.  This takes the place of both of these
> uses, making the intent clearer and having some amount of type safety.


+1

Regards

Pavel


Add Boolean node

2021-12-27 Thread Peter Eisentraut


This patch adds a new node type Boolean, to go alongside the "value" 
nodes Integer, Float, String, etc.  This seems appropriate given that 
Boolean values are a fundamental part of the system and are used a lot.


Before, SQL-level Boolean constants were represented by a string with
a cast, and internal Boolean values in DDL commands were usually 
represented by Integer nodes.  This takes the place of both of these 
uses, making the intent clearer and having some amount of type safety.From 4e1ef56b5443fa11d981eb6e407dfc7c244dc60e Mon Sep 17 00:00:00 2001
From: Peter Eisentraut 
Date: Mon, 27 Dec 2021 09:52:05 +0100
Subject: [PATCH v1] Add Boolean node

Before, SQL-level boolean constants were represented by a string with
a cast, and internal Boolean values in DDL commands were usually
represented by Integer nodes.  This takes the place of both of these
uses, making the intent clearer and having some amount of type safety.
---
 contrib/postgres_fdw/postgres_fdw.c   | 32 +++---
 src/backend/commands/define.c |  4 +
 src/backend/commands/functioncmds.c   | 14 +--
 src/backend/commands/sequence.c   |  4 +-
 src/backend/commands/tsearchcmds.c|  9 ++
 src/backend/commands/user.c   | 28 +++---
 src/backend/nodes/copyfuncs.c | 16 +++
 src/backend/nodes/equalfuncs.c| 11 +++
 src/backend/nodes/nodeFuncs.c |  1 +
 src/backend/nodes/outfuncs.c  |  8 ++
 src/backend/nodes/read.c  |  5 +
 src/backend/nodes/value.c | 12 +++
 src/backend/parser/gram.y | 98 +--
 src/backend/parser/parse_node.c   |  8 ++
 src/backend/replication/repl_gram.y   | 14 +--
 src/include/nodes/nodes.h |  1 +
 src/include/nodes/parsenodes.h|  1 +
 src/include/nodes/value.h |  8 ++
 src/test/isolation/expected/ri-trigger.out| 60 ++--
 .../regress/expected/create_function_3.out|  2 +-
 20 files changed, 210 insertions(+), 126 deletions(-)

diff --git a/contrib/postgres_fdw/postgres_fdw.c 
b/contrib/postgres_fdw/postgres_fdw.c
index fa9a099f13..4d61d88d7b 100644
--- a/contrib/postgres_fdw/postgres_fdw.c
+++ b/contrib/postgres_fdw/postgres_fdw.c
@@ -103,7 +103,7 @@ enum FdwModifyPrivateIndex
FdwModifyPrivateTargetAttnums,
/* Length till the end of VALUES clause (as an Integer node) */
FdwModifyPrivateLen,
-   /* has-returning flag (as an Integer node) */
+   /* has-returning flag (as a Boolean node) */
FdwModifyPrivateHasReturning,
/* Integer list of attribute numbers retrieved by RETURNING */
FdwModifyPrivateRetrievedAttrs
@@ -122,11 +122,11 @@ enum FdwDirectModifyPrivateIndex
 {
/* SQL statement to execute remotely (as a String node) */
FdwDirectModifyPrivateUpdateSql,
-   /* has-returning flag (as an Integer node) */
+   /* has-returning flag (as a Boolean node) */
FdwDirectModifyPrivateHasReturning,
/* Integer list of attribute numbers retrieved by RETURNING */
FdwDirectModifyPrivateRetrievedAttrs,
-   /* set-processed flag (as an Integer node) */
+   /* set-processed flag (as a Boolean node) */
FdwDirectModifyPrivateSetProcessed
 };
 
@@ -280,9 +280,9 @@ typedef struct PgFdwAnalyzeState
  */
 enum FdwPathPrivateIndex
 {
-   /* has-final-sort flag (as an Integer node) */
+   /* has-final-sort flag (as a Boolean node) */
FdwPathPrivateHasFinalSort,
-   /* has-limit flag (as an Integer node) */
+   /* has-limit flag (as a Boolean node) */
FdwPathPrivateHasLimit
 };
 
@@ -1245,9 +1245,9 @@ postgresGetForeignPlan(PlannerInfo *root,
 */
if (best_path->fdw_private)
{
-   has_final_sort = intVal(list_nth(best_path->fdw_private,
+   has_final_sort = boolVal(list_nth(best_path->fdw_private,

 FdwPathPrivateHasFinalSort));
-   has_limit = intVal(list_nth(best_path->fdw_private,
+   has_limit = boolVal(list_nth(best_path->fdw_private,

FdwPathPrivateHasLimit));
}
 
@@ -1879,7 +1879,7 @@ postgresPlanForeignModify(PlannerInfo *root,
return list_make5(makeString(sql.data),
  targetAttrs,
  makeInteger(values_end_len),
- makeInteger((retrieved_attrs != NIL)),
+ makeBoolean((retrieved_attrs != NIL)),
  retrieved_attrs);
 }
 
@@ -1916,7 +1916,7 @@ postgresBeginForeignModify(ModifyTableState *mtstate,
 

Re: Foreign key joins revisited

2021-12-27 Thread Joel Jacobson
On Sun, Dec 26, 2021, at 23:25, Corey Huinker wrote:
> My second guess would be:
> FROM permission p
> LEFT JOIN role AS r ON [FOREIGN] KEY [(p.col1 [, p.col2 ...])]
>
> where the key spec is only required when there are multiple foreign keys in 
> permission pointing to role.
>
> But my first guess would be that the standards group won't get around to it.

It's a quite nice idea. It would definitively mean an improvement, compared to 
today's SQL.

Benefits:
* Ability to assert the join is actually performed on foreign key columns.
* Conciser thanks to not always having to specify all key columns on both sides.

However, I see one problem with leaving out the key columns:
First, there is only one FK in permission pointing to role, and we write a 
query leaving out the key columns.
Then, another different FK in permission pointing to role is later added, and 
our old query is suddenly in trouble.

/Joel