RE: Failed transaction statistics to measure the logical replication progress
On Mon, Feb 21, 2022 11:46 AM [email protected] wrote:
> On Saturday, February 19, 2022 12:00 AM [email protected] wrote:
> > On Friday, February 18, 2022 3:34 PM Tang, Haiying/唐 海英 wrote:
> > > On Wed, Jan 12, 2022 8:35 PM [email protected] wrote:
> > >
> > > 4) I noticed that the abort_count doesn't include aborted streaming
> > > transactions. Should we take this case into consideration?
> >
> > Hmm, we can add this into this column, when there's no objection.
> > I'm not sure but someone might say those should be separate columns.
>
> I've addressed this point in a new v23 patch, since there was no opinion
> on this so far. Kindly have a look at the attached one.

Thanks for updating the patch. I found a problem when using it.

When a replication worker exits, the transaction stats should be sent to the stats collector if they were not sent before because PGSTAT_STAT_INTERVAL had not been reached. But I saw that the stats weren't updated as expected.

I looked into it and found that the replication worker does send the transaction stats (if any) before it exits, but it gets an invalid subid in pgstat_send_subworker_xact_stats(), which leads to the following result:

postgres=# select pg_stat_get_subscription_worker(0, null);
 pg_stat_get_subscription_worker
---------------------------------
 (0,,2,0,00,"",)
(1 row)

I think that's because subid has already been cleaned up by the time we try to send the stats. I printed the value of before_shmem_exit_list; the functions in this list are called in shmem_exit() when the worker exits. logicalrep_worker_onexit() cleans up the worker info (including subid), and pgstat_shutdown_hook() sends the stats if there are any. logicalrep_worker_onexit() was called before pgstat_shutdown_hook():

(gdb) p before_shmem_exit_list
$1 = {{function = 0xa88f1e, arg = 0}, {function = 0xb619e7, arg = 0}, {function = 0xb07b5c, arg = 0}, {function = 0xabdd93, arg = 0}, {function = 0xe30c89, arg = 0}, {function = 0x0, arg = 0}}

Maybe we should make some modification to fix it.
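To make the ordering issue concrete, below is a toy sketch (not the server code; the names are simplified stand-ins): before_shmem_exit() callbacks run in reverse order of registration, so a callback registered later (like logicalrep_worker_onexit(), which clears the worker's subid) runs before one registered earlier (like pgstat_shutdown_hook(), which still needs subid to report the pending stats).

#include <stdio.h>

/* Toy model of the before_shmem_exit() list: callbacks run in LIFO order. */
typedef void (*exit_cb) (void);
static exit_cb cbs[8];
static int ncbs = 0;

static int worker_subid = 16394;   /* stands in for MyLogicalRepWorker->subid */

static void register_cb(exit_cb fn) { cbs[ncbs++] = fn; }

static void pgstat_shutdown_hook_toy(void)
{
    /* Sends any pending stats; needs a valid subid to be useful. */
    printf("sending xact stats for subid %d\n", worker_subid);
}

static void logicalrep_worker_onexit_toy(void)
{
    worker_subid = 0;              /* worker info (including subid) is cleared */
    printf("worker info cleared\n");
}

static void shmem_exit_toy(void)
{
    for (int i = ncbs - 1; i >= 0; i--)   /* reverse registration order */
        cbs[i]();
}

int main(void)
{
    register_cb(pgstat_shutdown_hook_toy);     /* registered early, runs last */
    register_cb(logicalrep_worker_onexit_toy); /* registered late, runs first */
    shmem_exit_toy();   /* prints "sending xact stats for subid 0" */
    return 0;
}

Regards,
Tang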
RE: Design of pg_stat_subscription_workers vs pgstats
On Tue, Feb 22, 2022 1:45 PM Masahiko Sawada wrote:
> I've attached a patch that changes pg_stat_subscription_workers view.
> It removes non-cumulative values such as error details such as
> error-XID and the error message from the view, and consequently the
> view now has only cumulative statistics counters: apply_error_count
> and sync_error_count. Since the new view name is under discussion I
> temporarily chose pg_stat_subscription_activity.

Thanks for your patch. A few comments:

1.
+       apply_error_count uint8
...
+       sync_error_count uint8

It seems that Postgres has no data type named uint8; should we change it to bigint?

2.
+# Wait for the table sync error to be reported.
+$node_subscriber->poll_query_until(
+	'postgres',
+	qq[
+SELECT apply_error_count = 0 AND sync_error_count > 0
+FROM pg_stat_subscription_activity
+WHERE subname = 'tap_sub'
+]) or die "Timed out while waiting for table sync error";

We want to check the table sync error here, but do we need to check "apply_error_count = 0"? I am not sure, but if the apply worker hit an unexpected error, that condition would make this test fail spuriously; a sketch of a relaxed check is below.
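For example, a minimal sketch of the relaxed query (keeping only the part this test actually cares about):

SELECT sync_error_count > 0
FROM pg_stat_subscription_activity
WHERE subname = 'tap_sub';

Regards,
Tang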
RE: Optionally automatically disable logical replication subscriptions on error
Hi Osumi-san,

I have a comment on the v21 patch.

I wonder if we really need subscription s2 in 028_disable_on_error.pl. For subscription s2, we only test some normal cases (which could also be tested with s1) and don't test any error case, which means it will never be automatically disabled. Is there any reason for creating subscription s2?

Regards,
Tang
RE: Design of pg_stat_subscription_workers vs pgstats
On Thu, Feb 24, 2022 9:33 AM Masahiko Sawada wrote:
> Thank you for the comments! I've attached the latest version patch
> that incorporated all comments I got so far. The primary change from
> the previous version is that the subscription statistics live globally
> rather than per-database.

Thanks for updating the patch. A few comments:

1. I think we should add some documentation for the stats_reset column in the pg_stat_subscription_activity view.

2.
+CREATE VIEW pg_stat_subscription_activity AS
+    SELECT
-        w.subid,
+        a.subid,
         s.subname,
...
+        a.apply_error_count,
+        a.sync_error_count,
+	a.stats_reset
+    FROM pg_subscription as s,
+         pg_stat_get_subscription_activity(oid) as a;

The line "a.stats_reset" uses a tab, and we'd better use spaces here.

Regards,
Tang
RE: Design of pg_stat_subscription_workers vs pgstats
On Mon, Feb 28, 2022 12:32 PM Amit Kapila wrote:
> On Mon, Feb 28, 2022 at 8:17 AM Masahiko Sawada wrote:
> > On Mon, Feb 28, 2022 at 11:33 AM Amit Kapila wrote:
> > > > (2) doc/src/sgml/monitoring.sgml
> > > >
> > > > +        Resets statistics for a single subscription shown in the
> > > > +        pg_stat_subscription_stats view to zero. If
> > > > +        the argument is NULL, reset statistics for all
> > > > +        subscriptions.
> > > >
> > > > I felt we could improve the first sentence.
> > > >
> > > > From:
> > > > Resets statistics for a single subscription shown in the..
> > > >
> > > > To(idea1):
> > > > Resets statistics for a single subscription defined by the argument to
> > > > zero.
> > >
> > > Okay, I can use this one.
> >
> > Are you going to remove the part "shown in the
> > pg_stat_subsctiption_stats view"? I think it's better to keep it in
> > order to make it clear which statistics the function resets as we have
> > pg_stat_subscription and pg_stat_subscription_stats.
>
> I decided to keep this part of the docs as it is and fixed a few other
> minor comments raised by you and Peter. Additionally, I have bumped
> the PGSTAT_FILE_FORMAT_ID. I'll push this patch tomorrow unless there
> are any other major comments.

Thanks for your patch. I have finished the review/test for this patch. The patch LGTM.

Regards,
Tang
Unbalanced tuple distribution among workers in underlying parallel select with serial insert
Hi Hackers,

Recently, I took some performance measurements for CREATE TABLE AS:
https://www.postgresql.org/message-id/34549865667a4a3bb330ebfd035f85d3%40G08CNEXMBPEKD05.g08.fujitsu.local

In doing so, I found an issue of unbalanced tuple distribution among workers (99% of the tuples read by one worker) in a specific case, which makes the underlying parallel select part give no more performance gain, as we would expect it to. It happens on master (HEAD).

I think this is not a normal phenomenon, because we pay the costs that parallel mode needs, but we don't get the benefits we want. So, is there a way to improve it to achieve the same benefits as plain parallel select?

Below are the test details:

1. test environment
CentOS 8.2, 128G RAM, 40 processors (Intel(R) Xeon(R) Silver 4210 CPU @ 2.20GHz), SAS disk

2. test execution
PSA test_uneven_workers.sql, which includes my test data and steps.

3. test results
CREATE TABLE ... AS SELECT ..., where the underlying select queries 130 million rows (about 5G data size) out of 200 million (about 8G source data size). Each case was run 30 times.

| max_parallel_workers_per_gather     | Execution Time (ms) | %reg |
|-------------------------------------|---------------------|------|
| max_parallel_workers_per_gather = 2 | 141002.030          | -1%  |
| max_parallel_workers_per_gather = 4 | 140957.221          | -1%  |
| max_parallel_workers_per_gather = 8 | 142445.061          |  0%  |
| max_parallel_workers_per_gather = 0 | 142580.535          |      |

According to the above results, we get almost no benefit, especially when we increase max_parallel_workers_per_gather to 8 or more. I think the parallel select doesn't achieve the desired performance because the tuples are unevenly distributed among the workers, as shown in the query plan below.

Query plan (max_parallel_workers_per_gather = 8; look at worker 4: 99% of the tuples were read by it):

postgres=# explain analyze verbose create table test1 as select func_restricted(),b,c from x1 where a%2=0 or a%3=0;
                                                           QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------
 Gather  (cost=1000.00..2351761.77 rows=1995002 width=12) (actual time=0.411..64451.616 rows=1 loops=1)
   Output: func_restricted(), b, c
   Workers Planned: 7
   Workers Launched: 7
   ->  Parallel Seq Scan on public.x1  (cost=0.00..1652511.07 rows=285000 width=8) (actual time=0.016..3553.626 rows=1667 loops=8)
         Output: b, c
         Filter: (((x1.a % 2) = 0) OR ((x1.a % 3) = 0))
         Rows Removed by Filter: 833
         Worker 0: actual time=0.014..21.415 rows=126293 loops=1
         Worker 1: actual time=0.015..21.564 rows=126293 loops=1
         Worker 2: actual time=0.014..21.575 rows=126294 loops=1
         Worker 3: actual time=0.016..21.701 rows=126293 loops=1
         Worker 4: actual time=0.019..28263.677 rows=132449393 loops=1
         Worker 5: actual time=0.019..21.470 rows=126180 loops=1
         Worker 6: actual time=0.015..34.441 rows=126293 loops=1
 Planning Time: 0.210 ms
 Execution Time: 142392.808 ms

Conditions under which it occurs:
1. The query plan is of the "serial insert + parallel select" kind.
2. The underlying select queries a large data set (e.g. 130 million rows out of 200 million). It doesn't happen with small data sets (millions of rows), from what I've tested so far.

Based on the above, IMHO, I guess the cause may be that the leader's write rate can't keep up with the workers' read rate, so the tuples of one worker get blocked in its queue and accumulate more and more. Any thoughts?
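For reference, here is a hedged sketch of the test setup (the attached script may differ in details; func_restricted is assumed to be PARALLEL RESTRICTED, which keeps the INSERT serial in the leader while the SELECT runs in parallel):

-- build the ~200 million row source table
CREATE TABLE x1 (a int, b int, c int);
INSERT INTO x1 SELECT i, i, i FROM generate_series(1, 200000000) i;

-- a parallel-restricted function must be evaluated in the leader, above the Gather
CREATE FUNCTION func_restricted() RETURNS int
LANGUAGE plpgsql PARALLEL RESTRICTED AS
$$ BEGIN RETURN 0; END; $$;

SET max_parallel_workers_per_gather = 8;
EXPLAIN ANALYZE VERBOSE
CREATE TABLE test1 AS
SELECT func_restricted(), b, c FROM x1 WHERE a % 2 = 0 OR a % 3 = 0;

Regards,
Tang

test_unbalance_workers.sql
Description: test_unbalance_workers.sql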
RE: Parallel INSERT (INTO ... SELECT ...)
From: Tsunakawa, Takayuki/綱川 貴之
> the current patch showed nice performance improvement in some (many?) patterns.
>
> So, I think it can be committed in PG 14, when it has addressed the plan cache
> issue that Amit Langote-san posed.

Agreed.

I summarized my test results for the current patch (V18) in the attached file (please use VSCode or Notepad++ to open it, the content is a bit long). As you can see, the performance is good in many patterns (please refer to my test script; the test NO in the text corresponds to the number in the sql file).

If you have any question on my test, please feel free to ask.

Regards,
Tang

max_parallel_workers_per_gather=2
---------------------------------
Test NO | test case                                                                    | query plan             | patched(ms) | master(ms)  | %reg
--------|------------------------------------------------------------------------------|------------------------|-------------|-------------|-----
1       | Test INSERT with underlying query.                                           | parallel INSERT+SELECT | 20894.069   | 27030.623   | -23%
2       | Test INSERT into a table with a foreign key.                                 | parallel SELECT        | 80921.960   | 84416.775   | -4%
3       | Test INSERT with ON CONFLICT ... DO UPDATE                                   | non-parallel           | 28111.186   | 29531.922   | -5%
4       | Test INSERT with parallel_leader_participation = off;                        | parallel INSERT+SELECT | 2799.354    | 5131.874    | -45%
5       | Test INSERT with max_parallel_workers=0;                                     | non-parallel           | 27006.385   | 26962.279   | 0%
6       | Test INSERT with parallelized aggregate                                      | parallel SELECT        | 95482.307   | 105090.301  | -9%
7       | Test INSERT with parallel bitmap heap scan                                   | parallel INSERT+SELECT | 606.664     | 844.471     | -28%
8       | Test INSERT with parallel append                                             | parallel INSERT+SELECT | 109.896     | 239.198     | -54%
9       | Test INSERT with parallel index scan                                         | parallel INSERT+SELECT | 552.481     | 1238.079    | -55%
10      | Test INSERT with parallel-safe index expression                              | parallel INSERT+SELECT | 1453.686    | 2908.489    | -50%
11      | Test INSERT with parallel-unsafe index expression                            | non-parallel           | 2828.559    | 2837.570    | 0%
12      | Test INSERT with underlying query - and RETURNING (no projection)            | parallel INSERT+SELECT | 417.493     | 685.430     | -39%
13      | Test INSERT with underlying ordered query - and RETURNING (no projection)    | parallel SELECT        | 971.095     | 1717.502    | -43%
14      | Test INSERT with underlying ordered query - and RETURNING (with projection)  | parallel SELECT        | 1002.287    |             |
RE: row filtering for logical replication
On Thursday, December 2, 2021 5:21 AM Peter Smith wrote:
> PSA the v44* set of patches.

Thanks for the new patch. A few comments:

1. This is an example in the publication doc, but in fact it's not allowed. Should we change this example?

+CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);

postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
ERROR:  invalid publication WHERE expression for relation "departments"
HINT:  only simple expressions using columns, constants and immutable system functions are allowed

2. A typo in the 0002 patch.

+ * drops such a user-defnition or if there is any other error via its function,

"user-defnition" should be "user-definition".

Regards,
Tang
RE: Alter all tables in schema owner fix
On Friday, December 3, 2021 1:31 PM vignesh C wrote:
> On Fri, Dec 3, 2021 at 9:53 AM Greg Nancarrow wrote:
> > On Fri, Dec 3, 2021 at 2:06 PM vignesh C wrote:
> > > Currently while changing the owner of ALL TABLES IN SCHEMA
> > > publication, it is not checked if the new owner has superuser
> > > permission or not. Added a check to throw an error if the new owner
> > > does not have superuser permission.
> > > Attached patch has the changes for the same. Thoughts?
> >
> > It looks OK to me, but just two things:
> >
> > 1) Isn't it better to name "CheckSchemaPublication" as
> > "IsSchemaPublication", since it has a boolean return and also
> > typically CheckXXX type functions normally do checking and error-out
> > if they find a problem.
>
> Modified
>
> > 2) Since superuser_arg() caches previous input arg (last_roleid) and
> > has a fast-exit, and has been called immediately before for the FOR
> > ALL TABLES case, it would be better to write:
> >
> > + if (CheckSchemaPublication(form->oid) && !superuser_arg(newOwnerId))
> >
> > as:
> >
> > + if (!superuser_arg(newOwnerId) && IsSchemaPublication(form->oid))
>
> Modified
>
> Thanks for the comments, the attached v2 patch has the changes for the same.

Thanks for your patch. I tested it and it fixed this problem as expected. It also passed "make check-world".

Regards,
Tang
RE: row filtering for logical replication
On Friday, December 3, 2021 10:09 AM Peter Smith wrote:
> On Thu, Dec 2, 2021 at 2:32 PM [email protected] wrote:
> > On Thursday, December 2, 2021 5:21 AM Peter Smith wrote:
> > > PSA the v44* set of patches.
> >
> > Thanks for the new patch. Few comments:
> >
> > 1. This is an example in publication doc, but in fact it's not allowed.
> > Should we change this example?
> >
> > +CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
> >
> > postgres=# CREATE PUBLICATION active_departments FOR TABLE departments WHERE (active IS TRUE);
> > ERROR:  invalid publication WHERE expression for relation "departments"
> > HINT:  only simple expressions using columns, constants and immutable system functions are allowed
>
> Thanks for finding this. Actually, the documentation looks correct to
> me. The problem was the validation walker of patch 0002 was being
> overly restrictive. It needed to also allow a BooleanTest node.
>
> Now it works (locally) for me. For example.
>
> test_pub=# create table departments(depno int primary key, active boolean);
> CREATE TABLE
> test_pub=# create publication pdept for table departments where
> (active is true) with (publish="insert");
> CREATE PUBLICATION
> test_pub=# create publication pdept2 for table departments where
> (active is false) with (publish="insert");
> CREATE PUBLICATION
>
> This fix will be available in v45*.

Thanks for looking into it. I have another problem with your patch. The document says:

...                                        If the subscription has several publications in
+   which the same table has been published with different filters, those
+   expressions get OR'ed together so that rows satisfying any of the expressions
+   will be replicated. Notice this means if one of the publications has no filter
+   at all then all other filters become redundant.

Then, what if one of the publications is specified as FOR ALL TABLES or FOR ALL TABLES IN SCHEMA?

For example:

create table tbl (a int primary key);
create publication p1 for table tbl where (a > 10);
create publication p2 for all tables;
create subscription sub connection 'dbname=postgres port=5432' publication p1, p2;

I think for the FOR ALL TABLES publication (p2 in my case), table tbl should be treated as having no filter, so table tbl should have no filter in subscription sub. Thoughts?

But for now, the filter (a > 10) is applied both when copying the initial data and when replicating later changes.

To fix it, I think we can check whether the table is published in a FOR ALL TABLES publication or published as part of a schema in function pgoutput_row_filter_init (which was introduced in the v44-0003 patch), and we also need to make some changes in tablesync.c.

Regards
Tang
RE: [PATCH]Comment improvement in publication.sql
On Wednesday, December 8, 2021 1:49 PM, vignesh C wrote:
> The patch no longer applies, could you post a rebased patch.

Thanks for your kind reminder. Attached a rebased patch.

Some changes in the v4 patch have since been fixed by 5a28324, so I deleted those changes.

Regards,
Tang

v5-0001-Fix-comments-in-publication.sql.patch
Description: v5-0001-Fix-comments-in-publication.sql.patch
RE: parallel vacuum comments
On Monday, December 13, 2021 2:12 PM Masahiko Sawada wrote:
> On Mon, Dec 13, 2021 at 2:09 PM Amit Kapila wrote:
> > On Mon, Dec 13, 2021 at 10:33 AM Masahiko Sawada wrote:
> > > On Fri, Dec 10, 2021 at 9:08 PM Amit Kapila wrote:
> > > > On Thu, Dec 9, 2021 at 6:05 PM Masahiko Sawada wrote:
> > > > > On Thu, Dec 9, 2021 at 7:44 PM Amit Kapila wrote:
> > > > >
> > > > > Agreed with the above two points.
> > > > >
> > > > > I've attached updated patches that incorporated the above comments
> > > > > too. Please review them.
> > > >
> > > > I have made the following minor changes to the 0001 patch: (a) An
> > > > assert was removed from dead_items_max_items() which I added back. (b)
> > > > Removed an unnecessary semicolon from one of the statements in
> > > > compute_parallel_vacuum_workers(). (c) Changed comments at a few
> > > > places. (d) moved all parallel_vacuum_* related functions together.
> > > > (e) ran pgindent and slightly modify the commit message.
> > > >
> > > > Let me know what you think of the attached?
> > >
> > > Thank you for updating the patch!
> > >
> > > The patch also moves some functions, e.g., update_index_statistics()
> > > is moved without code changes. I agree to move functions for
> > > consistency but that makes the review hard and the patch complicated.
> > > I think it's better to do improving the parallel vacuum code and
> > > moving functions in separate patches.
> >
> > Okay, I thought it might be better to keep all parallel_vacuum_*
> > related functions together but we can keep that in a separate patch.
> > Feel free to submit without those changes.
>
> I've attached the patch. I've just moved some functions back but not
> done other changes.

Thanks for your patch. I tested it with some cases (like large indexes and different types of indexes) and it worked well.

Besides, I noticed a typo:

+	/* Estimate size for index vacuum stats -- PARALLEL_VACUUM_KEY_STATS */

"PARALLEL_VACUUM_KEY_STATS" should be "PARALLEL_VACUUM_KEY_INDEX_STATS".

Regards,
Tang
RE: row filtering for logical replication
On Wednesday, December 8, 2021 2:29 PM Amit Kapila wrote:
> On Mon, Dec 6, 2021 at 6:04 PM Euler Taveira wrote:
> > On Mon, Dec 6, 2021, at 3:35 AM, Dilip Kumar wrote:
> > > On Mon, Dec 6, 2021 at 6:49 AM Euler Taveira wrote:
> > > > On Fri, Dec 3, 2021, at 8:12 PM, Euler Taveira wrote:
> > > > PS> I will update the commit message in the next version. I barely changed the
> > > > documentation to reflect the current behavior. I probably missed some changes
> > > > but I will fix in the next version.
> > > >
> > > > I realized that I forgot to mention a few things about the UPDATE behavior.
> > > > Regardless of 0003, we need to define which tuple will be used to evaluate the
> > > > row filter for UPDATEs. We already discussed it circa [1]. This current version
> > > > chooses *new* tuple. Is it the best choice?
> > >
> > > But with 0003, we are using both the tuple for evaluating the row
> > > filter, so instead of fixing 0001, why we don't just merge 0003 with
> > > 0001? I mean eventually, 0003 is doing what is the agreed behavior,
> > > i.e. if just OLD is matching the filter then convert the UPDATE to
> > > DELETE OTOH if only new is matching the filter then convert the UPDATE
> > > to INSERT. Do you think that even we merge 0001 and 0003 then also
> > > there is an open issue regarding which row to select for the filter?
> >
> > Maybe I was not clear. IIUC we are still discussing 0003 and I would like to
> > propose a different default based on the conclusion I came up. If we merged
> > 0003, that's fine; this change will be useless. If we don't or it is optional,
> > it still has its merit.
> >
> > Do we want to pay the overhead to evaluating both tuple for UPDATEs? I'm still
> > processing if it is worth it. If you think that in general the row filter
> > contains the primary key and it is rare to change it, it will waste cycles
> > evaluating the same expression twice. It seems this behavior could be
> > controlled by a parameter.
>
> I think the first thing we should do in this regard is to evaluate the
> performance for both cases (when we apply a filter to both tuples vs.
> to one of the tuples). In case the performance difference is
> unacceptable, I think it would be better to still compare both tuples
> as default to avoid data inconsistency issues and have an option to
> allow comparing one of the tuples.

I did some performance tests to see if the 0003 patch has much overhead. I compared applying the first two patches with applying the first three patches, in four cases:

1) only old rows match the filter.
2) only new rows match the filter.
3) both old rows and new rows match the filter.
4) neither old rows nor new rows match the filter.

The 0003 patch checks both old rows and new rows, while without the 0003 patch only either old or new rows are checked. We want to know whether it would take more time if we also check the old rows.

I ran the tests in asynchronous mode and compared the SQL execution time. I also tried some complex filters, to see if the difference could be more obvious.

The result and the script are attached. I didn't see a big difference between the results with and without the 0003 patch in any of the cases, so I think the 0003 patch doesn't have much overhead. A sketch of the four test cases is below.
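For reference, a hedged sketch of what the four cases look like (the attached perf.sh differs in scale and filter complexity), with a row filter (a > 10) on the publisher:

CREATE TABLE tbl (a int PRIMARY KEY);
CREATE PUBLICATION pub FOR TABLE tbl WHERE (a > 10);

-- 1) only the old row matches the filter:
UPDATE tbl SET a = 5 WHERE a = 15;
-- 2) only the new row matches the filter:
UPDATE tbl SET a = 15 WHERE a = 5;
-- 3) both old and new rows match the filter:
UPDATE tbl SET a = 20 WHERE a = 15;
-- 4) neither the old nor the new row matches the filter:
UPDATE tbl SET a = 6 WHERE a = 5;

Regards,
Tang

perf.sh
Description: perf.sh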
RE: row filtering for logical replication
On Monday, December 20, 2021 11:24 AM [email protected] wrote:
> On Wednesday, December 8, 2021 2:29 PM Amit Kapila wrote:
> > I think the first thing we should do in this regard is to evaluate the
> > performance for both cases (when we apply a filter to both tuples vs.
> > to one of the tuples). ...
>
> I did some performance tests to see if the 0003 patch has much overhead.
> I compared applying the first two patches with applying the first three
> patches, in four cases:
> 1) only old rows match the filter.
> 2) only new rows match the filter.
> 3) both old rows and new rows match the filter.
> 4) neither old rows nor new rows match the filter.
>
> I ran the tests in asynchronous mode and compared the SQL execution time. I
> also tried some complex filters, to see if the difference could be more obvious.
>
> The result and the script are attached.
> I didn't see a big difference between the results with and without the 0003
> patch in any of the cases, so I think the 0003 patch doesn't have much overhead.

In the previous test, I ran 3 times and took the average value, which may be affected by performance fluctuations.

So, to make the results more accurate, I ran the tests more times (10 times) and took the average value. The result is attached.

In general, the time difference is within 3.5%, which is in a reasonable performance range, I think.

Regards,
Tang
RE: row filtering for logical replication
On Monday, December 20, 2021 4:47 PM [email protected] wrote:
> On Monday, December 20, 2021 11:24 AM [email protected] wrote:
> > I did some performance tests to see if the 0003 patch has much overhead.
> > I compared applying the first two patches with applying the first three
> > patches, in four cases:
> > 1) only old rows match the filter.
> > 2) only new rows match the filter.
> > 3) both old rows and new rows match the filter.
> > 4) neither old rows nor new rows match the filter.
> > ...
>
> In previous test, I ran 3 times and took the average value, which may be
> affected by performance fluctuations.
>
> So, to make the results more accurate, I tested them more times (10 times) and
> took the average value. The result is attached.
>
> In general, I can see the time difference is within 3.5%, which is in a
> reasonable performance range, I think.

Hi,

I ran tests for various percentages of rows being filtered (based on the v49 patch). The result and the script are attached.

In synchronous mode, with the row filter patch, the fewer rows match the row filter, the less time it took. In the case where all rows match the filter, the row filter patch took about the same time as HEAD.

In asynchronous mode, the time is reduced when the percentage of rows sent is small (<25%); the other cases took about the same time as HEAD.

I think the above results are good. They show that the row filter patch doesn't have much overhead.

Regards,
Tang

Performance_reports.xlsx
Description: Performance_reports.xlsx

perf_various_percentages.sh
Description: perf_various_percentages.sh
RE: row filtering for logical replication
On Mon, Dec 27, 2021 9:16 PM [email protected] wrote:
> On Thur, Dec 23, 2021 4:28 PM Peter Smith wrote:
> > Here is the v54* patch set:
>
> Attach the v55 patch set which add the following testcases in 0003 patch.
> 1. Added a test to cover the case where TOASTed values are not included in the
>    new tuple. Suggested by Euler[1].
>
>    Note: this test is temporarily commented because it would fail without
>    applying another bug fix patch in another thread[2] which log the detoasted
>    value in old value. I have verified locally that the test pass after
>    applying the bug fix patch[2].
>
> 2. Add a test to cover the case that transform the UPDATE into INSERT. Provided
>    by Tang.

Thanks for updating the patches. A few comments:

1) v55-0001

-/*
- * Gets the relations based on the publication partition option for a specified
- * relation.
- */
 List *
 GetPubPartitionOptionRelations(List *result, PublicationPartOpt pub_partopt,
                                Oid relid)

Do we need this change?

2) v55-0002

 * Multiple ExprState entries might be used if there are multiple
 * publications for a single table. Different publication actions don't
 * allow multiple expressions to always be combined into one, so there is
-* one ExprSTate per publication action. Only 3 publication actions are
+* one ExprState per publication action. Only 3 publication actions are
 * used for row filtering ("insert", "update", "delete"). The exprstate
 * array is indexed by ReorderBufferChangeType.
 */

I think this change can be merged into the 0001 patch.

3) v55-0002

+static bool pgoutput_row_filter_update_check(enum ReorderBufferChangeType changetype, Relation relation,
+                                             HeapTuple oldtuple, HeapTuple newtuple,
+                                             RelationSyncEntry *entry, ReorderBufferChangeType *action);

Do we need the parameter changetype here? I think it could only be REORDER_BUFFER_CHANGE_UPDATE.

Regards,
Tang
RE: Failed transaction statistics to measure the logical replication progress
On Wednesday, December 22, 2021 6:14 PM [email protected] wrote:
> Attached the new patch v19.

Thanks for your patch. I think it's better if you could add this patch to the commitfest.

Here are some comments:

1)
+       commit_count bigint
+
+       Number of transactions successfully applied in this subscription.
+       Both COMMIT and COMMIT PREPARED increment this counter.
...

I think the commands (like COMMIT, COMMIT PREPARED ...) can be surrounded with "<command>" tags, thoughts?

2)
+extern void pgstat_report_subworker_xact_end(LogicalRepWorker *repWorker,
+                                             LogicalRepMsgType command,
+                                             bool bforce);

Should "bforce" be "force"?

3)
+ * This should be called before the call of process_syning_tables() so to not

"process_syning_tables()" should be "process_syncing_tables()".

4)
+void
+pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool force)
+{
+	static TimestampTz last_report = 0;
+	PgStat_MsgSubWorkerXactEnd msg;
+
+	if (!force)
+	{
...
+		if (!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
+			return;
+		last_report = now;
+	}
...
+	if (repWorker->commit_count == 0 && repWorker->abort_count == 0)
+		return;
...

I think it's better to check commit_count and abort_count first, and then check whether PGSTAT_STAT_INTERVAL has been reached. Otherwise, when commit_count and abort_count are 0, last_report may be updated even though no stats are actually sent; in that case, last_report is no longer the time at which the last message was really sent. A sketch of the suggested ordering is below.
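A minimal sketch of the suggested ordering (illustration only, not the patch itself): bail out while there is nothing to send, so that last_report is only advanced when a message can actually go out.

void
pgstat_send_subworker_xact_stats(LogicalRepWorker *repWorker, bool force)
{
	static TimestampTz last_report = 0;

	/* Nothing to send yet, so don't touch last_report. */
	if (repWorker->commit_count == 0 && repWorker->abort_count == 0)
		return;

	if (!force)
	{
		TimestampTz now = GetCurrentTimestamp();

		if (!TimestampDifferenceExceeds(last_report, now, PGSTAT_STAT_INTERVAL))
			return;
		last_report = now;
	}

	/* ... fill in and send the PgStat_MsgSubWorkerXactEnd message ... */
}

Regards,
Tang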
RE: [PATCH]Comment improvement in publication.sql
On Monday, December 13, 2021 11:53 PM vignesh C wrote:
> On Wed, Dec 8, 2021 at 11:07 AM [email protected] wrote:
> > On Wednesday, December 8, 2021 1:49 PM, vignesh C wrote:
> > > The patch no longer applies, could you post a rebased patch.
> >
> > Thanks for your kindly reminder. Attached a rebased patch.
> > Some changes in v4 patch has been fixed by 5a28324, so I deleted those changes.
>
> Thanks for the updated patch, should we make a similar change in the
> AlterPublicationTables function header to mention Set too:
>
> /*
>  * Add or remove table to/from publication.
>  */
> static void
> AlterPublicationTables(AlterPublicationStmt *stmt, HeapTuple tup,
>                        List *tables, List *schemaidlist)

Sorry for the late reply.

I am not sure we need this change, because the way SET handles the tables in a publication is to drop the tables and then add tables, instead of directly replacing the list of tables in the publication. (We can see this in the AlterPublicationTables function.) Thoughts?

Regards,
Tang
RE: Optionally automatically disable logical replication subscriptions on error
On Wednesday, January 5, 2022 8:53 PM [email protected] wrote:
> On Tuesday, December 28, 2021 11:53 AM Wang, Wei/王 威 wrote:
> > On Thursday, December 16, 2021 8:51 PM [email protected] wrote:
> > > Attached the updated patch v14.
> >
> > A comment to the timing of printing a log:
>
> Thank you for your review!
>
> > After the log[1] was printed, I altered subscription's option
> > (DISABLE_ON_ERROR) from true to false before invoking
> > DisableSubscriptionOnError to disable subscription. Subscription was not
> > disabled.
> > [1] "LOG:  logical replication subscription "sub1" will be disabled due to an error"
> >
> > I found this log is printed in function WorkerErrorRecovery:
> > +		ereport(LOG,
> > +				errmsg("logical replication subscription \"%s\" will be disabled due to an error",
> > +					   MySubscription->name));
> > This log is printed here, but in DisableSubscriptionOnError, there is a check to
> > confirm subscription's disableonerr field. If disableonerr is found changed from
> > true to false in DisableSubscriptionOnError, subscription will not be disabled.
> >
> > In this case, "disable subscription" is printed, but subscription will not be
> > disabled actually.
> > I think it is a little confused to user, so what about moving this message after
> > the check which is mentioned above in DisableSubscriptionOnError?
>
> Makes sense. I moved the log print after
> the check of the necessity to disable the subscription.
>
> Also, I've scrutinized and refined the new TAP test as well for refactoring.
> As a result, I fixed wait_for_subscriptions()
> so that some extra codes that can be simplified,
> such as escaped variable and one part of WHERE clause, are removed.
> Other change I did is to replace two calls of wait_for_subscriptions()
> with polling_query_until() for the subscriber, in order to
> make the tests better and more suitable for the test purposes.
> Again, for this refinement, I've conducted a tight loop test
> to check no timing issue and found no problem.

Thanks for updating the patch. Here are some comments:

1)
+	/*
+	 * We would not be here unless this subscription's disableonerr field was
+	 * true when our worker began applying changes, but check whether that
+	 * field has changed in the interim.
+	 */
+	if (!subform->subdisableonerr)
+	{
+		/*
+		 * Disabling the subscription has been done already. No need of
+		 * additional work.
+		 */
+		heap_freetuple(tup);
+		table_close(rel, RowExclusiveLock);
+		CommitTransactionCommand();
+		return;
+	}

I don't understand what "Disabling the subscription has been done already" means; I think we only get here when subdisableonerr was changed in the interim. Should we modify this comment? Or remove it, because there is already an explanation just above it.

2)
+	/* Set the subscription to disabled, and note the reason. */
+	values[Anum_pg_subscription_subenabled - 1] = BoolGetDatum(false);
+	replaces[Anum_pg_subscription_subenabled - 1] = true;

I didn't see the code corresponding to "note the reason". Should we modify the comment?

3)
+	bool		disableonerr;	/* Whether errors automatically disable */

This comment is hard to understand. Maybe it could be changed to:
Indicates whether the subscription should be automatically disabled when subscription workers detect any errors.

Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Thursday, January 6, 2022 11:57 PM, Tom Lane wrote:
> Peter Eisentraut writes:
> > So the ordering of the suggested completions is different. I don't know
> > offhand how that ordering is determined. Perhaps it's dependent on
> > locale, readline version, or operating system. In any case, we need to
> > figure this out to make this test stable.
>
> I don't think we want to get into the business of trying to make that
> consistent across different readline/libedit versions. How about
> adjusting the test case so that only one enum value is to be printed?

Thanks for your suggestion. Agreed. Fixed the test case to show only one enum value.

Regards,
Tang

v10-0001-Support-tab-completion-with-a-query-result-for-u.patch
Description: v10-0001-Support-tab-completion-with-a-query-result-for-u.patch
RE: Support tab completion for upper character inputs in psql
On Friday, January 7, 2022 1:08 PM, Japin Li wrote:
> +/*
> + * pg_string_tolower - Fold a string to lower case if the string is not quoted
> + * and only contains ASCII characters.
> + * For German/Turkish etc text, no change will be made.
> + *
> + * The returned value has to be freed.
> + */
> +static char *
> +pg_string_tolower_if_ascii(const char *text)
> +{
>
> s/pg_string_tolower/pg_string_tolower_if_ascii/ for comments.
>
Thanks for your review.
Comment fixed in the attached V11 patch.
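
For reference, a minimal sketch of the behavior the comment describes (illustration only; the patch's actual implementation may differ): fold to lower case only when the input is unquoted pure ASCII, otherwise return an unchanged copy.

static char *
pg_string_tolower_if_ascii(const char *text)
{
	char	   *result = pg_strdup(text);

	/* Leave quoted identifiers and non-ASCII (e.g. German/Turkish) text alone. */
	for (const char *p = text; *p; p++)
	{
		if (*p == '"' || (unsigned char) *p > 127)
			return result;
	}

	for (char *p = result; *p; p++)
		*p = pg_tolower((unsigned char) *p);

	return result;
}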
Regards,
Tang
v11-0001-Support-tab-completion-with-a-query-result-for-u.patch
Description: v11-0001-Support-tab-completion-with-a-query-result-for-u.patch
RE: row filtering for logical replication
On Tuesday, January 11, 2022 10:16 AM [email protected] wrote:
> Attach the v62 patch set which address the above comments and slightly
> adjust the commit message in 0002 patch.

I saw a possible problem in the row-filter tablesync SQL, related to partitioned tables.

If a parent table is published with publish_via_partition_root off, its child table should be treated as having no row filter when combining the row filters with OR. But with the current SQL, this publication is ignored. For example:

create table parent (a int) partition by range (a);
create table child partition of parent default;
create publication puba for table parent with (publish_via_partition_root=false);
create publication pubb for table child where(a>10);

Using the current SQL in the patch (table child has OID 16387):

SELECT DISTINCT pg_get_expr(prqual, prrelid)
FROM pg_publication p
INNER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid)
WHERE pr.prrelid = 16387 AND p.pubname IN ( 'puba', 'pubb' )
  AND NOT (select bool_or(puballtables) FROM pg_publication WHERE pubname in ( 'puba', 'pubb' ))
  AND NOT EXISTS (SELECT 1
                  FROM pg_publication_namespace pn, pg_class c, pg_publication p
                  WHERE c.oid = 16387 AND c.relnamespace = pn.pnnspid AND p.oid = pn.pnpubid AND p.pubname IN ( 'puba', 'pubb' ));

 pg_get_expr
-------------
 (a > 10)
(1 row)

I think there should be no filter in this case, because "puba" publishes table child without any row filter. Thoughts?

To fix this problem, we could use the pg_get_publication_tables function in the tablesync SQL to filter which publications the table belongs to. How about the following SQL? It returns NULL for "puba":

SELECT DISTINCT pg_get_expr(pr.prqual, pr.prrelid)
FROM pg_publication p
LEFT OUTER JOIN pg_publication_rel pr ON (p.oid = pr.prpubid AND pr.prrelid = 16387),
LATERAL pg_get_publication_tables(p.pubname) GPT
WHERE GPT.relid = 16387 AND p.pubname IN ( 'puba', 'pubb' );

 pg_get_expr
-------------

 (a > 10)
(2 rows)

Regards,
Tang
[PATCH] Add tab completion for foreign table
Hi

Attached a patch to improve the tab completion for foreign tables.

Also modified some doc descriptions of ALTER TABLE at [1] to be in accordance with CREATE INDEX at [2]:
In [1], we use "ALTER INDEX ATTACH PARTITION".
In [2], we use "ALTER INDEX ... ATTACH PARTITION".
I think the format in [2] is better.

[1] https://www.postgresql.org/docs/devel/sql-altertable.html
[2] https://www.postgresql.org/docs/devel/sql-createindex.html

Regards,
Tang

v1-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch
Description: v1-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch
RE: Skipping logical replication transactions on subscriber side
On Wed, Jan 12, 2022 2:02 PM Masahiko Sawada wrote:
>
> I've attached an updated patch that incorporated all comments I got so far.
>
Thanks for updating the patch. Here are some comments:
1)
+ Skip applying changes of the particular transaction. If incoming data
Should "Skip" be "Skips" ?
2)
+ prepared by enabling two_phase on susbscriber. After
h
+ the logical replication successfully skips the transaction, the transaction
The "h" after word "After" seems redundant.
3)
+ Skipping the whole transaction includes skipping the cahnge that may not violate
"cahnge" should be "changes" I think.
4)
+/*
+ * True if we are skipping all data modification changes (INSERT, UPDATE, etc.) of
+ * the specified transaction at MySubscription->skipxid. Once we start skipping
...
+ */
+static TransactionId skipping_xid = InvalidTransactionId;
+#define is_skipping_changes()	(TransactionIdIsValid(skipping_xid))
Maybe we should modify this comment. Something like:
skipping_xid is valid if we are skipping all data modification changes ...
5)
+		if (!superuser())
+			ereport(ERROR,
+					(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+					 errmsg("must be superuser to set %s", "skip_xid")));
Should we change the message to "must be superuser to skip xid"?
Because the SQL stmt is "ALTER SUBSCRIPTION ... SKIP (xid = XXX)".
Regards,
Tang
RE: [PATCH] Add tab completion for foreign table
On Thursday, January 13, 2022 12:38 PM, Fujii Masao wrote:
> Isn't it better to tab-complete not only "PARTITION OF" but also "(" for
> CREATE FOREIGN TABLE?
Thanks for your review. Left bracket completion added.
> IMO it's better to make the docs changes in separate patch because they are
> not directly related to the improvement of tab-completion.
Agreed. The former patch was divided into two:
0001 patch, added tab completion for foreign table.
0002 patch, modified some doc description.
Regards,
Tang
v-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch
Description: v-0001-add-tab-completion-for-PARTITION-OF-when-creating.patch
v2-0002-Modify-doc-description-for-ALTER-TABLE-in-accordi.patch
Description: v2-0002-Modify-doc-description-for-ALTER-TABLE-in-accordi.patch
RE: Column Filtering in Logical Replication
On Friday, January 14, 2022 7:52 PM Amit Kapila wrote:
> On Wed, Jan 12, 2022 at 2:40 AM Justin Pryzby wrote:
> > Is there any coordination between the "column filter" patch and the "row
> > filter" patch? Are they both on track for PG15? Has anybody run them
> > together?
>
> The few things where I think we might need to define some common
> behavior are as follows:

I tried some cases related to the points you mentioned, which can be taken as reference.

> 1. Replica Identity handling: Currently the column filter patch gives
> an error during create/alter subscription if the specified column list
> is invalid (Replica Identity columns are missing). It also gives an
> error if the user tries to change the replica identity. However, it
> doesn't deal with cases where the user drops and adds a different
> primary key that has a different set of columns which can lead to
> failure during apply on the subscriber.

An example of this scenario:

-- publisher --
create table tbl(a int primary key, b int);
create publication pub for table tbl(a);
alter table tbl drop CONSTRAINT tbl_pkey;
alter table tbl add primary key (b);

-- subscriber --
create table tbl(a int, b int);
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
insert into tbl values (1,1);

-- subscriber --
postgres=# select * from tbl;
 a | b
---+---
 1 |
(1 row)

update tbl set b=1 where a=1;
alter table tbl add primary key (b);

-- publisher --
delete from tbl;

The subscriber reported the following error message and the DELETE failed on the subscriber:

ERROR:  publisher did not send replica identity column expected by the logical replication target relation "public.tbl"
CONTEXT:  processing remote data during "DELETE" for replication target relation "public.tbl" in transaction 723 at 2022-01-14 13:11:51.514261+08

-- subscriber --
postgres=# select * from tbl;
 a | b
---+---
 1 | 1
(1 row)

> I think another issue w.r.t column filter patch is that even while
> creating publication (even for 'insert' publications) it should check
> that all primary key columns must be part of published columns,
> otherwise, it can fail while applying on subscriber as it will try to
> insert NULL for the primary key column.

For example:

-- publisher --
create table tbl(a int primary key, b int);
create publication pub for table tbl(a);
alter table tbl drop CONSTRAINT tbl_pkey;
alter table tbl add primary key (b);

-- subscriber --
create table tbl(a int, b int primary key);
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
insert into tbl values (1,1);

The subscriber reported the following error message and the INSERT failed on the subscriber:

ERROR:  null value in column "b" of relation "tbl" violates not-null constraint
DETAIL:  Failing row contains (1, null).

-- subscriber --
postgres=# select * from tbl;
 a | b
---+---
(0 rows)

> 2. Handling of partitioned tables vs. Replica Identity (RI): When
> adding a partitioned table with a column list to the publication (with
> publish_via_partition_root = false), we should check the Replica
> Identity of all its leaf partition as the RI on the partition is the
> one actually takes effect when publishing DML changes. We need to
> check RI while attaching the partition as well, as the newly added
> partitions will automatically become part of publication if the
> partitioned table is part of the publication. If we don't do this the
> later deletes/updates can fail.

Please see the following 3 cases about partitions.

Case 1 (publish a parent table which has a partition table):

-- publisher --
create table parent (a int, b int) partition by range (a);
create table child partition of parent default;
create unique INDEX ON child (a,b);
alter table child alter a set not null;
alter table child alter b set not null;
alter table child replica identity using INDEX child_a_b_idx;
create publication pub for table parent(a) with(publish_via_partition_root=false);

-- subscriber --
create table parent (a int, b int) partition by range (a);
create table child partition of parent default;
create subscription sub connection 'port=5432 dbname=postgres' publication pub;

-- publisher --
insert into parent values (1,1);

-- subscriber --
postgres=# select * from parent;
 a | b
---+---
 1 |
(1 row)

-- add RI on the subscriber to avoid other errors
update child set b=1 where a=1;
create unique INDEX ON child (a,b);
alter table child alter a set not null;
alter table child alter b set not null;
alter table child replica identity using INDEX child_a_b_idx;

-- publisher --
delete from parent;

The subscriber reported the following error message and the DELETE failed on the subscriber:

ERROR:  publisher did not send replica identity column expected by the logical replication target relation "public.child"
CONTEXT:  processing remote data during "DELETE" for replication target relation "public.child" in transaction 727
RE: Parallel Inserts in CREATE TABLE AS
From: Bharath Rupireddy
> I analyzed performance of parallel inserts in CTAS for different cases
> with tuple size 32bytes, 59bytes, 241bytes and 1064bytes. We could
> gain if the tuple sizes are lower. But if the tuple size is larger
> i.e. 1064bytes, there's a regression with parallel inserts.

Thanks for the update.

BTW, maybe you have some more testcases that can reproduce this regression easily. Could you please share some of those testcases (with big tuple size) with me? A sketch of the kind of case I mean is below.
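For instance, a hypothetical sketch of a big-tuple CTAS case (assuming roughly 1064-byte tuples padded out with a wide text column; not your actual testcase):

CREATE TABLE src (a int, payload text);
INSERT INTO src
SELECT i, repeat('x', 1024)
FROM generate_series(1, 10000000) i;

SET max_parallel_workers_per_gather = 2;
EXPLAIN ANALYZE
CREATE TABLE dst AS SELECT * FROM src;

Regards,
Tang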
RE: [HACKERS] logical decoding of two-phase transactions
On Sunday, March 21, 2021 4:37 PM Amit Kapila wrote:
> I have further updated the patch to implement unique GID on the
> subscriber-side as discussed in the nearby thread [1].

I did some tests (cross-version & synchronous) on the latest patch set v65*; all tests passed. Here are the details, please take them as a reference.

Case | publisher version | subscriber version | two_phase option | synchronous | expected result | result
-----|-------------------|--------------------|------------------|-------------|-----------------|-------
1    | 13                | 14 (patched)       | on               | no          | same as case 3  | ok
2    | 13                | 14 (patched)       | off              | no          | same as case 3  | ok
3    | 13                | 14 (unpatched)     | not supported    | no          | -               | -
4    | 14 (patched)      | 13                 | not supported    | no          | same as case 5  | ok
5    | 14 (unpatched)    | 13                 | not supported    | no          | -               | -
6    | 13                | 14 (patched)       | on               | yes         | same as case 8  | ok
7    | 13                | 14 (patched)       | off              | yes         | same as case 8  | ok
8    | 13                | 14 (unpatched)     | not supported    | yes         | -               | -
9    | 14 (patched)      | 13                 | not supported    | yes         | same as case 10 | ok
10   | 14 (unpatched)    | 13                 | not supported    | yes         | -               | -

Remarks:
(1) Cases 3, 5, 8 and 10 were tested just for reference.
(2) SQL executed in each case:
    scenario 1: begin ... commit
    scenario 2: begin ... prepare ... commit

Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Tuesday, March 16, 2021 5:20 AM, Peter Eisentraut wrote:
> The cases that complete with a query
> result are not case insensitive right now. This affects things like
>
> UPDATE T
>
> as well. I think your first patch was basically right. But we need to
> understand that this affects all completions with query results, not
> just the one you wanted to fix. So you should analyze all the callers
> and explain why the proposed change is appropriate.

Thanks for your review and suggestion.

Please find attached patch V3, which is based on the first patch [1]. The difference from the first patch is that it adds tab completion support for all query results in psql:

complete_from_query
+complete_from_versioned_query
+complete_from_schema_query
+complete_from_versioned_schema_query

[1] https://www.postgresql.org/message-id/a63cbd45e3884cf9b3961c2a6a95dcb7%40G08CNEXMBPEKD05.g08.fujitsu.local

The modification to support case-insensitive matching in "_complete_from_query" is based on "complete_from_const" and "complete_from_list"; a rough sketch of the idea follows.
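This is illustration only, not the patch itself: compare the user's input against each candidate ignoring case, and on a match return the candidate in its original case.

static const char *
match_candidate(const char *text, const char *candidate)
{
	if (pg_strncasecmp(text, candidate, strlen(text)) == 0)
		return candidate;	/* the completion keeps the candidate's case */
	return NULL;
}

Please let me know if you find anything insufficient.

Regards,
Tang

V3-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Description: V3-0001-Support-tab-completion-with-a-query-result-for-upper.patch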
RE: Logical Replication vs. 2PC
On Sunday, March 21, 2021 4:40 PM Amit Kapila wrote:
> I have enhanced the patch for 2PC implementation on the
> subscriber-side as per the solution discussed here [1].

FYI, I confirmed the solution for the unique GID problem raised at [1]. The problem in the V61 patches at [2] is fixed in the latest V66 patches at [3].

BTW, the failure log from the V61 patches is attached; please take it as a reference. The test steps are the same as Amit described at [1].

[1] - https://www.postgresql.org/message-id/CAA4eK1+opiV4aFTmWWUF9h_32=hfpow9vzashart0ua5obr...@mail.gmail.com
[2] - https://www.postgresql.org/message-id/CAHut%2BPv3X7YH_nDEjH1ZJf5U6M6DHHtEjevu7PY5Dv5071jQ4A%40mail.gmail.com
[3] - https://www.postgresql.org/message-id/CAA4eK1JPEoYAkggmLqbdD%2BcF%3DkWNpLkZb_wJ8eqj0QD2AjBTBA%40mail.gmail.com

Regards, Tang

subscriber.log
Description: subscriber.log
Copyright update for nbtsearch.c
Hi,
I found some code committed on 2021-01-13 that still carries a 2020 copyright. The attached patch fixes it.

Regards, Tang

0001-Update-copyright-for-2021.patch
Description: 0001-Update-copyright-for-2021.patch
RE: Copyright update for nbtsearch.c
On Tue, Mar 30, 2021 at 11:09:47AM +0800, Julien Rouhaud wrote:
> This is actually gistfuncs.c.

Thanks, you are right. There's a typo in the mail subject; sorry for the confusion.

On Tuesday, March 30, 2021 12:08 PM, Michael Paquier wrote:
> Thanks. If I look at the top of HEAD, it is not the only place. I am
> to blame for some of them like src/common/sha1. Will fix in a couple
> of minutes the whole set.

Thanks. Please feel free to fix this issue.

Regards, Tang
RE: Support tab completion for upper character inputs in psql
On Wednesday, March 31, 2021 4:05 AM, David Zhang wrote:
> 8 postgres=# update tbl SET DATA =
> 9
> 10 postgres=# update TBL SET
> 11
> 12 postgres=#
>
> So, as you can see the difference is between line 8 and 10 in case 2. It
> looks like the lowercase can auto complete more than the uppercase;
> secondly, if you can add some test cases, it would be great.

Thanks for your test. I fixed the bug and added some tests for it. Please find attached the latest patch V4. Differences from v3 are:

* fix an issue reported by Zhang [1], where a scenario was found in which tab completion from a query result still didn't work.
* add some TAP tests.

[1] https://www.postgresql.org/message-id/3140db2a-9808-c470-7e60-de39c431b3ab%40highgo.ca

Regards, Tang

V4-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Description: V4-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Table refer leak in logical replication
Hi,
I met a problem with triggers in logical replication. I created a trigger after inserting data at the subscriber, and there is a warning in the subscriber's log when the trigger fires:
WARNING: relcache reference leak: relation "xxx" not closed.

Example of the procedure:

--publisher--
create table test (a int primary key);
create publication pub for table test;

--subscriber--
create table test (a int primary key);
create subscription sub connection 'dbname=postgres' publication pub;
create function funcA() returns trigger as $$ begin return null; end; $$ language plpgsql;
create trigger my_trig after insert or update or delete on test for each row execute procedure funcA();
alter table test enable replica trigger my_trig;

--publisher--
insert into test values (6);

It seems to be a reference leak issue. Can anyone look into fixing it?

Regards, Tang
RE: Table refer leak in logical replication
On Tuesday, April 6, 2021 2:25 PM Amit Langote wrote:
> While updating the patch to do so, it occurred to me that maybe we
> could move the ExecInitResultRelation() call into
> create_estate_for_relation() too, in the spirit of removing
> duplicative code. See attached updated patch.

Thanks for the fix. The code LGTM. I just confirmed that the problem is solved after applying your patch.

Regards, Tang
Truncate in synchronous logical replication failed
Hi,
I met a problem in synchronous logical replication: the client hangs when running TRUNCATE TABLE at the publisher. Example of the procedure:

--publisher--
create table test (a int primary key);
create publication pub for table test;

--subscriber--
create table test (a int primary key);
create subscription sub connection 'dbname=postgres' publication pub;

Then, set synchronous_standby_names = 'sub' on the publisher, and reload the publisher.

--publisher--
truncate test;

The client of the publisher then waits for a long time. A moment later, the publisher and subscriber report the following errors.

Subscriber log:
2021-04-07 12:13:07.700 CST [3542235] logical replication worker ERROR: terminating logical replication worker due to timeout
2021-04-07 12:13:07.722 CST [3542217] postmaster LOG: background worker "logical replication worker" (PID 3542235) exited with exit code 1
2021-04-07 12:13:07.723 CST [3542357] logical replication worker LOG: logical replication apply worker for subscription "sub" has started
2021-04-07 12:13:07.745 CST [3542357] logical replication worker ERROR: could not start WAL streaming: ERROR: replication slot "sub" is active for PID 3542236

Publisher log:
2021-04-07 12:13:07.745 CST [3542358] walsender ERROR: replication slot "sub" is active for PID 3542236
2021-04-07 12:13:07.745 CST [3542358] walsender STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/169ECE8 (proto_version '2', publication_names '"pub"')

I checked the PG docs, which say "Replication of TRUNCATE commands is supported" [1], so maybe TRUNCATE is not supported in synchronous logical replication? If my understanding is right, maybe the docs could be modified like this. Any thoughts?

Replication of TRUNCATE commands is supported
->
Replication of TRUNCATE commands is supported in asynchronous mode

[1] https://www.postgresql.org/docs/devel/logical-replication-restrictions.html

Regards, Tang
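P.S. One way to apply the publisher setting above (assuming a superuser session; editing postgresql.conf and reloading works just as well):

alter system set synchronous_standby_names = 'sub';
select pg_reload_conf();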
RE: Truncate in synchronous logical replication failed
On Wednesday, April 7, 2021 5:28 PM Amit Kapila wrote:
> Can you please check if the behavior is the same for PG-13? This is
> just to ensure that we have not introduced any bug in PG-14.

Yes, the same failure happens on PG-13, too.

Regards, Tang
RE: Added schema level support for publication.
On Wednesday, October 13, 2021 4:10 PM Greg Nancarrow
wrote:
> Also, I found the following scenario where the data is double-published:
>
> (1) PUB: CREATE PUBLICATION pub FOR TABLE sch1.sale_201901, TABLE
> sch1.sale_201902 WITH (publish_via_partition_root=true);
> (2) SUB: CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres
> host=localhost port=5432' PUBLICATION pub;
> (3) PUB: INSERT INTO sch.sale VALUES('2019-01-01', 'AU', 'cpu', 5),
> ('2019-01-02', 'AU', 'disk', 8);
> (4) SUB: SELECT * FROM sch.sale;
> (5) PUB: ALTER PUBLICATION pub ADD ALL TABLES IN SCHEMA sch;
> (6) SUB: ALTER SUBSCRIPTION sub REFRESH PUBLICATION;
> (7) SUB: SELECT * FROM sch.sale;
>
> sale_date  | country_code | product_sku | units
> ------------+--------------+-------------+-------
>  2019-01-01 | AU           | cpu         |     5
>  2019-01-02 | AU           | disk        |     8
>  2019-01-01 | AU           | cpu         |     5
>  2019-01-02 | AU           | disk        |     8
>
>
I changed your test step in (5) and used the "ADD TABLE" command as below:
ALTER PUBLICATION pub ADD TABLE sch.sale;
I could get the same result on HEAD, so I think this is not a problem related to this patch.
Maybe we can post this issue in a new thread.
Regards
Tang
RE: Added schema level support for publication.
On Monday, October 18, 2021 8:23 PM vignesh C wrote:
>
> Thanks for the comments, the attached v42 patch has the fixes for the same.
Thanks for your new patch.
I tried your patch and found that the permission check for superuser didn't
work.
For example:
postgres=# create role r1;
CREATE ROLE
postgres=# grant all privileges on database postgres to r1;
GRANT
postgres=# set role r1;
SET
postgres=> create schema s1;
CREATE SCHEMA
postgres=> create publication pub for all tables in schema s1;
CREATE PUBLICATION
Role r1 is not a superuser, but it could still create a publication for all tables in a schema successfully. I think this is related to the following change: the list schemaidlist has not been assigned yet at this point, so I think the check should be performed later.
@@ -165,6 +265,12 @@ CreatePublication(ParseState *pstate, CreatePublicationStmt *stmt)
 				(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
 				 errmsg("must be superuser to create FOR ALL TABLES publication")));

+	/* FOR ALL TABLES IN SCHEMA requires superuser */
+	if (list_length(schemaidlist) > 0 && !superuser())
+		ereport(ERROR,
+				errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
+				errmsg("must be superuser to create FOR ALL TABLES IN SCHEMA publication"));
+
 	rel = table_open(PublicationRelationId, RowExclusiveLock);

 	/* Check if name is used */
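For reference, the expected behavior once this check fires correctly, reusing the error text from the hunk above (the second publication name is hypothetical):

postgres=> create publication pub2 for all tables in schema s1;
ERROR:  must be superuser to create FOR ALL TABLES IN SCHEMA publication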
Regards
Tang
RE: Added schema level support for publication.
On Tuesday, October 19, 2021 12:57 PM Amit Kapila wrote: > > On Tue, Oct 19, 2021 at 9:15 AM [email protected] > wrote: > > > > On Monday, October 18, 2021 8:23 PM vignesh C > wrote: > > > > > > Thanks for the comments, the attached v42 patch has the fixes for the > > > same. > > > > Thanks for your new patch. > > > > I tried your patch and found that the permission check for superuser didn't > > work. > > > > For example: > > postgres=# create role r1; > > CREATE ROLE > > postgres=# grant all privileges on database postgres to r1; > > GRANT > > postgres=# set role r1; > > SET > > postgres=> create schema s1; > > CREATE SCHEMA > > postgres=> create publication pub for all tables in schema s1; > > CREATE PUBLICATION > > > > Role r1 is not superuser, but this role could create publication for all > > tables in > schema > > successfully, I think it is related the following change. List schemaidlist > > was > > not assigned yet. I think we should check it later. > > > > It seems this got broken in yesterday's patch version. Do you think it > is a good idea to add a test for this case? > Agreed. Thanks for your suggestion. I tried to add this test to publication.sql, a patch diff file for this test is attached. Regards Tang Topup-permissions-test_diff Description: Topup-permissions-test_diff
RE: Added schema level support for publication.
On Tuesday, October 19, 2021 11:42 PM vignesh C wrote:
> This issue got induced in the v42 version, attached v43 patch has the
> fixes for the same.

Thanks for your new patch. I confirmed that this issue has been fixed. All regression tests passed. I also tested V43 in some other scenarios and found no issues. So the v43 patch LGTM.

Regards
Tang
RE: Skipping logical replication transactions on subscriber side
On Thursday, October 21, 2021 12:59 PM Masahiko Sawada
wrote:
>
> I've attached updated patches. In this version, in addition to the
> review comments I go so far, I've changed the view name from
> pg_stat_subscription_errors to pg_stat_subscription_workers as per the
> discussion on including xact info to the view on another thread[1].
> I’ve also changed related codes accordingly.
>
Thanks for your patch.
I have some minor comments on your 0001 and 0002 patches.
1. For 0001 patch, src/backend/catalog/system_views.sql
+CREATE VIEW pg_stat_subscription_workers AS
+SELECT
+ e.subid,
+ s.subname,
+ e.subrelid,
+ e.relid,
+ e.command,
+ e.xid,
+ e.count,
+ e.error_message,
+ e.last_error_time,
+ e.stats_reset
+FROM (SELECT
+ oid as subid,
...
Some places use tabs; I think it's better to use spaces here, to be consistent
with other places in this file.
2. For 0002 patch, I think we can add some changes to tab-complete.c, maybe
something like this:
diff --git a/src/bin/psql/tab-complete.c b/src/bin/psql/tab-complete.c
index ecae9df8ed..96665f6115 100644
--- a/src/bin/psql/tab-complete.c
+++ b/src/bin/psql/tab-complete.c
@@ -1654,7 +1654,7 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER SUBSCRIPTION <name> */
 	else if (Matches("ALTER", "SUBSCRIPTION", MatchAny))
 		COMPLETE_WITH("CONNECTION", "ENABLE", "DISABLE", "OWNER TO",
-					  "RENAME TO", "REFRESH PUBLICATION", "SET",
+					  "RENAME TO", "REFRESH PUBLICATION", "SET", "RESET",
 					  "ADD PUBLICATION", "DROP PUBLICATION");
 	/* ALTER SUBSCRIPTION <name> REFRESH PUBLICATION */
 	else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
@@ -1670,6 +1670,12 @@ psql_completion(const char *text, int start, int end)
 	/* ALTER SUBSCRIPTION <name> SET ( */
 	else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
 			 TailMatches("SET", "("))
 		COMPLETE_WITH("binary", "slot_name", "streaming", "synchronous_commit");
+	/* ALTER SUBSCRIPTION <name> RESET */
+	else if (Matches("ALTER", "SUBSCRIPTION", MatchAny, "RESET"))
+		COMPLETE_WITH("(");
+	/* ALTER SUBSCRIPTION <name> RESET ( */
+	else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
+			 TailMatches("RESET", "("))
+		COMPLETE_WITH("binary", "streaming", "synchronous_commit");
 	/* ALTER SUBSCRIPTION <name> SET PUBLICATION */
 	else if (HeadMatches("ALTER", "SUBSCRIPTION", MatchAny) &&
 			 TailMatches("SET", "PUBLICATION"))
 	{
Regards
Tang
RE: Added schema level support for publication.
On Friday, October 29, 2021 12:35 PM, Amit Kapila wrote: > > On Thu, Oct 28, 2021 at 9:55 AM vignesh C wrote: > > > > Thanks for committing the patch, please find the remaining patches attached. > > Thanks Hou Zhijie and Greg Nancarrow for sharing a few comments > > offline, I have fixed those in the attached patch. > > > > Pushed the first test case patch. About > v48-0002-Add-new-pg_publication_objects-view-to-display-T, I think it > doesn't display anything for "for all tables" publication. Instead of > selecting from pg_publication_rel, you can use the existing view > pg_publication_tables to solve that problem. > > Having said that, I am not completely sure about the value of this new > view pg_publication_objects which displays all objects of > publications. I see that users might want to see all the objects that > the publication publishes and when we include other objects like > sequences it might be more helpful. > > Sawada-San, others, what do you think? Is it really useful to have such a > view? > > One point to think is if we introduce such a view then how it should > behave for schema objects? Do we need to display only schemas or > additionally all the tables in the schema as well? If you follow the > above suggestion of mine then I think it will display both schemas > published and tables in that schema that will be considered for > publishing. > Personally, if I want to see ALL the published objects in a publication, I would use '\dRp+' command. I think there might not be too much value to have this view. Regards Tang
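P.S. For example, in psql (the publication name here is hypothetical):

\dRp+ pub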
RE: Skipping logical replication transactions on subscriber side
On Friday, October 29, 2021 1:24 PM Masahiko Sawada wrote:
> I've attached a new version patch. Since the syntax of skipping
> transaction id is under the discussion I've attached only the error
> reporting patch for now.

Thanks for your patch. Some comments on the 026_error_report.pl file:

1. For the test_tab_streaming table, the test only checks initial table sync and doesn't check anything related to the new view pg_stat_subscription_workers. Do you want to add more test cases for it?

2. The subscriptions are created with the two_phase option on, but I didn't see any two-phase transactions. Should we add some test cases for two-phase transactions?

3. Errors reported by a table sync worker will be cleaned up when the table sync worker finishes; should we add this case to the test? (After checking the table sync worker's error in the view, delete the data which caused the error, then check the view again after the table sync worker has finished.)

Regards
Tang
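P.S. A rough SQL-level sketch of the flow proposed in comment 3, run on the subscriber (the table and row here are hypothetical; the column names follow the view definition discussed earlier in this thread):

-- the sync worker's error should be visible:
select relid, error_message from pg_stat_subscription_workers;
-- remove the data that caused the error:
delete from test_tab1 where a = 1;
-- after the table sync worker finishes, check whether the entry was cleaned up:
select count(*) from pg_stat_subscription_workers;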
RE: row filtering for logical replication
On Friday, November 5, 2021 1:14 PM, Peter Smith wrote:
> PSA new set of v37* patches.

Thanks for your patch. I have a problem when using it. The documentation for "create publication" in the patch says:

    The WHERE clause should contain only columns that are part of the primary key or be covered by REPLICA IDENTITY otherwise, DELETE operations will not be replicated.

But when I tried the patch, the columns contained in the WHERE clause had to be covered by REPLICA IDENTITY; whether they are part of the primary key didn't matter. (We can see this in Case 4 of publication.sql, too.) So maybe we should modify the document.

Regards
Tang
[BUG]Invalidate relcache when setting REPLICA IDENTITY
Hi,
I think I found a bug related to logical replication (REPLICA IDENTITY, specifically). If I change the REPLICA IDENTITY after creating a publication, DELETE/UPDATE operations won't be replicated as expected. For example:

-- publisher
CREATE TABLE tbl(a int, b int);
ALTER TABLE tbl ALTER COLUMN a SET NOT NULL;
CREATE UNIQUE INDEX idx_a ON tbl(a);
ALTER TABLE tbl ALTER COLUMN b SET NOT NULL;
CREATE UNIQUE INDEX idx_b ON tbl(b);
ALTER TABLE tbl REPLICA IDENTITY USING INDEX idx_a;
CREATE PUBLICATION pub FOR TABLE tbl;
ALTER TABLE tbl REPLICA IDENTITY USING INDEX idx_b;
INSERT INTO tbl VALUES (1,1);

-- subscriber
CREATE TABLE tbl(a int, b int);
ALTER TABLE tbl ALTER COLUMN b SET NOT NULL;
CREATE UNIQUE INDEX idx_b ON tbl(b);
ALTER TABLE tbl REPLICA IDENTITY USING INDEX idx_b;
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres port=5432' PUBLICATION pub;
SELECT * FROM tbl;

-- publisher
postgres=# UPDATE tbl SET a=-a;
UPDATE 1
postgres=# SELECT * FROM tbl;
 a  | b
----+---
 -1 | 1
(1 row)

-- subscriber
postgres=# SELECT * FROM tbl;
 a | b
---+---
 1 | 1
(1 row)

(a in the subscriber should be -1.)

But if I restart the psql client before executing the UPDATE on the publisher, it works well. So I think the problem is: when using "ALTER TABLE ... REPLICA IDENTITY USING INDEX ...", the relcache is not invalidated.

Attached is a patch to fix it; I also added a test case. (Hou helped me write/review this patch, which is very kind of him.)

FYI: I also confirmed that the same problem can be reproduced on PG10 ~ PG14. (Logical replication was introduced in PG10.) So I think a backpatch is needed here.

Regards
Tang

0001-Invalidate-relcache-when-setting-REPLICA-IDENTITY.patch
Description: 0001-Invalidate-relcache-when-setting-REPLICA-IDENTITY.patch
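P.S. A quick way to see that the catalog was updated while the relcache was not (run on the publisher after the repro above):

select indexrelid::regclass from pg_index
where indrelid = 'tbl'::regclass and indisreplident;
-- returns idx_b, yet decoding still behaves as if idx_a were the replica identity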
RE: Logical replication timeout problem
On Friday, November 12, 2021 2:24 PM Amit Kapila wrote: > > On Thu, Nov 11, 2021 at 11:15 PM Fabrice Chapuis > wrote: > > > > Hello, > > Our lab is ready now. Amit, I compile Postgres 10.18 with your patch.Tang, > > I > used your script to configure logical replication between 2 databases and to > generate 10 million entries in an unreplicated foo table. On a standalone > instance > no error message appears in log. > > I activate the physical replication between 2 nodes, and I got following > > error: > > > > 2021-11-10 10:49:12.297 CET [12126] LOG: attempt to send keep alive > message > > 2021-11-10 10:49:12.297 CET [12126] STATEMENT: START_REPLICATION > 0/300 TIMELINE 1 > > 2021-11-10 10:49:15.127 CET [12064] FATAL: terminating logical replication > worker due to administrator command > > 2021-11-10 10:49:15.127 CET [12036] LOG: worker process: logical > > replication > worker for subscription 16413 (PID 12064) exited with exit code 1 > > 2021-11-10 10:49:15.155 CET [12126] LOG: attempt to send keep alive > message > > > > This message look like strange because no admin command have been executed > during data load. > > I did not find any error related to the timeout. > > The message coming from the modification made with the patch comes back all > the time: attempt to send keep alive message. But there is no "sent keep alive > message". > > > > Why logical replication worker exit when physical replication is configured? > > > > I am also not sure why that happened may be due to > max_worker_processes reaching its limit. This can happen because it > seems you configured both publisher and subscriber in the same > cluster. Tang, did you also see the same problem? > No. I used the default max_worker_processes value, ran logical replication and physical replication at the same time. I also changed the data in table on publisher. But didn't see the same problem. Regards Tang
RE: row filtering for logical replication
On Wednesday, November 10, 2021 7:46 AM Peter Smith wrote: > > On Tue, Nov 9, 2021 at 2:03 PM [email protected] > wrote: > > > > On Friday, November 5, 2021 1:14 PM, Peter Smith > wrote: > > > > > > PSA new set of v37* patches. > > > > > > > Thanks for your patch. I have a problem when using this patch. > > > > The document about "create publication" in patch says: > > > >The WHERE clause should contain only columns that are > >part of the primary key or be covered by REPLICA > >IDENTITY otherwise, DELETE operations will > not > >be replicated. > > > > But I tried this patch, the columns which could be contained in WHERE clause > must be > > covered by REPLICA IDENTITY, but it doesn't matter if they are part of the > primary key. > > (We can see it in Case 4 of publication.sql, too.) So maybe we should > > modify the > document. > > > > PG Docs is changed in v38-0004 [1]. Please check if it is OK. > Thanks, this change looks good to me. Regards Tang
RE: row filtering for logical replication
On Friday, November 12, 2021 6:20 PM Ajin Cherian wrote:
> Attaching version 39-

Thanks for the new patch. I met a problem when using "ALTER PUBLICATION ... SET TABLE ... WHERE ...": the publisher crashed after executing this statement. Here is some information about the problem.

Steps to reproduce:

-- publisher
create table t(a int primary key, b int);
create publication pub for table t where (a>5);

-- subscriber
create table t(a int primary key, b int);
create subscription sub connection 'dbname=postgres port=5432' publication pub;

-- publisher
insert into t values (1, 2);
alter publication pub set table t where (a>7);

Publisher log:
2021-11-15 13:36:54.997 CST [3319891] LOG: logical decoding found consistent point at 0/15208B8
2021-11-15 13:36:54.997 CST [3319891] DETAIL: There are no running transactions.
2021-11-15 13:36:54.997 CST [3319891] STATEMENT: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
double free or corruption (out)
2021-11-15 13:36:55.072 CST [3319746] LOG: received fast shutdown request
2021-11-15 13:36:55.073 CST [3319746] LOG: aborting any active transactions
2021-11-15 13:36:55.105 CST [3319746] LOG: background worker "logical replication launcher" (PID 3319874) exited with exit code 1
2021-11-15 13:36:55.105 CST [3319869] LOG: shutting down
2021-11-15 13:36:55.554 CST [3319746] LOG: server process (PID 3319891) was terminated by signal 6: Aborted
2021-11-15 13:36:55.554 CST [3319746] DETAIL: Failed process was running: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', publication_names '"pub"')
2021-11-15 13:36:55.554 CST [3319746] LOG: terminating any other active server processes

The backtrace is attached below. I think the problem may be related to the following change in the 0003 patch:

+ free(entry->exprstate);

Regards
Tang

(gdb) bt
#0 0x7f132335970f in raise () from /lib64/libc.so.6
#1 0x7f1323343b25 in abort () from /lib64/libc.so.6
#2 0x7f132339c897 in __libc_message () from /lib64/libc.so.6
#3 0x7f13233a2fdc in malloc_printerr () from /lib64/libc.so.6
#4 0x7f13233a4d80 in _int_free () from /lib64/libc.so.6
#5 0x7f1317d4de6a in rel_sync_cache_relation_cb (arg=, relid=) at pgoutput.c:1884
#6 0x00dd4191 in LocalExecuteInvalidationMessage (msg=msg@entry=0x2f6c7c8) at inval.c:653
#7 0x00accfc2 in ReorderBufferExecuteInvalidations (nmsgs=4, msgs=0x2f6c7a8) at reorderbuffer.c:3261
#8 0x00ad1f6f in ReorderBufferProcessTXN (rb=rb@entry=0x2f6c4a0, txn=0x2f8a478, commit_lsn=commit_lsn@entry=22148728, snapshot_now=, command_id=command_id@entry=0, streaming=streaming@entry=false) at reorderbuffer.c:2333
#9 0x00ad35f6 in ReorderBufferReplay (txn=, rb=rb@entry=0x2f6c4a0, xid=xid@entry=713, commit_lsn=commit_lsn@entry=22148728, end_lsn=end_lsn@entry=22148912, commit_time=commit_time@entry=690256439713242, origin_id=0, origin_lsn=0) at reorderbuffer.c:2622
#10 0x00ad4200 in ReorderBufferCommit (rb=0x2f6c4a0, xid=xid@entry=713, commit_lsn=22148728, end_lsn=22148912, commit_time=commit_time@entry=690256439713242, origin_id=origin_id@entry=0, origin_lsn=0) at reorderbuffer.c:2646
#11 0x00ab1b59 in DecodeCommit (ctx=ctx@entry=0x2f6a490, buf=buf@entry=0x7ffd7dc0a6c0, parsed=parsed@entry=0x7ffd7dc0a550, xid=xid@entry=713, two_phase=) at decode.c:744
#12 0x00ab20c3 in DecodeXactOp (ctx=ctx@entry=0x2f6a490, buf=buf@entry=0x7ffd7dc0a6c0) at decode.c:279
#13 0x00ab3fd9 in LogicalDecodingProcessRecord (ctx=0x2f6a490, record=0x2f6a850) at decode.c:142
#14 0x00b0d23d in XLogSendLogical () at walsender.c:2992
#15 0x00b11ba3 in WalSndLoop (send_data=send_data@entry=0xb0d1d5 ) at walsender.c:2422
#16 0x00b122a5 in StartLogicalReplication (cmd=cmd@entry=0x2f26e38) at walsender.c:1315
#17 0x00b13b18 in exec_replication_command (cmd_string=cmd_string@entry=0x2e8f590 "START_REPLICATION SLOT \"sub\" LOGICAL 0/0 (proto_version '3', publication_names '\"pub\"')") at walsender.c:1762
#18 0x00bbae43 in PostgresMain (dbname=, username=) at postgres.c:4493
#19 0x00a84906 in BackendRun (port=port@entry=0x2eb5580) at postmaster.c:4584
#20 0x00a8ae73 in BackendStartup (port=port@entry=0x2eb5580) at postmaster.c:4312
#21 0x00a8b2b2 in ServerLoop () at postmaster.c:1801
#22 0x00a8e010 in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x2e88b00) at postmaster.c:1473
#23 0x0091cb1a in main (argc=3, argv=0x2e88b00) at main.c:198
RE: row filtering for logical replication
On Friday, November 12, 2021 6:20 PM Ajin Cherian wrote:
>
> Attaching version 39-
>
I met another problem when filtering out with the operator '~'.
Data can't be replicated as expected.
For example:
-- publisher
create table t (a text primary key);
create publication pub for table t where (a ~ 'aaa');
-- subscriber
create table t (a text primary key);
create subscription sub connection 'port=5432' publication pub;
-- publisher
insert into t values ('ab');
insert into t values ('abc');
postgres=# select * from t where (a ~ 'aaa');
  a
-----
 ab
 abc
(2 rows)
-- subscriber
postgres=# select * from t;
  a
-----
 ab
(1 row)
The second record can’t be replicated.
By the way, when only the 0001 patch was applied, I couldn't reproduce this bug.
So I think it is related to the later patches.
Regards
Tang
RE: Skipping logical replication transactions on subscriber side
On Tuesday, November 16, 2021 2:31 PM Masahiko Sawada wrote:
> Right. I've fixed this issue and attached an updated patch.

Thanks for your patch. I read the discussion about stats entries for the table sync worker [1]: the statistics are retained after the table sync worker finishes its job, and users can remove them via the pg_stat_reset_subscription_worker function.

But I noticed that once a table sync worker has finished its job, the error reported by that worker is no longer shown in the pg_stat_subscription_workers view. (This seems to be caused by the condition "WHERE srsubstate <> 'r'".) Is that intentional? I think this means users won't know the statistics still exist, and so won't remove them manually, which is not friendly to users' storage, right?

[1] https://www.postgresql.org/message-id/CAD21AoAT42mhcqeB1jPfRL1%2BEUHbZk8MMY_fBgsyZvJeKNpG%2Bw%40mail.gmail.com

Regards
Tang
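P.S. A sketch of removing such a retained entry manually with the reset function mentioned above (the argument list here is an assumption based on the patch under discussion):

select pg_stat_reset_subscription_worker(subid, relid)
from pg_stat_subscription_workers
where subname = 'sub';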
RE: row filtering for logical replication
On Thursday, November 18, 2021 9:34 AM Peter Smith wrote:
> PSA new set of v40* patches.

I found a problem in v40: the replica identity check for the WHERE clause is not working properly. For example:

postgres=# create table tbl(a int primary key, b int);
CREATE TABLE
postgres=# create publication pub1 for table tbl where (a>10 and b>10);
CREATE PUBLICATION

I think it should report an error, because column b is not part of the replica identity. This seems to be due to the "return true" in the rowfilter_expr_replident_walker function; maybe we should remove it.

Besides, a small comment on the 0004 patch:

+* Multiple row-filter expressions for the same publication will later be
+* combined by the COPY using OR, but this means if any of the filters is

Should we change it to: "Multiple row-filter expressions for the same table ..."?

Regards,
Tang
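P.S. With the walker fixed, the expected behavior for the example above would be something like the following (the error wording here is a sketch borrowed from similar messages in this thread):

postgres=# create publication pub1 for table tbl where (a>10 and b>10);
ERROR:  cannot add relation "tbl" to publication
DETAIL:  Column "b" used in the publication WHERE expression is not part of the replica identity.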
[BUG]Missing REPLICA IDENTITY check when DROP NOT NULL
Hi,
I think I found a problem related to replica identity. According to PG doc at
[1], replica identity includes only columns marked NOT NULL.
But in fact users can accidentally break this rule as follows:
create table tbl (a int not null unique);
alter table tbl replica identity using INDEX tbl_a_key;
alter table tbl alter column a drop not null;
insert into tbl values (null);
As a result, some operations on the newly added null value cause unexpected
failures, as below:
postgres=# delete from tbl;
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
In the log, I can also see an assertion failure when deleting null value:
TRAP: FailedAssertion("!nulls[i]", File: "heapam.c", Line: 8439, PID: 274656)
To solve the above problem, I think it's better to add a check when executing
ALTER COLUMN ... DROP NOT NULL,
and report an error if the column is part of the replica identity.
Attached is a patch that disallows DROP NOT NULL on a column if it's in a REPLICA
IDENTITY index. A test is also added. (A sketch of the expected post-patch behavior follows below.)
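With the patch applied, the DROP NOT NULL above is expected to fail with something like the following (the error wording comes from the attached patch and may still change):

postgres=# alter table tbl alter column a drop not null;
ERROR:  column "a" is in a REPLICA IDENTITY index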
Thanks Hou for helping me write/review this patch.
By the way, replica identity was introduced in PG9.4, so this problem exists in
all supported versions.
[1] https://www.postgresql.org/docs/current/sql-altertable.html
Regards,
Tang
0001-Disallow-DROP-NOT-NULL-for-column-in-the-REPLICA-IDE.patch
Description: 0001-Disallow-DROP-NOT-NULL-for-column-in-the-REPLICA-IDE.patch
RE: [BUG]Missing REPLICA IDENTITY check when DROP NOT NULL
On Wed, Nov 24, 2021 7:29 PM, Michael Paquier wrote: > > On Wed, Nov 24, 2021 at 07:04:51AM +, [email protected] wrote: > > create table tbl (a int not null unique); > > alter table tbl replica identity using INDEX tbl_a_key; > > alter table tbl alter column a drop not null; > > insert into tbl values (null); > > Oops. Yes, that's obviously not good. > > > To solve the above problem, I think it's better to add a check when > > executing ALTER COLUMN DROP NOT NULL, > > and report an error if this column is part of replica identity. > > I'd say that you are right to block the operation. I'll try to play a > bit with this stuff tomorrow. > > > Attaching a patch that disallow DROP NOT NULL on a column if it's in > > a REPLICA IDENTITY index. Also added a test in it. > >if (indexStruct->indkey.values[i] == attnum) >ereport(ERROR, >(errcode(ERRCODE_INVALID_TABLE_DEFINITION), > - errmsg("column \"%s\" is in a primary key", > + errmsg(ngettext("column \"%s\" is in a primary key", > + "column \"%s\" is in a REPLICA IDENTITY > index", > + indexStruct->indisprimary), > colName))); > Using ngettext() looks incorrect to me here as it is used to get the > plural form of a string, so you'd better make these completely > separated instead. Thanks for your comment. I agree with you. I have fixed it and attached v2 patch. Regards, Tang v2-0001-Disallow-DROP-NOT-NULL-on-a-column-in-the-REPLICA.patch Description: v2-0001-Disallow-DROP-NOT-NULL-on-a-column-in-the-REPLICA.patch
RE: Skipping logical replication transactions on subscriber side
On Friday, November 26, 2021 9:30 AM Masahiko Sawada wrote: > > Indeed. Attached an updated patch. Thanks! Thanks for your patch. A small comment: + OID of the relation that the worker is synchronizing; null for the + main apply worker Should we modify it to "OID of the relation that the worker was synchronizing ..."? The rest of the patch LGTM. Regards Tang
RE: row filtering for logical replication
On Thursday, November 25, 2021 11:22 AM Peter Smith wrote:
> Thanks for all the review comments so far! We are endeavouring to keep
> pace with them.
>
> All feedback is being tracked and we will fix and/or reply to everything ASAP.
>
> Meanwhile, PSA the latest set of v42* patches.
>
> This version was mostly a patch restructuring exercise but it also
> addresses some minor review comments in passing.

Thanks for your patch. I have two comments on the documentation in the 0001 patch.

1.
+ New row is used and it contains all columns. A NULL value
+ causes the expression to evaluate to false; avoid using columns without

I don't quite understand the sentence 'A NULL value causes the expression to evaluate to false'. An expression containing a NULL value can also return true. Could you be more specific? For example:

postgres=# select null or true;
 ?column?
----------
 t
(1 row)

2.
+ at all then all other filters become redundant. If the subscriber is a
+ PostgreSQL version before 15 then any row filtering
+ is ignored.

If the subscriber is a PostgreSQL version before 15, it seems row filtering is ignored only when copying initial data; the later changes are not exempt from row filtering. Should we make this clear in the document?

Regards,
Tang
RE: Skipping logical replication transactions on subscriber side
On Mon, Jan 17, 2022 2:18 PM Masahiko Sawada wrote:
>
> I've attached an updated patch. Please review it.
>
Thanks for updating the patch. Few comments:
1)
 	/* Two_phase is only supported in v15 and higher */
 	if (pset.sversion >= 15)
 		appendPQExpBuffer(&buf,
-						  ", subtwophasestate AS \"%s\"\n",
-						  gettext_noop("Two phase commit"));
+						  ", subtwophasestate AS \"%s\"\n"
+						  ", subskipxid AS \"%s\"\n",
+						  gettext_noop("Two phase commit"),
+						  gettext_noop("Skip XID"));

 	appendPQExpBuffer(&buf,
 					  ", subsynccommit AS \"%s\"\n"
I think "skip xid" should be mentioned in the comment. Maybe it could be
changed to:
"Two_phase and skip XID are only supported in v15 and higher"
2) The following two places are not consistent in whether "= value" is surrounded
with square brackets.

+ALTER SUBSCRIPTION name SKIP ( skip_option [= value] [, ... ] )

+SKIP ( skip_option = value [, ... ] )

Should we modify the first place to:

+ALTER SUBSCRIPTION name SKIP ( skip_option = value [, ... ] )

Because currently there is only one skip_option - xid, and a parameter must be
specified when using it. (A minimal usage sketch follows below.)
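A minimal usage sketch under the current patch, with xid as the only available skip_option (the subscription name and XID here are hypothetical):

ALTER SUBSCRIPTION mysub SKIP (xid = 741);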
3)
+	 * Protect subskip_xid of pg_subscription from being concurrently updated
+	 * while clearing it.

"subskip_xid" should be "subskipxid", I think.
4)
+/*
+ * Start skipping changes of the transaction if the given XID matches the
+ * transaction ID specified by skip_xid option.
+ */
The option name was "skip_xid" in the previous version, and it is "xid" in the
latest patch. So should we modify "skip_xid option" to "skip xid option", or
"skip option xid", or something else?
Also the following place has a similar issue:
+ * the subscription if hte user has specified skip_xid. Once we start skipping
Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Sunday, January 16, 2022 3:51 AM, Tom Lane said: > Peter Eisentraut writes: > > The rest of the patch seems ok in principle, since AFAICT enums are the > > only query result in tab-complete.c that are not identifiers and thus > > subject to case issues. > > I spent some time looking at this patch. I'm not very happy with it, > for two reasons: > > 1. The downcasing logic in the patch bears very little resemblance > to the backend's actual downcasing logic, which can be found in > src/backend/parser/scansup.c's downcase_identifier(). Notably, > the patch's restriction to only convert all-ASCII strings seems > indefensible, because that's not how things really work. I fear > we can't always exactly duplicate the backend's behavior, because > it's dependent on the server's locale and encoding; but I think > we should at least get it right in the common case where psql is > using the same locale and encoding as the server. Thanks for your suggestion, I removed ASCII strings check function and added single byte encoding check just like downcase_identifier. Also added PGCLIENTENCODING setting in the test script to make test cases pass. Now the patch supports tab-completion with none-quoted upper characters available when client encoding is in single byte. > 2. I don't think there's been much thought about the larger picture > of what is to be accomplished. Right now, we successfully > tab-complete inputs that are prefixes of the canonical spelling (per > quote_identifier) of the object's name, and don't try at all for > non-canonical spellings. I'm on board with trying to allow some of > the latter but I'm not sure that this patch represents much forward > progress. To be definite about it, suppose we have a DB containing > just two tables whose names start with "m", say mytab and mixedTab. > Then: > > (a) m immediately completes mytab, ignoring mixedTab > > (b) "m immediately completes "mixedTab", ignoring mytab > > (c) "my fails to find anything > > (d) mi fails to find anything > > (e) M fails to find anything > > This patch proposes to improve case (e), but to my taste cases (a) > through (c) are much bigger problems. It'd be nice if (d) worked too > --- that'd require injecting a double-quote where the user had not > typed one, but we already do the equivalent thing with single-quotes > for file names, so why not? (Although after fighting with readline > yesterday to try to get it to handle single-quoted enum labels sanely, > I'm not 100% sure if (d) is possible.) > > Also, even for case (e), what we have with this patch is that it > immediately completes mytab, ignoring mixedTab. Is that what we want? > Another example is that miX fails to find anything, which seems > like a POLA violation given that mY completes to mytab. > > I'm not certain how many of these alternatives can be supported > without introducing ambiguity that wasn't there before (which'd > manifest as failing to complete in cases where the existing code > chooses an alternative just fine). But I really don't like the > existing behavior for (b) and (c) --- I should be able to spell > a name with double quotes if I want, without losing completion > support. You are right, it's more convenient in that way. I haven't thought about it before. By now, the patch suppose: If user needs to type a table with name in upper character, they should input the double quotes by themselves. If the double quote is input by a user, only table name with upper character could be searched. 
I may try to implement it as you expect, but it doesn't seem easy (as you said, without introducing ambiguity that wasn't there before). I'd appreciate it if someone could give me a hint or a hand on this.

> BTW, another thing that maybe we should think about is how this
> interacts with the pattern matching capability in \d and friends.
> If people can tab-complete non-canonical spellings, they might
> expect the same spellings to work in \d. I don't say that this
> patch has to fix that, but we might want to look and be sure we're
> not painting ourselves into a corner (especially since I see
> that we already perform tab-completion in that context).

Yes, agreed. If we solve the previous problem, meta-command tab completion should also be considered.

Regards, Tang

v12-0001-Support-tab-completion-with-a-query-result-for-u.patch
Description: v12-0001-Support-tab-completion-with-a-query-result-for-u.patch
RE: row filtering for logical replication
On Thu, Jan 20, 2022 9:13 AM [email protected] wrote:
> Attach the V68 patch set which addressed the above comments and changes.
> The version patch also fix the error message mentioned by Greg[1]

I found a problem with this patch related to the replica identity check. For example:

-- publisher --
create table tbl (a int);
create publication pub for table tbl where (a>10) with (publish='delete');
insert into tbl values (1);
update tbl set a=a+1;

postgres=# update tbl set a=a+1;
ERROR: cannot update table "tbl"
DETAIL: Column "a" used in the publication WHERE expression is not part of the replica identity.

I think it shouldn't report the error, because the publication doesn't publish UPDATEs. Thoughts?

Regards,
Tang
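P.S. The publication's pubactions can be checked like this, confirming that updates are not published:

postgres=# select pubname, pubupdate, pubdelete from pg_publication;
 pubname | pubupdate | pubdelete
---------+-----------+-----------
 pub     | f         | t
(1 row)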
RE: Skipping logical replication transactions on subscriber side
On Fri, Jan 21, 2022 7:55 PM Amit Kapila wrote:
>
> 2.
> +stop_skipping_changes(bool clear_subskipxid, XLogRecPtr origin_lsn,
> + TimestampTz origin_timestamp)
> +{
> + Assert(is_skipping_changes());
> +
> + ereport(LOG,
> + (errmsg("done skipping logical replication transaction %u",
> + skip_xid)));
>
> Isn't it better to move this LOG at the end of this function? Because
> clear* functions can give an error, so it is better to move it after
> that. I have done that in the attached.
>
+	/* Stop skipping changes */
+	skip_xid = InvalidTransactionId;
+
+	ereport(LOG,
+			(errmsg("done skipping logical replication transaction %u",
+					skip_xid)));

I think we can move the LOG before resetting skip_xid; otherwise skip_xid would
always be 0 in the LOG.
Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Monday, January 24, 2022 6:36 PM, Peter Eisentraut
wrote:
> The way your patch works now is that the case-insensitive behavior you
> are implementing only works if the client encoding is a single-byte
> encoding. This isn't what downcase_identifier() does;
> downcase_identifier() always works for ASCII characters. As it is, this
> patch is nearly useless, since very few people use single-byte client
> encodings anymore. Also, I think it would be highly confusing if the
> tab completion behavior depended on the client encoding in a significant
> way.
Thanks for your review. I misunderstood the logic of downcase_identifier().
I've modified the code to support ASCII character input.
> Also, as I had previously suspected, your patch treats the completion of
> enum labels in a case-insensitive way (since it all goes through
> _complete_from_query()), but enum labels are not case insensitive. You
> can observe this behavior using this test case:
>
> +check_completion("ALTER TYPE enum1 RENAME VALUE 'F\t\t", qr|foo|, "FIXME");
> +
> +clear_line();
Your suspicion is correct. I wasn't aware that enum labels are case sensitive.
I've added this test to the TAP tests.
> You should devise a principled way to communicate to
> _complete_from_query() whether it should do case-sensitive or
> -insensitive completion. We already have COMPLETE_WITH() and
> COMPLETE_WITH_CS() etc. to do this in other cases, so it should be
> straightforward to adapt a similar system.
I added a flag (casesensitive) to _complete_from_query().
The attached patch now passes all the added TAP tests.
Regards,
Tang
v13-0001-Support-tab-completion-with-a-query-result-for-u.patch
Description: v13-0001-Support-tab-completion-with-a-query-result-for-u.patch
RE: Support tab completion for upper character inputs in psql
On Tuesday, January 25, 2022 6:44 PM, Julien Rouhaud wrote:
> Thanks for updating the patch. When you do so, please check and update the
> commitfest entry accordingly to make sure that people knows it's ready for
> review. I'm switching the entry to Needs Review.

Thanks for the reminder. I'll keep an eye on the status and update it as you suggested.

Regards, Tang
RE: Support tab completion for upper character inputs in psql
On Friday, January 28, 2022 5:24 AM, Tom Lane wrote:
> Here's a fleshed-out patch series for this idea.
Thanks for your patch.
I did some tests on it, and here are some cases where I feel we need to confirm
whether the behavior is suitable.
1) postgres=# create table atest(id int, "iD" int, "ID" int);
2) CREATE TABLE
3) postgres=# alter table atest rename i[TAB]
4) id    "iD"
5) postgres=# alter table atest rename I[TAB]
6) id    "iD"
The tab completion at 5) ignored "ID"; is that correct?
7) postgres=# create table "aTest"("myCol" int, mycol int);
8) CREATE TABLE
9) postgres=# alter table a[TAB]
10) ALL IN TABLESPACE atest "aTest"
11) postgres=# alter table aT[TAB] -> atest
I think what we are trying to do is to ease the burden of typing double quotes
for the user. But in line 11), "alter table aT[TAB]" completes to atest, which
makes the completion output of "aTest" at line 10) of no value: if the user
wants to alter table "aTest", they still need to type the double quote manually.
Another thing is the inconsistency of the output.
12) postgres=# alter table atest rename i[TAB]
13) id    "iD"
14) postgres=# alter table atest rename "i[TAB]
15) "id"  "iD"
With the new fix applied, line 15) adds "id" to the output. I think it's good to
keep the '"' the user typed, and it is convenient when using tab completion.
On the other hand, I'm not so comfortable with the output of "iD" in line 13):
if the user doesn't type a double quote, why do we add double quotes to the
output? Could we make the output of 13) look like below?
12) postgres=# alter table atest rename i[TAB]
??) id    iD
Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Saturday, January 29, 2022 1:03 AM, Tom Lane wrote: > "[email protected]" writes: > > I did some tests on it and here are something cases I feel we need to > > confirm > > whether they are suitable. > > > 1) postgres=# create table atest(id int, "iD" int, "ID" int); > > 2) CREATE TABLE > > 3) postgres=# alter table atest rename i[TAB] > > 4) id"iD" > > 5) postgres=# alter table atest rename I[TAB] > > 6) id"iD" > > > The tab completion for 5) ignored "ID", is that correct? > > Perhaps I misunderstood your original complaint, but what I thought > you were unhappy about was that unquoted ID is a legal spelling of > "id" and so I ought to be willing to complete that. These > examples with case variants of the same word are of some interest, > but people aren't really going to create tables with these sorts of > names, so we shouldn't let them drive the design IMO. > > Anyway, the existing behavior for these examples is > > alter table atest rename i --- completes immediately to id > alter table atest rename I --- offers nothing > > It's certainly arguable that the first case is right as-is and we > shouldn't change it. I think that could be handled by tweaking my > patch so that it wouldn't offer completions that start with a quote > unless the input word does. That would also cause I to complete > immediately to id, which is arguably fine. > > > I think what we are trying to do is to ease the burden of typing double > > quote > for user. > > I'm not thinking about it that way at all. To me, the goal is to make > tab completion do something sensible when presented with legal variant > spellings of a word. The two cases where it currently fails to do > that are (1) unquoted input that needs to be downcased, and (2) input > that is quoted when it doesn't strictly need to be. > > To the extent that we can supply a required quote that the user > failed to type, that's fine, but it's not a primary goal of the patch. > Examples like these make me question whether it's even something we > want; it's resulting in extraneous matches that people might find more > annoying than helpful. Now I *think* that these aren't realistic > cases and that in real cases adding quotes will be helpful more often > than not, but it's debatable. > > > One the other hand, I'm not so comfortable with the output of "iD" in line > 13. > > If user doesn't type double quote, why we add double quote to the output? > > That's certainly a valid argument. > > > Could we make the output of 13) like below? > > 12) postgres=# alter table atest rename i[TAB] > > ??) id iD > > That doesn't seem sensible at all. Thanks for your kindly explanation. I'm fine with the current tap completion style with your V16 patch. Regards, Tang
RE: Support tab completion for upper character inputs in psql
On Saturday, January 29, 2022 7:17 AM, Tom Lane wrote: > Sigh ... per the cfbot, this was already blindsided by 95787e849. > As I said, I don't want to sit on this for very long. Thanks for your V16 patch, I tested it. The results LGTM. Regards, Tang
RE: [BUG]Update Toast data failure in logical replication
On Mon, Feb 7, 2022 2:55 PM Amit Kapila wrote:
> On Sat, Feb 5, 2022 at 6:10 AM Amit Kapila wrote:
> > On Fri, Feb 4, 2022 at 9:06 PM Alvaro Herrera wrote:
> > > I have some suggestions
> > > on the comments and docs though.
> >
> > Thanks, your suggestions look good to me. I'll take care of these in
> > the next version.
>
> Attached please find the modified patches.

Thanks for your patch. I tried it and it works well. Two small comments:

1)
+static Bitmapset *HeapDetermineColumnsInfo(Relation relation,
+                                           Bitmapset *interesting_cols,
+                                           Bitmapset *external_cols,
+                                           HeapTuple oldtup, HeapTuple newtup,
+                                           bool *id_has_external);

+HeapDetermineColumnsInfo(Relation relation,
+                         Bitmapset *interesting_cols,
+                         Bitmapset *external_cols,
+                         HeapTuple oldtup, HeapTuple newtup,
+                         bool *has_external)

The declaration and the definition of this function use different names for the last parameter (id_has_external and has_external); it's better to be consistent.

2)
+ /*
+  * Check if the old tuple's attribute is stored externally and is a
+  * member of external_cols.
+  */
+ if (VARATT_IS_EXTERNAL((struct varlena *) DatumGetPointer(value1)) &&
+     bms_is_member(attrnum - FirstLowInvalidHeapAttributeNumber,
+                   external_cols))
+     *has_external = true;

If has_external is already true, it seems we don't need this check, so should we check has_external first?

Regards,
Tang
RE: [BUG]Update Toast data failure in logical replication
On Tue, Feb 8, 2022 3:18 AM Andres Freund wrote:
>
> On 2022-02-07 08:44:00 +0530, Amit Kapila wrote:
> > Right, and it is getting changed. We are just printing the first 200
> > characters (by using SQL [1]) from the decoded tuple so what is shown
> > in the results is the initial 200 bytes.
>
> Ah, I knew I must have been missing something.
>
>
> > The complete decoded data after the patch is as follows:
>
> Hm. I think we should change the way the strings are shortened - otherwise we
> don't really verify much in that test. Perhaps we could just replace the long
> repetitive strings with something shorter in the output?
>
> E.g. using something like regexp_replace(data,
> '(1234567890|9876543210){200}', '\1{200}','g')
> inside the substr().
>
> Wonder if we should deduplicate the number of different toasted strings in the
> file to something that'd allow us to have a single "redact_toast" function or
> such. There's too many different ones to have a reasonbly simple redaction
> function right now. But that's perhaps better done separately.
>
I tried to make the output shorter using your suggestion, as in the following SQL; please see the attached patch, which is based on the v8 patch [1].
SELECT substr(regexp_replace(data, '(1234567890|9876543210){200}',
'\1{200}','g'), 1, 200) FROM pg_logical_slot_get_changes('regression_slot',
NULL, NULL, 'include-xids', '0', 'skip-empty-xacts', '1');
Note that some strings are still longer than 200 characters even after being
shortened, so they can't be shown entirely.
e.g.
table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}'
new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum
toasted_col1[text]:unchanged-toast-datum toasted_col2[te
The entire string is:
table public.toasted_key: UPDATE: old-key: toasted_key[text]:'1234567890{200}'
new-tuple: id[integer]:1 toasted_key[text]:unchanged-toast-datum
toasted_col1[text]:unchanged-toast-datum toasted_col2[text]:'9876543210{200}'
Maybe it's better to change the substr length to 250 to show the entire string,
or we can do this as a separate HEAD-only improvement in which we deduplicate
some of the other long strings as well. Thoughts?
[1]
https://www.postgresql.org/message-id/CAA4eK1L_Z_2LDwMNbGrwoO%2BFc-2Q04YORQSA9UfGUTMQpy2O1Q%40mail.gmail.com
Regards,
Tang
improve_toast_test.diff
Description: improve_toast_test.diff
RE: [BUG]Update Toast data failure in logical replication
On Thu, Feb 10, 2022 9:34 PM Amit Kapila wrote:
>
> On Mon, Feb 7, 2022 at 1:27 PM Dilip Kumar wrote:
> >
> > On Mon, Feb 7, 2022 at 12:25 PM Amit Kapila
> wrote:
> > >
> > > Attached please find the modified patches.
> >
> > I have looked into the latest modification and back branch patches and
> > they look fine to me.
> >
>
> Today, while looking at this patch again, I think I see one problem
> with the below change (referring pg10 patch):
> + if (attrnum < 0)
> + {
> +     if (attrnum != ObjectIdAttributeNumber &&
> +         attrnum != TableOidAttributeNumber)
> +     {
> +         modified = bms_add_member(modified,
> +                                   attrnum - FirstLowInvalidHeapAttributeNumber);
> +         continue;
> +     }
> + }
> ...
> ...
> + /* No need to check attributes that can't be stored externally. */
> + if (isnull1 || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1)
> +     continue;
>
> I think it is possible that we use TupleDescAttr for system attribute
> (in this case ObjectIdAttributeNumber/TableOidAttributeNumber) which
> will be wrong as it contains only user attributes, not system
> attributes. See comments atop TupleDescData.
>
> I think this check should be modified to if (attrnum < 0 || isnull1
> || TupleDescAttr(tupdesc, attrnum - 1)->attlen != -1). What do you
> think?
>
I agree with you.
> * Another minor comment:
> + if (!heap_attr_equals(RelationGetDescr(relation), attrnum, value1,
> + value2, isnull1, isnull2))
>
> I think here we can directly use tupdesc instead of
> RelationGetDescr(relation).
>
+1.
Attached are the patches, which fix the above two comments and the first comment in
my previous mail [1]; the rest is the same as before.
I ran the tests on all branches, and they all passed as expected.
[1]
https://www.postgresql.org/message-id/OS0PR01MB61134DD41BE6D986B9DB80CCFB2E9%40OS0PR01MB6113.jpnprd01.prod.outlook.com
Regards,
Tang
13-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Description: 13-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
14-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Description: 14-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
HEAD-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch
Description: HEAD-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-at.patch
10-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Description: 10-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
11-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Description: 11-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
12-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
Description: 12-v9-0001-WAL-log-unchanged-toasted-replica-identity-key-.patch
RE: Failed transaction statistics to measure the logical replication progress
On Wed, Jan 12, 2022 8:35 PM [email protected] wrote:
> The attached v21 has a couple of other minor updates
> like a modification of error message text.

Thanks for updating the patch. Here are some comments.

1) I saw the following description of the pg_stat_subscription_workers view in the existing doc:

    The pg_stat_subscription_workers view will contain one row per subscription worker on which errors have occurred, ...

It only mentions workers on which errors have occurred; maybe we should also mention transactions here, right?

2)
/* --
+ * pgstat_send_subworker_xact_stats() -
+ *
+ * Send a subworker's transaction stats to the collector.
+ * The statistics are cleared upon return.

Should "The statistics are cleared upon return" be changed to "The statistics are cleared upon sending"? Because if PGSTAT_STAT_INTERVAL has not been reached and the transaction stats are not sent, the function returns without clearing the statistics.

3)
+ Assert(command == LOGICAL_REP_MSG_COMMIT ||
+        command == LOGICAL_REP_MSG_STREAM_COMMIT ||
+        command == LOGICAL_REP_MSG_COMMIT_PREPARED ||
+        command == LOGICAL_REP_MSG_ROLLBACK_PREPARED);
+
+ switch (command)
+ {
+     case LOGICAL_REP_MSG_COMMIT:
+     case LOGICAL_REP_MSG_STREAM_COMMIT:
+     case LOGICAL_REP_MSG_COMMIT_PREPARED:
+         MyLogicalRepWorker->commit_count++;
+         break;
+     case LOGICAL_REP_MSG_ROLLBACK_PREPARED:
+         MyLogicalRepWorker->abort_count++;
+         break;
+     default:
+         ereport(ERROR,
+                 errmsg("invalid logical message type for transaction statistics of subscription"));
+         break;
+ }

I'm not sure we need the Assert, because an error is reported later anyway if command has an invalid value.

4) I noticed that the abort_count doesn't include aborted streaming transactions. Should we take this case into consideration?

Regards,
Tang
RE: Support tab completion for upper character inputs in psql
On Thursday, April 8, 2021 4:14 PM, Peter Eisentraut wrote:
> Seeing the tests you provided, it's pretty obvious that the current
> behavior is insufficient. I think we could probably think of a few more
> tests, for example exercising the "If case insensitive matching was
> requested initially, adjust the case according to setting." case, or
> something with quoted identifiers.

Thanks for your review and suggestions on my patch. I've added more tests in the latest patch V5; the added tests helped me find some bugs in my patch, which I fixed. Now the patch supports not only SET/SHOW [PARAMETER] but also UPDATE ["aTable"|ATABLE] and UPDATE atable SET ["aColumn"|ACOLUMN].

I'd really appreciate more test suggestions, or if someone could kindly run some tests on the patch and tell me about any bugs that happen.

Differences from V4 are:
* fix some bugs related to quoted identifiers.
* add some TAP tests.

Regards, Tang

V5-0001-Support-tab-completion-with-a-query-result-for-upper.patch
Description: V5-0001-Support-tab-completion-with-a-query-result-for-upper.patch
RE: Support tab completion for upper character inputs in psql
On Wednesday, April 21, 2021 1:24 PM, Peter Smith wrote
>I tried playing a bit with your psql patch V5 and I did not find any
>problems - it seemed to work as advertised.
>
>Below are a few code review comments.
Thanks for your review. I've updated the patch to V6 according to your comments.
>1. Patch applies with whitespace warnings.
Fixed.
>2. Unrelated "code tidy" fixes maybe should be another patch?
Agreed. Will post this modification on another thread.
>3. Unnecessary NULL check?
Agreed. NULL check removed.
>4. Memory not freed in multiple places?
Oops. Memory free added.
>5. strncmp replacement?
Agreed, thanks for your advice. Since this modification has little relation to my patch here, I will merge it with comment (2) and post it as another patch.
>6. byte_length == 0?
>The byte_length was not being checked before, so why is the check needed now?
We need to make sure empty input stays case sensitive, as before (HEAD). For example:
CREATE TABLE onetab1 (f1 int);
update onetab1 SET [tab]
Without the "byte_length == 0" check, pg_strdup_keyword_case will turn the column name "f1" into the upper case "F1". Namely, the output will be "update onetab1 SET F1", which is not so good. I added some tab tests for this empty-input case, too.
>7. test typo "ralation"
>8. test typo "case-insensitiveq"
Thanks, typos fixed.
Any further comment is very welcome.
Regards, Tang
V6-0001-Support-tab-completion-with-a-query-result-for-upper.patch
use pg_strncasecmp to replace strncmp when compare "pg_"
Hi
When trying to improve the tab completion feature in [1], I found an existing problem and a typo. The patch is attached; please kindly take a look at it. Thanks.
[1] https://www.postgresql.org/message-id/OS0PR01MB61131A4347D385F02F60E123FB469%40OS0PR01MB6113.jpnprd01.prod.outlook.com
Regards, Tang
0001-use-pg_strncasecmp-to-replace-strncmp-when-compare-p.patch
RE: Support tab completion for upper character inputs in psql
Hi
I've updated the patch to V7 based on the following comments.
On Friday, April 23, 2021 11:58 AM, Kyotaro Horiguchi
wrote
>All usages of pg_string_tolower don't need a copy.
>So don't we change the function to in-place converter?
Referring to your later discussion with Tom, I kept the code as it is.
> if (completion_info_charp)
>+ {
> e_info_charp = escape_string(completion_info_charp);
>+ if(e_info_charp[0] == '"')
>+ completion_case_sensitive = true;
>+ else
>+ {
>+ le_str = pg_string_tolower(e_info_charp);
>
>It seems right to lower completion_info_charp and ..2 but it is not
>right that change completion_case_sensitive here, which only affects
>the returned candidates.
Agreed, code "completion_case_sensitive = true;" removed.
>By the way COMP_KEYWORD_CASE suggests that *keywords* are completed
>following the setting. However, they are not keywords, but
>identifiers. And some people (including me) might dislike that
>keywords and identifiers follow the same setting. Specifically I
>sometimes want keywords to be upper-cased but identifiers (always) be
>lower-cased.
Changed my design based on your suggestion. Now upper-case input for
identifiers will always be turned to lower case (regardless of COMP_KEYWORD_CASE),
which I think can be accepted by most PG users.
Eg: SET BYT / SET Byt
output when applying the V6 patch: SET BYTEA_OUTPUT
output when applying the V7 patch: SET bytea_output
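To summarize the intended rule with a rough sketch (the function name is made up for illustration; pg_string_tolower is the existing helper discussed above):
/* Fold an identifier for completion: quoted identifiers keep their case. */
static char *
fold_identifier_case(const char *ident)
{
    if (ident[0] == '"')
        return pg_strdup(ident);        /* quoted: case sensitive */
    return pg_string_tolower(ident);    /* unquoted: fold to lower case */
}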
On Friday, April 23, 2021 12:26 PM, Kyotaro Horiguchi
wrote
>Oh, I accidentally found a doubious behsbior.
>
>=# alter table public.
public.c1   public.d1   public."t"   public.t   public."tt"
>
>The "t" and "tt" are needlessly lower-cased.
Good catch. I didn’t think of schema stuff before.
Bug fixed. Added tap tests for this scenario.
Please let me know if you find any more issues in the patch. Any
further suggestion is very welcome.
Regards,
Tang
V7-0001-Support-tab-completion-with-a-query-result-for-upper.patch
RE: use pg_strncasecmp to replace strncmp when compare "pg_"
On Friday, April 23, 2021 2:06 PM, Tom Lane wrote >>Kyotaro Horiguchi writes: >> That doesn't matter at all for now since we match schema identifiers >> case-sensitively. Maybe it should be a part of the patch in [1]. > >Yeah --- maybe this'd make sense as part of a full patch to improve >tab-complete.c's handling of case folding, but I'm suspicious that >applying it on its own would just make things less consistent. Thanks for your reply. Merged this patch to [1]. Any further comment on [1] is very welcome. [1] https://www.postgresql.org/message-id/OS0PR01MB6113CA04E06D5BF221BC4FE2FB429%40OS0PR01MB6113.jpnprd01.prod.outlook.com Regards, Tang
RE: Truncate in synchronous logical replication failed
On Tuesday, April 27, 2021 1:17 PM, Amit Kapila wrote
>Seeing no other suggestions, I have pushed this in HEAD only. Thanks!
Sorry for the late reply on this issue. I tested with the latest HEAD, and the issue is fixed now. Thanks a lot.
Regards
Tang
RE: [BUG] "FailedAssertion" reported when streaming in logical replication
> I have modified the patch based on the above comments. Thanks for your patch. I tested again after applying your patch and the problem is fixed. Regards Tang
[BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
Hi
I met an assertion failure at the publisher in lazy_scan_heap() when running
logical replication in synchronous mode. Could someone please take a look at it?
Here's what I did to produce the problem.
First, use './configure --enable-cassert' to build PostgreSQL.
Then, I created multiple publications at the publisher and multiple subscriptions
at the subscriber.
Then, I set synchronous_standby_names and reloaded, to put them in
synchronous commit mode. After that, an assertion failed at the publisher when I
committed and rolled back transactions concurrently:
>TRAP: FailedAssertion("!all_visible_according_to_vm ||
>prunestate.all_visible", File: "vacuumlazy.c", Line: 1347, PID: 1274675)
BTW, in asynchronous mode, the same problem can also happen, but at a low
frequency. (I tried many times, but the problem happened only twice.)
As for synchronous mode, I found it seems easier to reproduce the problem by
setting "autovacuum_naptime = 1".
But it still can't be reproduced 100% of the time. (I tested 5 times; 3 of them
reproduced it.)
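In case it helps, the relevant configuration in the script boils down to the following (the synchronous_standby_names value here is only a placeholder; the real one is in the attached script):
ALTER SYSTEM SET synchronous_standby_names = '...';  -- per your setup
ALTER SYSTEM SET autovacuum_naptime = 1;
SELECT pg_reload_conf();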
The script and the log are attached. It took about 6 minutes to run (without
hitting the problem) on my machine, and it could take less if the server crashed.
Regards
Tang
pub.log
RE: [BUG]"FailedAssertion" reported in lazy_scan_heap() when running logical replication
On Thursday, April 29, 2021 1:22 PM, Peter Geoghegan wrote
>Is setting all_visible_according_to_vm false as below enough to avoid
>the assertion failure?
>
>diff --git a/src/backend/access/heap/vacuumlazy.c
>b/src/backend/access/heap/vacuumlazy.c
>index c3fc12d76c..76c17e063e 100644
>--- a/src/backend/access/heap/vacuumlazy.c
>+++ b/src/backend/access/heap/vacuumlazy.c
>@@ -1146,6 +1146,7 @@ lazy_scan_heap(LVRelState *vacrel, VacuumParams
>*params, bool aggressive)
> {
> ReleaseBuffer(vmbuffer);
> vmbuffer = InvalidBuffer;
>+all_visible_according_to_vm = false;
> }
>
> /* Remove the collected garbage tuples from table and indexes */
Thanks for your reply.
I tried your patch, but the problem doesn't seem to be fixed.
Regards,
Tang
RE: [BUG] "FailedAssertion" reported when streaming in logical replication
On Thursday, April 29, 2021 3:18 PM, Dilip Kumar wrote
>I tried to think about how to write a test case for this scenario, but
>I think it will not be possible to generate an automated test case for this.
>Basically, we need 2 concurrent transactions and out of that,
>we need one transaction which just has processed only one change i.e
>XLOG_HEAP2_NEW_CID and another transaction should overflow the logical
>decoding work mem, so that we select the wrong transaction which
>doesn't have the base snapshot. But how to control that the
>transaction which is performing the DDL just write the
>XLOG_HEAP2_NEW_CID wal and before it writes any other WAL we should
>get the WAl from other transaction which overflows the buffer.
Thanks for your update. Actually, I tried to make an automated test for the problem too, but made no progress on it. I agree with your opinion that it is "not ... possible to generate an automated test case for this". If anyone figures out a good solution for automating this case, please kindly share it with us. Thanks.
Regards, Tang
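P.S. For anyone trying this by hand, the scenario is roughly the following (table names are made up; the difficulty is precisely interleaving the two sessions, which is why automation seems impossible):
-- session 1: its only decoded change so far is the catalog update (XLOG_HEAP2_NEW_CID)
BEGIN;
ALTER TABLE tbl1 ADD COLUMN extra int;
-- session 2 (before session 1 writes any other WAL): a transaction big
-- enough to overflow logical_decoding_work_mem
BEGIN;
INSERT INTO tbl2 SELECT i FROM generate_series(1, 1000000) i;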
Remove "FROM" in "DELETE FROM" when using tab-completion
Hi
When using psql help with SQL commands, I found an inconsistency in the tab-completion for the command "DELETE", as follows.
=# \h de[TAB]
deallocate   declare   delete from
=# \help[TAB]
ABORT   CLUSTER   DELETE FROM   ...
=# \help[ENTER]
Available help:
...
ANALYZE   CREATE OPERATOR CLASS   DELETE
...
=# \h delete
Command: DELETE
Description: delete rows of a table
...
You see, the tab-completion for "DELETE" is "DELETE FROM", which is not the same as what the help command says (which is "DELETE"). I tried to figure out why "FROM" was introduced here, but found no good reason. In [1] someone changed "DELETE" to "DELETE FROM", but no reason was given. IMO, the "FROM" is unnecessary, just like "INTO" for the "INSERT" command. So I tried to fix the inconsistency by removing "FROM" from "DELETE FROM" in tab-complete.c. Please see the attached patch. Any comment or different thought is very welcome.
[1] https://github.com/postgres/postgres/commit/4c1f9a0f0bb41c31b26bb88ba8c5d3fca4521dd7
Regards, Tang
0001-Remove-FROM-in-DELETE-FROM.patch
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Monday, May 10, 2021 2:48 PM, Julien Rouhaud wrote
>I think the behavior now is correct. The goal of autocompletion is to save
>keystrokes and time. As the only valid keyword after a DELETE (at least in a
>DeleteStmt) is FROM, it's a good thing that you get back "DELETE FROM" directly
>rather than asking that to autocomplete in multiple steps.
>
>Now, the \help command is for commands, which is a different thing as the
>command in that case is DELETE not DELETE FROM, even if you will have to follow
>your DELETE with a FROM.
Thanks for your reply. I totally agree with you on the convenience of "DELETE FROM" autocompletion. But I also noticed that in some cases the autocompletion for "DELETE" is just "DELETE" already.
=# EXPLAIN[TAB]
ANALYZE   DECLARE   DELETE   INSERT   SELECT   UPDATE   VERBOSE
=# COPY ([TAB]
DELETE   INSERT   SELECT   TABLE   UPDATE   VALUES   WITH
Maybe we should keep the behavior consistent? I mean, we could change all "DELETE" to "DELETE FROM", or just remove "FROM", for consistency.
On Monday, May 10, 2021 2:51 PM, Dilip Kumar wrote
>I agree with Julien. But, I also agree with the consistency point
>from Tang. So maybe we can fix the insert and add INSERT INTO in the
>tab completion?
Yeah. Changing "INSERT" to "INSERT INTO" could be a good solution, too. But as I mentioned above, some cases in tab-completion complete "DELETE" to "DELETE FROM" and some leave it as "DELETE". I'm not sure in which cases we should change "INSERT" to "INSERT INTO". Please share your thoughts on it.
Regards, Tang
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Monday, May 10, 2021 4:15 PM, Julien Rouhaud wrote
>We should change all to DELETE FROM (apart from \help of course), and same for
>INSERT, change to INSERT INTO everywhere it makes sense.
Thanks for the reply. Your advice sounds reasonable to me. So in the attached patch I tried to change all "DELETE" to "DELETE FROM" and "INSERT" to "INSERT INTO", except for the following cases, which I think are in accordance with what the PG docs say.
CREATE POLICY
CREATE [ OR REPLACE ] RULE
CREATE [ OR REPLACE ] TRIGGER
ALTER DEFAULT PRIVILEGES
After applying the patch, the tap tests for psql pass. Please feel free to point out anything insufficient you find in my fix. Thanks.
Regards, Tang
0001-TAB-completion-modification-for-INSERT-INTO-and-DELE.patch
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Tuesday, May 11, 2021 2:53 PM, Michael Paquier wrote
>else if (TailMatches("DELETE", "FROM", MatchAny))
>COMPLETE_WITH("USING", "WHERE");
>- /* XXX: implement tab completion for DELETE ... USING */
>
>Why are you removing that? This sentence is still true, no?
IIRC, XXX in a comment is used to flag something that is bogus but works.
When the sentence was introduced in f5ab0a14, the completion for "DELETE ... USING"
was not as good as it is now (I guess that's why the comment was added). And for
now, IMHO, we can remove the comment directly.
If my understanding here is wrong, please let me know; it would be great
to learn more about PG.
Regards,
Tang
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Tuesday, May 11, 2021 5:44 PM, Dilip Kumar wrote:
>But your patch is doing nothing to add the implementation for DELETE ...
>USING. Basically, the tab completion support for DELETE ... USING is
>still pending, right?
I see; maybe I had a misunderstanding here. I thought tab completion for
"DELETE ... USING" referred to the following existing code:
> >else if (TailMatches("DELETE", "FROM", MatchAny))
> >COMPLETE_WITH("USING", "WHERE");
So I just thought the tab completion support for DELETE ... USING was not pending
anymore.
According to your feedback, maybe something beyond my knowledge needs to be
done for DELETE ... USING.
Besides, you are right, the fix in the patch has nothing to do with the comment
here.
Patch updated to V2 with the sentence moved back. Thanks.
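In case it helps the discussion, I imagine the eventual DELETE ... USING support could look roughly like this in tab-complete.c (a sketch only, not part of V2; it reuses the existing TailMatches/COMPLETE_WITH helpers):
/* hypothetical: suggest table names after DELETE FROM <table> USING */
else if (TailMatches("DELETE", "FROM", MatchAny, "USING"))
    COMPLETE_WITH_SCHEMA_QUERY(Query_for_list_of_tables, NULL);
/* hypothetical: and then suggest WHERE */
else if (TailMatches("DELETE", "FROM", MatchAny, "USING", MatchAny))
    COMPLETE_WITH("WHERE");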
Regards,
Tang
V2_0001-TAB-completion-modification-for-INSERT-INTO-and-DELE.patch
RE: Remove "FROM" in "DELETE FROM" when using tab-completion
On Tuesday, May 11, 2021 6:55 PM, Dilip Kumar wrote:
>Basically, it just complete with USING, now after USING tab-completion
>support is not yet there, e.g. DELETE FROM t1 USING t1 WHERE cond.
>but the current code will not suggest anything after USING.
Thanks for your kind explanation. That's really nice of you. Understood now.
Regards, Tang
RE: "ERROR: deadlock detected" when replicating TRUNCATE
On Monday, May 17, 2021 5:47 PM, Amit Kapila wrote
> +$node_publisher->safe_psql('postgres',
> + "ALTER SYSTEM SET synchronous_standby_names TO 'any 2(sub5_1,
> sub5_2)'");
> +$node_publisher->safe_psql('postgres', "SELECT pg_reload_conf()");
>
> Do you really need these steps to reproduce the problem? IIUC, this
> has nothing to do with synchronous replication.
Agreed.
I tested in asynchronous mode, and could reproduce this problem, too.
The attached patch removed the steps for setting synchronous replication.
And the test passed after applying Peter's patch.
Please take it as a reference.
Regards
Tang
v3_test_for_deadlock.patch
RE: [HACKERS] logical decoding of two-phase transactions
Hi Ajin
>The above patch had some changes missing which resulted in some tap
>tests failing. Sending an updated patchset. Keeping the patchset
>version the same.
Thanks for your patch. I hit a segmentation fault when using it. Please take a look at this.
The steps to reproduce the problem are as follows.
--publisher--
create table test (a int primary key, b varchar);
create publication pub for table test;
--subscriber--
create table test (a int primary key, b varchar);
create subscription sub connection 'dbname=postgres' publication pub with(two_phase=on);
Then, I prepared, committed, rolled back transactions and TRUNCATEd the table in a SQL file as follows:
-
BEGIN;
INSERT INTO test SELECT i, md5(i::text) FROM generate_series(1, 1) s(i);
PREPARE TRANSACTION 't1';
COMMIT PREPARED 't1';
BEGIN;
INSERT INTO test SELECT i, md5(i::text) FROM generate_series(10001, 2) s(i);
PREPARE TRANSACTION 't2';
ROLLBACK PREPARED 't2';
TRUNCATE test;
-
To make the problem easy to reproduce, I looped the above operations about 10 times in my SQL file; then I could reproduce it 100% of the time and got a segmentation fault in the publisher log as follows:
-
2021-05-18 16:30:56.952 CST [548189] postmaster LOG: server process (PID 548222) was terminated by signal 11: Segmentation fault
2021-05-18 16:30:56.952 CST [548189] postmaster DETAIL: Failed process was running: START_REPLICATION SLOT "sub" LOGICAL 0/0 (proto_version '3', two_phase 'on', publication_names '"pub"')
-
Here is the core dump information:
-
#0 0x0090afe4 in pq_sendstring (buf=buf@entry=0x251ca80, str=0x0) at pqformat.c:199
#1 0x00ab0a2b in logicalrep_write_begin_prepare (out=0x251ca80, txn=txn@entry=0x25346e8) at proto.c:124
#2 0x7f9528842dd6 in pgoutput_begin_prepare (ctx=ctx@entry=0x2514700, txn=txn@entry=0x25346e8) at pgoutput.c:495
#3 0x7f9528843f70 in pgoutput_truncate (ctx=0x2514700, txn=0x25346e8, nrelations=1, relations=0x262f678, change=0x25370b8) at pgoutput.c:905
#4 0x00aa57cb in truncate_cb_wrapper (cache=, txn=, nrelations=, relations=, change=) at logical.c:1103
#5 0x00abf333 in ReorderBufferApplyTruncate (streaming=false, change=0x25370b8, relations=0x262f678, nrelations=1, txn=0x25346e8, rb=0x2516710) at reorderbuffer.c:1918
#6 ReorderBufferProcessTXN (rb=rb@entry=0x2516710, txn=0x25346e8, commit_lsn=commit_lsn@entry=27517176, snapshot_now=, command_id=command_id@entry=0, streaming=streaming@entry=false) at reorderbuffer.c:2278
#7 0x00ac0b14 in ReorderBufferReplay (txn=, rb=rb@entry=0x2516710, xid=xid@entry=738, commit_lsn=commit_lsn@entry=27517176, end_lsn=end_lsn@entry=27517544, commit_time=commit_time@entry=674644388404356, origin_id=0, origin_lsn=0) at reorderbuffer.c:2591
#8 0x00ac1713 in ReorderBufferCommit (rb=0x2516710, xid=xid@entry=738, commit_lsn=27517176, end_lsn=27517544, commit_time=commit_time@entry=674644388404356, origin_id=origin_id@entry=0, origin_lsn=0) at reorderbuffer.c:2615
#9 0x00a9f702 in DecodeCommit (ctx=ctx@entry=0x2514700, buf=buf@entry=0x7ffdd027c2b0, parsed=parsed@entry=0x7ffdd027c140, xid=xid@entry=738, two_phase=) at decode.c:742
#10 0x00a9fc6c in DecodeXactOp (ctx=ctx@entry=0x2514700, buf=buf@entry=0x7ffdd027c2b0) at decode.c:278
#11 0x00aa1b75 in LogicalDecodingProcessRecord (ctx=0x2514700, record=0x2514ac0) at decode.c:142
#12 0x00af6db1 in XLogSendLogical () at walsender.c:2876
#13 0x00afb6aa in WalSndLoop (send_data=send_data@entry=0xaf6d49 ) at walsender.c:2306
#14 0x00afbdac in StartLogicalReplication (cmd=cmd@entry=0x24da288) at walsender.c:1206
#15 0x00afd646 in exec_replication_command (cmd_string=cmd_string@entry=0x2452570 "START_REPLICATION SLOT \"sub\" LOGICAL 0/0 (proto_version '3', two_phase 'on', publication_names '\"pub\"')") at walsender.c:1646
#16 0x00ba3514 in PostgresMain (argc=argc@entry=1, argv=argv@entry=0x7ffdd027c560, dbname=, username=) at postgres.c:4482
#17 0x00a7284a in BackendRun (port=port@entry=0x2477b60) at postmaster.c:4491
#18 0x00a78bba in BackendStartup (port=port@entry=0x2477b60) at postmaster.c:4213
#19 0x00a78ff9 in ServerLoop () at postmaster.c:1745
#20 0x00a7bbdf in PostmasterMain (argc=argc@entry=3, argv=argv@entry=0x244bae0) at postmaster.c:1417
#21 0x0090dc80 in main (argc=3, argv=0x244bae0) at main.c:209
-
I noticed that it called the pgoutput_truncate function and the pgoutput_begin_prepare function. That seems odd, because the TRUNCATE is not in a prepared transaction in my case. I debugged this to learn more and found that in the pgoutput_truncate function the value of in_prepared_txn was true. Later, it got a segmentation fault when it tried to get the gid in the logicalrep_write_begin_prepare function - the transaction has no gid, hence the segmentation fault. FYI: I also
RE: "ERROR: deadlock detected" when replicating TRUNCATE
On Thursday, May 20, 2021 3:05 PM, Amit Kapila wrote:
> Okay, I have prepared the patches for all branches (11...HEAD). Each
> version needs minor changes in the test, the code doesn't need much
> change. Some notable changes in the tests:
> 1. I have removed the conf change for max_logical_replication_workers
> on the publisher node. We only need this for the subscriber node.
> 2. After creating the new subscriptions wait for initial
> synchronization as we do for other tests.
> 3. synchronous_standby_names need to be reset for the previous test.
> This is only required for HEAD.
> 4. In PG-11, we need to specify the application_name in the connection
> string, otherwise, it took the testcase file name as application_name.
> This is the same as other tests are doing in PG11.
>
> Can you please once verify the attached patches?
I have tested your patches for all branches (11...HEAD). All of them passed. BTW, I also confirmed that the bug exists in these branches without your fix.
The changes in the tests LGTM. But I saw whitespace warnings when applying the patches for PG11 and PG12; please take a look at that.
Regards
Tang
RE: [HACKERS] logical decoding of two-phase transactions
> > 13.
> > @@ -507,7 +558,16 @@ CreateSubscription(CreateSubscriptionStmt *stmt,
> > bool isTopLevel)
> > {
> > Assert(slotname);
> >
> > - walrcv_create_slot(wrconn, slotname, false,
> > + /*
> > + * Even if two_phase is set, don't create the slot with
> > + * two-phase enabled. Will enable it once all the tables are
> > + * synced and ready. This avoids race-conditions like prepared
> > + * transactions being skipped due to changes not being applied
> > + * due to checks in should_apply_changes_for_rel() when
> > + * tablesync for the corresponding tables are in progress. See
> > + * comments atop worker.c.
> > + */
> > + walrcv_create_slot(wrconn, slotname, false, false,
> >
> > Can't we enable two_phase if copy_data is false? Because in that case,
> > all relations will be in a READY state. If we do that then we should
> > also set two_phase state as 'enabled' during createsubscription. I
> > think we need to be careful to check that connect option is given and
> > copy_data is false before setting such a state. Now, I guess we may
> > not be able to optimize this to not set 'enabled' state when the
> > subscription has no rels.
> >
>
> Fixed in v77-0001
I noticed this modification in v77-0001 and executed "CREATE SUBSCRIPTION ...
WITH (two_phase = on, copy_data = false)", but it crashed.
-
postgres=# CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub
WITH(two_phase = on, copy_data = false);
WARNING: relcache reference leak: relation "pg_subscription" not closed
WARNING: snapshot 0x34278d0 still active
NOTICE: created replication slot "sub" on publisher
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
The connection to the server was lost. Attempting reset: Failed.
!?>
-
There are two warnings and a segmentation fault in subscriber log:
-
2021-05-24 15:08:32.435 CST [2848572] WARNING: relcache reference leak:
relation "pg_subscription" not closed
2021-05-24 15:08:32.435 CST [2848572] WARNING: snapshot 0x32ce8b0 still active
2021-05-24 15:08:33.012 CST [2848555] LOG: server process (PID 2848572) was
terminated by signal 11: Segmentation fault
2021-05-24 15:08:33.012 CST [2848555] DETAIL: Failed process was running:
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub
WITH(two_phase = on, copy_data = false);
-
The backtrace of the segmentation fault is attached. It happened in the table_close
function; we got there because "CurrentResourceOwner" was NULL.
I think it was related to the first warning, which reported a "relcache
reference leak". The backtrace information is attached, too. When updating the
two-phase state in the CreateSubscription function, the resource owner was
released and "CurrentResourceOwner" was set to NULL in the CommitTransaction function.
The second warning, about "snapshot still active", also happened in the
CommitTransaction function. It called the AtEOXact_Snapshot function, checked for
leftover snapshots, and reported the warning.
I debugged and found the snapshot was added in the PortalRunUtility function by
calling PushActiveSnapshot; the address of "ActiveSnapshot" at that
time was the same as the address in the warning.
In summary, when creating a subscription with two_phase = on and copy_data =
false, CreateSubscription calls the UpdateTwoPhaseState function to
set the two-phase state to 'enabled', and the relcache entry and the active
snapshot are released too early, so the failure happens. I think some change
should be made to avoid it. Thoughts?
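To make the idea concrete, a speculative sketch only (names follow the v77 patch and may not be exact): make sure the catalog relation is closed before any code path that commits the transaction, e.g.
/* close pg_subscription before updating the two-phase state, because
 * (per the analysis above) UpdateTwoPhaseState() ends up in
 * CommitTransaction(), which releases the current resource owner */
table_close(rel, RowExclusiveLock);
if (twophase_enabled && !copy_data)     /* hypothetical condition */
    UpdateTwoPhaseState(subid, LOGICALREP_TWOPHASE_STATE_ENABLED);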
FYI
I also tested the newly released V78* at [1], and the above failure still exists.
[1]
https://www.postgresql.org/message-id/CAFPTHDab56twVmC%2B0a%3DRNcRw4KuyFdqzW0JAcvJdS63n_fRnOQ%40mail.gmail.com
Regards
Tang
#0 ResourceArrayRemove (resarr=resarr@entry=0x80, value=value@entry=46738512)
at resowner.c:309
#1 0x00e46e96 in ResourceOwnerForgetRelationRef (owner=0x0,
rel=rel@entry=0x2c92c50) at resowner.c:1127
#2 0x00dcdfb9 in RelationDecrementReferenceCount
(rel=rel@entry=0x2c92c50) at relcache.c:2083
#3 0x00dcdff0 in RelationClose (relation=relation@entry=0x2c92c50) at
relcache.c:2101
#4 0x004a7a50 in relation_close (relation=relation@entry=0x2c92c50,
lockmode=lockmode@entry=3) at relation.c:213
#5 0x005bf6df in table_close (relation=relation@entry=0x2c92c50,
lockmode=lockmode@entry=3) at table.c:169
#6 0x007c6687 in CreateSubscription (stmt=stmt@entry=0x2c30338,
isTopLevel=isTopLevel@entry=true) at subscriptioncmds.c:590
#7 0x00bae4c8 in ProcessUtilitySlow (pstate=pstate@entry=0x2c52a40,
pstmt=pstmt@entry=0x2c306a8,
queryString=queryString@entry=0x2c2f610 "create subscription sub connection
'dbname=postgres' publication pub with(two_phase=on, copy_data = false);",
context=context@entry=PROCESS_UTILITY_TOPLEVEL, params=params@entry=0x0,
queryEnv=queryEnv@entry=0x0, dest=0x2c30798, qc=0x7
RE: Added schema level support for publication.
On Monday, May 24, 2021 at 8:31 PM vignesh C wrote:
> The earlier patch does not apply on the head. The v4 patch attached
> has the following changes:
> a) Rebased it on head. b) Removed pubschemas, pubtables columns and
> replaced it with pubtype in pg_publication table. c) List the schemas
> in describe publication. d) List the publication in list schemas.
> e) Add support for "FOR SCHEMA CURRENT_SCHEMA". f) Tab completion for
> "FOR SCHEMA" in create publication and alter publication. g) Included
> the newly added structure type to typedefs.lst
Thanks for your patch. I ran "make check-world" after applying your patch, but it failed on my machine. I saw the following log:
--
parallel group (2 tests): subscription publication
publication ... FAILED 108 ms
subscription ... ok 87 ms
diff -U3 /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out
--- /home/fnst/data/postgresql_schema/postgresql/src/test/regress/expected/publication.out 2021-05-25 15:44:52.261683712 +0800
+++ /home/fnst/data/postgresql_schema/postgresql/src/test/regress/results/publication.out 2021-05-25 15:48:41.393672595 +0800
@@ -359,10 +359,10 @@
 "public"
 \dn public;
-   List of schemas
-  Name  |  Owner
---------+---------
- public | vignesh
+   List of schemas
+  Name  | Owner
+--------+-------
+ public | fnst
 Publications:
 "testpub3_forschema"
--
I think the owner of CURRENT_SCHEMA should not be written into publication.out, because the result depends on the user running the tests. Maybe we can use "ALTER SCHEMA public OWNER TO owner" to change its default owner before this test case. Thoughts?
Regards
Tang
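P.S. Concretely, I mean something like this at the start of the test case (the role name is just an example):
CREATE ROLE regress_publication_user;
ALTER SCHEMA public OWNER TO regress_publication_user;
so that "\dn public" prints a stable owner no matter who runs the tests.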
RE: [HACKERS] logical decoding of two-phase transactions
On Wed, May 26, 2021 10:13 PM Ajin Cherian wrote: > > I've attached a patch that fixes this issue. Do test and confirm. > Thanks for your patch. I have tested and confirmed that the issue I reported has been fixed. Regards Tang
[BUG]Update Toast data failure in logical replication
Hi
I think I just found a bug in logical replication. Data couldn't be
synchronized while updating toast data. Could anyone take a look at it?
Here are the steps to reproduce the bug:
--publisher--
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
CREATE PUBLICATION pub FOR TABLE toasted_key;
--subscriber--
CREATE TABLE toasted_key (
id serial,
toasted_key text PRIMARY KEY,
toasted_col1 text,
toasted_col2 text
);
CREATE SUBSCRIPTION sub CONNECTION 'dbname=postgres' PUBLICATION pub;
--publisher--
ALTER TABLE toasted_key ALTER COLUMN toasted_key SET STORAGE EXTERNAL;
ALTER TABLE toasted_key ALTER COLUMN toasted_col1 SET STORAGE EXTERNAL;
INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES(repeat('1234567890',
200), repeat('9876543210', 200));
UPDATE toasted_key SET toasted_col2 = toasted_col1;
--subscriber--
SELECT count(*) FROM toasted_key WHERE toasted_col2 = toasted_col1;
The above command is supposed to output "count = 1", but in fact it outputs
"count = 0", which means the UPDATE operation failed at the subscriber. Right?
I debugged and found that the subscriber could receive the message from the publisher,
and that in the apply_handle_update_internal function it invoked the
FindReplTupleInLocalRel function but failed to find a tuple.
FYI, I also tested the DELETE operation (DELETE FROM toasted_key;), which also
invoked the FindReplTupleInLocalRel function, and the result was ok.
Regards
Tang
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 2:16 PM, [email protected] wrote: >I think I just found a bug in logical replication. Data couldn't be >synchronized while updating toast data. >Could anyone take a look at it? FYI. The problem also occurs in PG-13. I will try to check from which version it got introduced. Regards, Tang
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 3:02 PM, [email protected] wrote:
> FYI. The problem also occurs in PG-13. I will try to check from which version
> it got introduced.
I reproduced it in PG-10, 11, 12 and 13. I think the problem has existed since logical replication was introduced in PG-10.
Regards
Tang
RE: [BUG]Update Toast data failure in logical replication
On Friday, May 28, 2021 6:51 PM, Dilip Kumar wrote:
> Seems you did not set the replica identity for updating the tuple.
> Try this before updating, and it should work.
Thanks for your reply. I tried it.
> ALTER TABLE toasted_key REPLICA IDENTITY USING INDEX toasted_key_pkey;
This didn't work.
> or
>
> ALTER TABLE toasted_key REPLICA IDENTITY FULL.
It worked. And I noticed that if the length of the PRIMARY KEY (toasted_key) is short, data can be synchronized successfully with the default replica identity. Could you tell me why we need to set the replica identity?
Regards
Tang
RE: [BUG]Update Toast data failure in logical replication
On Mon, May 31, 2021 5:12 PM Dilip Kumar wrote:
> The problem is if the key attribute is not changed we don't log it as
> it should get logged along with the updated tuple, but if it is
> externally stored then the complete key will never be logged because
> there is no log from the toast table. For fixing this if the key is
> externally stored then always log that.
>
> Please test with the attached patch.
Thanks for your patch. I tested it and the bug was fixed. I'm still trying to understand your fix; please allow me to ask more (maybe silly) questions if I find any.
+* if the key hasn't changedand we're only logging the key, we're done.
I think "changedand" should be "changed and".
Regards
Tang
RE: [BUG]Update Toast data failure in logical replication
Hi
I have some questions about your patch. Here are two cases I used to check the
bug.
Case 1 (PK toasted_key is short): data could be synchronized on HEAD.
---
INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES('111',
repeat('9876543210', 200));
UPDATE toasted_key SET toasted_col2 = toasted_col1;
---
Case 2 (PK toasted_key is very long): data couldn't be synchronized on
HEAD (which is the bug).
---
INSERT INTO toasted_key(toasted_key, toasted_col1) VALUES(repeat('9876543210',
200), '111');
UPDATE toasted_key SET toasted_col2 = toasted_col1;
---
So I think the bug is only related to the length of the primary key.
I noticed that in case 1, the ExtractReplicaIdentity function returned NULL on HEAD,
but after your fix it didn't return NULL. There is no problem with this case
on HEAD, but the patch modified the function's return value. I'm not sure whether
that could bring new problems. Have you checked it?
Regards
Tang
RE: [BUG]Update Toast data failure in logical replication
On Wed, Jun 2, 2021 2:44 PM Dilip Kumar wrote:
> Attached patch fixes that, I haven't yet added the test case. Once
> someone confirms on the approach then I will add a test case to the
> patch.
key_tuple = heap_form_tuple(desc, values, nulls);
*copy = true;
...
	key_tuple = toast_flatten_tuple(oldtup, desc);
	heap_freetuple(oldtup);
}
+	/*
+	 * If key tuple doesn't have any external data and key is not changed
+	 * then just free the key tuple and return NULL.
+	 */
+	else if (!key_changed)
+	{
+		heap_freetuple(key_tuple);
+		return NULL;
+	}
return key_tuple;
}
I think "*copy = false" should be added before return NULL because we don't
return a modified copy tuple here. Thoughts?
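To be specific, I mean the hunk above would become something like this (a sketch based on the quoted patch):
	else if (!key_changed)
	{
		heap_freetuple(key_tuple);
		*copy = false;		/* we are not returning a modified copy */
		return NULL;
	}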
Regards
Tang
tab-complete for CREATE TYPE ... SUBSCRIPT
Hi
Attached is a patch to support tab completion for CREATE TYPE ... SUBSCRIPT, introduced in c7aba7c14e.
Regards, Tang
0001-psql-tab-complete-CREATE-TYPE-.-SUBSCRIPT.patch
